Lead AI Engineer
Posted 45 days ago
Job Description
As Lead AI Engineer at Zeta Global, you will optimize large language models and drive Supervised Fine-Tuning (SFT) processes, helping to develop advanced marketing technology solutions that leverage AI capabilities.
Responsibilities:
- Lead Supervised Fine-Tuning (SFT) of large language models in production, shaping instruction-following, reasoning quality, tone, and domain-specific behavior
- Extend SFT pipelines with instruction tuning and preference-based optimization (e.g., RLHF-style approaches or direct preference optimization)
- Design, curate, and maintain high-quality SFT and preference datasets, combining human-labeled and synthetic data tailored to real-world marketing and decisioning use cases
- Own model evaluation and benchmarking, including:
  - Offline behavioral evals (instruction adherence, reasoning depth, hallucination rates)
  - Online experiments and A/B tests
  - Continuous regression detection and performance monitoring
- Develop and operate agentic LLM systems, enabling multi-step reasoning, tool use, workflow orchestration, and decision execution
- Implement and optimize prompting, retrieval-augmented generation (RAG), memory, and tool-calling strategies, with a clear understanding of when to solve problems via SFT versus prompting
- Partner closely with data engineering, platform, and product teams to integrate fine-tuned models into high-throughput, low-latency systems
- Establish best practices for LLM versioning, experimentation, deployment, rollback, governance, and safety
- Provide technical leadership and mentorship to engineers working on applied AI and LLM systems
Requirements:
- Significant hands-on experience with Supervised Fine-Tuning (SFT) of LLMs in production, beyond prompt-only approaches
- Direct experience using OpenAI APIs and/or AWS Bedrock for SFT, post-training, and deployment
- Strong understanding of LLM post-training workflows, including data preparation, instruction tuning, evaluation methodologies, and common failure modes
- Experience building and operating agentic LLM systems (tool use, multi-step reasoning, workflow orchestration)
- Proficiency in Python and modern ML frameworks (e.g., PyTorch)
- Experience operating ML systems in distributed, production environments
- Strong intuition for trade-offs between model quality, latency, cost, safety, and scalability
Benefits:
- Unlimited PTO
- Excellent medical, dental, and vision coverage
- Employee equity
- Employee discounts, virtual wellness classes, pet insurance, and more