AI Data Engineering Lead
Posted 18hrs ago
Job Description
The AI Data Engineering Lead builds the data foundations for AI models at Ideagen and leads a team responsible for data quality, compliance, and operational efficiency.
Responsibilities:
- Leading and developing a team of AI data engineers, setting clear technical standards, supporting career growth, and scaling the function as the programme grows
- Defining the technical direction for AI data engineering, including architecture decisions, tooling choices, and delivery practices across the organisation
- Designing and building the end‑to‑end AI data platform, from operational product data and regulatory sources through cloud storage and transformation pipelines to training‑ready datasets
- Owning dataset versioning and lineage so every training artefact is traceable, reproducible, and auditable across the full model lifecycle
- Building and maintaining large‑scale regulatory and operational corpora in collaboration with domain experts, ensuring data quality and consistency
- Architecting and operating AWS‑based data infrastructure at production scale with a focus on reliability, security, and performance
- Defining and enforcing data governance standards, including quality checks, labelling conventions, and data handling frameworks
- Leading GDPR compliance for AI training data in partnership with Legal and ensuring best practice is embedded from the start
Requirements:
- You are a senior data engineer or technical lead with prior experience leading teams and owning large data platforms end to end
- You have deep production experience with Python and SQL and write data transformation code that is robust, readable, and reusable
- You have designed and run AWS data stacks at scale, including services such as S3, Glue, Athena, Kinesis, Lambda, and IAM
- You understand ML training data pipelines and know how they differ from analytics workloads, including dataset formats, splits, and quality constraints
- You bring strong data governance instincts and design for versioning, lineage, and auditability from day one
- You are comfortable working with legal and compliance partners on sensitive data handling and regulatory requirements
- You communicate clearly across disciplines and work effectively with AI engineers, product leaders, and domain specialists
- Experience with NLP or LLM training data, data version control tools, or regulated industry software is valuable but not essential
Benefits:
- Benefits at Ideagen