AI Platform Engineer

Posted 105 days ago

Job Description

As an AI Platform Engineer at AHEAD, you will optimize AI/ML infrastructure and workflows, collaborating with data scientists to build scalable environments for ML models.

Responsibilities:

  • Architect and manage Kubernetes clusters tailored to AI/ML workloads.
  • Implement Run:ai and Kubernetes operators for GPU resource orchestration and workload scheduling.
  • Develop and maintain Python-based automation scripts and ML pipelines; automate infrastructure provisioning with Terraform and configuration management with Ansible.
  • Create and manage Jupyter Notebooks for experimentation and collaboration.
  • Integrate and optimize NVIDIA Enterprise Suite components (CUDA, NeMo Framework, Triton, TensorRT, GPU drivers) for accelerated computing.
  • Establish and maintain MLOps best practices for model lifecycle management, CI/CD, and monitoring (e.g., MLflow, Kubeflow).
  • Work closely with data scientists and platform engineers to ensure efficient resource utilization and scalability across environments.

Requirements:

  • 4+ years in platform architecture or solutions architecture, with 2+ years focused on AI/ML workloads.
  • Experience with high-performance computing (HPC) environments.
  • Familiarity with distributed training and model optimization techniques.
  • Certification in Kubernetes or cloud platforms (AWS, Azure, GCP).
  • Strong proficiency in Python and experience with ML frameworks (TensorFlow, PyTorch).
  • Hands-on experience with Kubernetes and container orchestration.
  • Familiarity with Run:ai or similar GPU scheduling platforms.
  • Expertise in Terraform and Ansible for infrastructure automation.
  • Experience with Jupyter Notebooks for ML development.
  • Knowledge of NVIDIA Enterprise Suite (CUDA, NeMo Framework, Triton, GPU drivers).
  • Solid understanding of MLOps principles and tools (e.g., MLflow, Kubeflow).
  • Background in deploying and scaling AI workloads in cloud or hybrid environments.

Benefits:

  • India Employment Benefits include: