AI Infrastructure & Platform Operations Engineer

Posted 2hrs ago

Job Description

Mirantis is hiring an AI Infrastructure & Platform Operations Engineer to help organizations run scalable AI infrastructure. The role supports NVIDIA GPU platforms and collaborates with other teams to keep environments operationally stable.

Responsibilities:

  • Monitor, operate, and support production AI infrastructure platforms.
  • Investigate and resolve infrastructure, networking, hardware, and platform-related incidents.
  • Support NVIDIA GPU infrastructure and associated platform services.
  • Monitor and troubleshoot Kubernetes-based environments.
  • Investigate performance, availability, and reliability issues across infrastructure and platform components.
  • Collaborate with engineering teams, hardware vendors, datacenter personnel, and service delivery teams to resolve technical issues.
  • Participate in incident response, root cause analysis, and operational improvement activities.
  • Contribute to improvements in monitoring, observability, automation, and operational processes.
  • Maintain operational documentation, runbooks, and knowledge articles.

Requirements:

  • 3+ years of experience in infrastructure operations, platform operations, network operations, site reliability engineering, cloud operations, datacenter operations, or related technical roles.
  • Strong Linux administration and troubleshooting skills.
  • Good understanding of networking concepts and experience diagnosing infrastructure-related issues.
  • Working knowledge of Kubernetes in production environments.
  • Experience supporting production infrastructure and services.
  • Strong analytical and problem-solving skills.
  • Experience working within structured operational and incident management processes.
  • Excellent communication and collaboration skills.
  • Ability to work within a shift-based operational environment.

Benefits:

  • Work with some of the most advanced AI infrastructure environments in production today.
  • Gain exposure to NVIDIA GPU technologies, Kubernetes platforms, and high-performance networking environments.
  • Help define how next-generation AI infrastructure is operated and supported.
  • Be part of a team shaping the future of AI-powered operations through k0rdent AI.
  • Join a growing organisation investing heavily in AI infrastructure and platform services.