AI Infrastructure & Platform Operations Engineer
Posted 2hrs ago
Job Description
Mirantis is looking for an AI Infrastructure & Platform Operations Engineer to help organizations run scalable AI infrastructure. The role supports NVIDIA GPU platforms and works with other teams to maintain operational stability across environments.
Responsibilities:
- Monitor, operate, and support production AI infrastructure platforms.
- Investigate and resolve infrastructure, networking, hardware, and platform-related incidents.
- Support NVIDIA GPU infrastructure and associated platform services.
- Monitor and troubleshoot Kubernetes-based environments.
- Investigate performance, availability, and reliability issues across infrastructure and platform components.
- Collaborate with engineering teams, hardware vendors, datacenter personnel, and service delivery teams to resolve technical issues.
- Participate in incident response, root cause analysis, and operational improvement activities.
- Contribute to improvements in monitoring, observability, automation, and operational processes.
- Maintain operational documentation, runbooks, and knowledge articles.
Requirements:
- 3+ years of experience in infrastructure operations, platform operations, network operations, site reliability engineering, cloud operations, datacenter operations, or related technical roles.
- Strong Linux administration and troubleshooting skills.
- Good understanding of networking concepts and experience diagnosing infrastructure-related issues.
- Working knowledge of Kubernetes in production environments.
- Experience supporting production infrastructure and services.
- Strong analytical and problem-solving skills.
- Experience working within structured operational and incident management processes.
- Excellent communication and collaboration skills.
- Ability to work within a shift-based operational environment.
Benefits:
- Work with some of the most advanced AI infrastructure environments in production today.
- Gain exposure to NVIDIA GPU technologies, Kubernetes platforms, and high-performance networking environments.
- Help define how next-generation AI infrastructure is operated and supported.
- Be part of a team shaping the future of AI-powered operations through k0rdent AI.
- Join a growing organization investing heavily in AI infrastructure and platform services.