Machine Learning Operations Engineer
Posted 94 days ago
Job Description
The MLOps Engineer owns the full lifecycle of data and model pipelines for healthcare technology, collaborating across teams to keep ML systems reliable in production environments.
Responsibilities:
- Design, build, and maintain scalable data pipelines supporting model training, inference, batch processing, and real-time analytics workflows.
- Monitor production ML pipelines to identify anomalies, performance degradations, or failures related to data quality, logic defects, or infrastructure issues.
- Execute rapid troubleshooting and root-cause analysis followed by timely remediation, validation, and full regression testing prior to redeployment.
- Collaborate with Data Science, Engineering, and Product teams to operationalize machine learning models (including LLM-based and MCP-orchestrated systems) and integrate them into production environments.
- Develop CI/CD workflows, model deployment strategies, and automated testing frameworks to support reliable, repeatable releases.
- Implement and maintain observability tooling (logging, monitoring, alerting) to ensure high availability and traceability of ML systems.
- Manage and optimize cloud infrastructure across Azure and AWS for compute, storage, orchestration, and security needs.
- Create and maintain documentation, runbooks, and best practices for model operations and system maintenance.
- Perform all other job-related duties as assigned.
Requirements:
- Bachelor’s Degree in Computer Science, Engineering, Data Science, or equivalent work experience.
- 5–7 years of combined experience in Data Science, MLOps, Machine Learning Engineering, or related fields.
- Advanced proficiency in Python, Jupyter, and common ML/analytics frameworks.
- Hands-on experience with Snowflake or similar cloud data warehousing environments.
- Strong working knowledge of both Azure and AWS cloud platforms, including compute orchestration, networking, and security best practices.
- Demonstrated experience operationalizing traditional ML models as well as LLM-based and MCP-orchestrated systems.
- Experience with CI/CD tools, containerization (Docker), infrastructure-as-code, and ML pipeline frameworks.
- Strong ability to diagnose and resolve pipeline failures, data anomalies, and complex system issues.
- Excellent problem-solving skills, attention to detail, and a proactive, self-directed work ethic.
- Strong communication skills and comfort working in fast-paced, cross-functional environments.
Benefits:
- Health insurance
- 401(k) plan
- Professional development opportunities