Senior MLOps Engineer

Posted 4 days ago

Job Description

MLOps Engineer in Campinas working with GenAI and agents in production. The role executes standard operating procedures and monitors AI systems to ensure reliability and performance.

Responsibilities:

  • Execute and follow established Standard Operating Procedures (SOPs) for GenAI and agent-based solutions in production
  • Monitor platform health, model performance, and inference pipelines
  • Ensure stability and availability of AI services across all environments
  • Investigate and resolve incidents by analyzing logs, traces, and metrics
  • Conduct root cause analysis (RCA) and document findings
  • Use observability tools (logs, metrics, tracing) to detect anomalies and performance issues
  • Contribute to the evolution of SOPs and runbooks
  • Support runtime operations of LLM-based applications and agent-driven workflows
  • Monitor inference performance (latency, throughput, cost)

Requirements:

  • Experience with MLOps, ML systems, or AI platform operations
  • Strong troubleshooting skills using logs and observability tools
  • Familiarity with cloud environments (e.g., Azure, AWS, GCP)
  • Understanding of ML pipelines, APIs, and distributed systems
  • Experience with monitoring tools (e.g., Datadog, Prometheus, Grafana, Azure Monitor)

Benefits:

  • Health insurance
  • Flexible working hours
  • Professional development opportunities