Senior Site Reliability Engineer

Posted 67 days ago

Job Description

CertifID is hiring a Senior Site Reliability Engineer to improve reliability across its production SaaS systems, ensuring secure transactions and high performance.

Responsibilities:

  • Own and improve the reliability, availability, and performance of production systems while defining and operationalizing SLIs/SLOs and error budgets.
  • Design and implement autonomous and semi-autonomous AI agents for monitoring distributed systems and applications. Build agents capable of consuming multi-source observability data (metrics, logs, traces, etc.).
  • Participate in and help lead an on-call rotation, serving as an escalation point for major incidents and facilitating blameless postmortems.
  • Build automated workflows to eliminate manual work and design/maintain Infrastructure-as-Code with Terraform.
  • Improve metrics, logs, traces, and alerting using tools like Datadog or Prometheus to reduce noise and increase signal.
  • Partner with application teams to implement reliability best practices and mentor junior engineers to foster a culture of knowledge sharing.

Requirements:

  • 5+ years in SRE, DevOps, Platform Engineering, or Infrastructure Engineering.
  • Proven experience supporting production SaaS systems in Azure (preferred), AWS, or GCP.
  • Strong Linux, networking, and distributed systems troubleshooting skills.
  • Strong experience with containers and orchestration (Kubernetes/EKS/AKS).
  • Expertise with Infrastructure-as-Code (Terraform strongly preferred).
  • Strong scripting/programming skills in Python, Go, Bash, or C#/.NET.
  • Hands-on experience with Datadog, Prometheus/Grafana, or OpenTelemetry.

Benefits:

  • Flexible vacation
  • 12 company-paid holidays
  • 10 paid sick days
  • No work on your birthday
  • Health, dental, and vision insurance (including a $0 option)
  • 401(k) with matching and no waiting period
  • Equity
  • Life insurance
  • Generous paid parental leave
  • Wellness reimbursement of $300/year
  • Remote worker reimbursement of $300/year
  • Professional development reimbursement
  • Competitive pay
  • An award-winning culture