Senior Site Reliability Engineer

Posted 22hrs ago

Job Description

As a Senior Site Reliability Engineer at Order.co, you will ensure that software systems are reliable and scalable, collaborating with the Platform team while maintaining operational efficiency and infrastructure excellence.

Responsibilities:

  • Ensure software systems are reliable, scalable, performant, and operationally efficient
  • Design, build, and operate highly available, scalable, and fault-tolerant infrastructure and platform services
  • Define and maintain service level objectives (SLOs), service level indicators (SLIs), and error budgets across platform systems
  • Lead incident response efforts for complex production outages; drive root-cause analysis and long-term remediation actions
  • Develop infrastructure automation and self-service tooling to reduce operational toil and improve engineering velocity
  • Build and maintain CI/CD pipelines, deployment automation, and release engineering workflows
  • Design and maintain comprehensive monitoring, logging, tracing, and alerting systems for distributed services

Requirements:

  • Strong foundation in computer science fundamentals: data structures, algorithms, and system design
  • Familiarity with building production-grade applications and services using Ruby and Ruby on Rails
  • Deep expertise with Linux systems administration and production troubleshooting
  • Strong experience operating cloud infrastructure at scale, particularly within AWS environments
  • Experience with Kubernetes, container orchestration, and cloud-native infrastructure patterns
  • Proficiency with infrastructure-as-code tools such as Terraform or CloudFormation
  • Expertise in designing and operating CI/CD pipelines and deployment automation systems
  • Deep understanding of observability tooling including Datadog, OpenTelemetry, or similar platforms
  • Strong knowledge of distributed systems reliability patterns including redundancy, failover, autoscaling, rate limiting, and graceful degradation
  • Experience supporting distributed microservices architectures and event-driven systems

Benefits:

  • Competitive compensation including base salary, bonus, and equity
  • Employer-sponsored 401(k) with match
  • Comprehensive medical, dental, and vision coverage
  • Flexible time off and hybrid work environment