Senior Site Reliability Engineer
Posted 15hrs ago
Job Description
As a Senior Site Reliability Engineer, you will ensure the performance and scalability of Runlayer’s AI infrastructure, collaborating with founders and engineers in a fast-paced environment to support cloud and on-prem setups.
Responsibilities:
- Own reliability and performance of our cloud infrastructure across AWS (ECS, Aurora, CloudWatch) and GCP
- Manage and optimize Kubernetes clusters and container orchestration
- Drive database reliability engineering, including performance tuning and scaling
- Build and maintain CI/CD pipelines for rapid, safe deployments
- Run incident response and on-call rotations
- Partner with product engineers to design scalable, resilient systems
Requirements:
- Strong AWS experience, particularly ECS, Aurora, and CloudWatch
- GCP experience as we expand cross-cloud
- Kubernetes and container orchestration expertise
- DBRE experience with database performance tuning
- CI/CD pipeline ownership and incident response experience
- Background at a B2B SaaS company serving enterprise customers, ideally in infrastructure
- Bonus qualifications: experience deploying and supporting on-prem or hybrid environments; Python backend familiarity (our platform is Python-based); experience at an early-stage or high-growth company
Benefits:
- Competitive salary and equity — compensation that reflects your expertise and customer-facing responsibilities.
- Paid time off — 4 weeks paid vacation, paid sick leave, and paid parental leave.
- Professional development — budget for conferences, courses, and certifications in AI, enterprise software, and customer success.
- Top-tier equipment — your choice of laptop and accessories to create your ideal work environment.
- Health benefits — comprehensive health, dental, and vision coverage.
- Customer interaction opportunities — work directly with innovative companies and see the immediate impact of your work.