Senior Site Reliability Engineer
Posted 123 days ago
Job Description
DexCare is hiring a Senior Site Reliability Engineer to optimize healthcare systems, with a focus on building scalable, secure systems and improving automation.
Responsibilities:
- Design, scale, and operate resilient, cloud-native infrastructure in AWS — with a strong emphasis on EKS, IAM, RBAC, and modern security-first practices.
- Build and optimize CI/CD pipelines with GitHub Actions and GitHub Advanced Security — enabling velocity without compromising safety.
- Own observability across the stack using Datadog (metrics, logging, alerting, and tracing).
- Write and maintain Terragrunt and Terraform modules, along with infrastructure-as-code automation.
- Develop internal tools and scripts in Python to automate operational workflows and reduce manual overhead.
- Document everything, from runbooks to standards, so teams stay aligned and systems stay stable.
- Actively contribute to Agile workflows using Jira, with clear tracking of work, priorities, and progress.
- Participate in on-call rotations, postmortems, and continuous improvement efforts — always with a blameless, team-first mindset.
Requirements:
- 4+ years in a Senior SRE or DevOps role supporting production cloud infrastructure at scale.
- Deep experience with AWS (IAM, EKS, VPC, EC2, Secrets Manager, Serverless) and RBAC.
- Hands-on proficiency with Terraform, Terragrunt, Helm, and container orchestration.
- Proven experience building and maintaining GitHub Actions for CI/CD, including GitHub Advanced Security features like secret scanning and code policy enforcement.
- Strong Datadog experience — building dashboards, tuning alerts, setting up monitors, and interpreting telemetry.
- Solid Python scripting experience for automation and internal tools.
- You value clear, accurate documentation as a core part of engineering, not an afterthought.
- Comfortable working in Agile/Scrum environments with well-tracked Jira workflows.
- Practical experience with resource analysis and infrastructure optimization.
Benefits:
- Eligible for Annual Bonus
- Healthcare benefits, short/long-term disability coverage, life insurance, and 401(k)
- Paid Parental Leave
- Nine paid holidays & Unlimited PTO