Site Reliability Engineer
Posted 85 days ago
Job Description
The Site Reliability Engineer ensures the availability and performance of the customer-facing platform, collaborating closely with DevOps, DBA, and Development teams to provision and maintain infrastructure.
Responsibilities:
- Manage, monitor, and maintain highly available systems (Windows and Linux).
- Analyze metrics and trends to ensure rapid scalability.
- Address routine service requests while identifying ways to automate and simplify.
- Create infrastructure as code using Terraform, ARM templates, or CloudFormation.
- Maintain data backups and disaster recovery plans.
- Design and deploy CI/CD pipelines using GitHub Actions, Octopus, Ansible, Jenkins, or Azure DevOps.
- Adhere to security best practices through all stages of the software development lifecycle.
- Follow and champion ITIL best practices and standards.
- Become a resource for emerging and existing cloud technologies with a focus on AWS.
Requirements:
- 5+ years of experience in an SRE or System Administration role.
- Demonstrated ability to build and support high-availability Windows/Linux servers, with emphasis on the WISA stack (Windows/IIS/SQL Server/ASP.NET).
- 3+ years of experience with CI/CD tools.
- 3+ years of experience working with cloud technologies, including AWS and Azure.
- 1+ years of experience working with container technology including Docker and Kubernetes.
- Comfortable using Scrum, Kanban, or Lean methodologies.
- Hands-on experience with AWS is a must-have.
- Proficiency in analyzing application, IIS, system, and security logs, as well as CloudTrail events.
- Practical experience with CI/CD tools such as GitHub Actions, Jenkins, or Octopus.
- Experience with observability tools such as New Relic, Application Insights, AppDynamics, or Datadog.
- Experience maintaining and administering Windows, Linux, and Kubernetes.
- Experience in automation using scripting languages such as Bash, PowerShell, or Python.
- Configuration management experience using Ansible, Terraform, Azure Automation runbooks, or similar.
- Experience with SQL Server database maintenance and administration is preferred.
- Good understanding of networking (VNet, subnet, Private Link, VNet peering).