Site Reliability Engineer

Posted 85 days ago

Job Description

The Site Reliability Engineer ensures the availability and performance of a customer-facing platform, collaborating closely with the DevOps, DBA, and Development teams to provision and maintain infrastructure.

Responsibilities:

  • Manage, monitor, and maintain highly available systems (Windows and Linux).
  • Analyze metrics and trends to ensure rapid scalability.
  • Address routine service requests while identifying ways to automate and simplify.
  • Create infrastructure as code using Terraform, ARM templates, and CloudFormation.
  • Maintain data backups and disaster recovery plans.
  • Design and deploy CI/CD pipelines using GitHub Actions, Octopus, Ansible, Jenkins, and Azure DevOps.
  • Adhere to security best practices through all stages of the software development lifecycle.
  • Follow and champion ITIL best practices and standards.
  • Become a resource for emerging and existing cloud technologies with a focus on AWS.

Requirements:

  • 5+ years of experience in an SRE or System Administration role.
  • Demonstrated ability to build and support high-availability Windows/Linux servers, with emphasis on the WISA stack (Windows/IIS/SQL Server/ASP.NET).
  • 3+ years of experience with CI/CD tools.
  • 3+ years of experience working with cloud technologies, including AWS and Azure.
  • 1+ years of experience working with container technology including Docker and Kubernetes.
  • Comfortable using Scrum, Kanban, or Lean methodologies.
  • Hands-on experience with AWS is a must-have.
  • Proficiency in analyzing application, IIS, system, and security logs, as well as CloudTrail events.
  • Practical experience with CI/CD tools such as GitHub Actions, Jenkins, or Octopus.
  • Experience with observability tools such as New Relic, Application Insights, AppDynamics, or Datadog.
  • Experience maintaining and administering Windows, Linux, and Kubernetes.
  • Experience in automation using scripting languages such as Bash, PowerShell, or Python.
  • Configuration management experience using Ansible, Terraform, Azure Automation runbooks, or similar.
  • Experience with SQL Server database maintenance and administration is preferred.
  • Good understanding of networking (VNet, subnet, Private Link, VNet peering).