Software Engineer – Site Reliability Engineer

Posted 77 days ago

Job Description

As a Site Reliability Engineer, you will manage production systems and ensure availability for Alkira's Cloud Networking platform. You will work with innovative technologies and provide 24x7 infrastructure support as part of a dynamic engineering team.

Responsibilities:

  • You will be responsible for the availability and integrity of the infrastructure that underpins Alkira’s Cloud Networking platform
  • You hold the production systems together and troubleshoot issues that arise in production deployments
  • Provide 24x7 coverage as part of a scheduled shift and on-call rotation
  • Work with tools such as Prometheus, Grafana, and Jira to monitor, manage, triage, and document infrastructure issues in real time
  • Automate infrastructure deployment using CI/CD
  • Build necessary tools to evolve how we maintain and monitor our solution
  • Develop and execute system and integration test plans

Requirements:

  • At least 2 years’ experience in management of production systems
  • Self-starter with a solution-oriented mindset. You see potential challenges as opportunities to learn and grow
  • Experience with at least one cloud provider: AWS, Azure, or GCP
  • Experience with computer networking and network technologies
  • Experience with CI/CD pipelines and tools such as Concourse-CI or Jenkins
  • Experience with Kubernetes
  • Excellent problem-solving skills and ability to quickly grasp new concepts
  • Highly desirable: HashiCorp Certified: Terraform Associate certification

Benefits:

  • Health insurance
  • Professional development opportunities