Senior Site Reliability Engineer

Posted 6hrs ago

Job Description

Grupo SysMap is hiring a Senior Site Reliability Engineer to manage cloud infrastructure and improve system reliability, with a focus on automation, monitoring, and collaboration across teams.

Requirements:

  • Experience with cloud computing (AWS, GCP, OCI and/or Azure).
  • Strong knowledge of Linux and systems administration.
  • Experience with containerization, Kubernetes (k8s) and Helm.
  • Knowledge of infrastructure as code (e.g., Terraform / Terragrunt / CloudFormation).
  • Experience with CI/CD tools (e.g., Jenkins, GitHub Actions, Argo CD).
  • Experience with web servers (Apache, Nginx).
  • Scripting knowledge (Shell and/or Python).
  • Cloud certifications (AWS, Azure, OCI and/or GCP).
  • Knowledge of Grafana.
  • Previous experience in high-scale environments.

Responsibilities:

  • Ensure high availability, resilience and performance of production and non-production environments.
  • Administer, evolve and automate cloud infrastructure (AWS, GCP, OCI and Azure), ensuring best practices for cost, security and scalability.
  • Design, implement and manage containerized environments using Kubernetes.
  • Develop, version and maintain infrastructure as code (IaC) using tools like Terraform, CloudFormation or similar.
  • Analyze, diagnose and resolve complex incidents (troubleshooting), addressing root causes and preventing recurrence.
  • Implement and improve monitoring, observability and alerting strategies, proposing continuous improvements to system reliability.
  • Collaborate with development teams to adopt SRE, DevOps and reliability engineering practices, promoting a culture of automation and quality.
  • Define and monitor KPIs/SLOs/SLIs, ensuring alignment with business objectives.
  • Propose and lead continuous improvement initiatives, process automation and reduction of operational failures.
