Senior Site Reliability Engineer

Posted 2hrs ago

Job Description

As a Senior Site Reliability Engineer at Kraken, you will manage infrastructure for Payward Services and work with global teams to keep services performant, resilient, and continuously improving.

Responsibilities:

  • Manage and support infrastructure for Payward Services, including Nomad, Kubernetes, databases, and third-party system integrations
  • Provide operational support across multiple teams, helping debug issues in staging and production environments
  • Participate in incident response and post-incident reviews to improve system resilience
  • Consult with teams on performance, monitoring, and alerting best practices, with awareness of partner-facing SLA commitments
  • Build tooling, automation, and dashboards to improve observability and empower development teams
  • Maintain and troubleshoot CI pipelines, ensuring reliable and fast build, test, and deployment cycles
  • Collaborate with developers, QA, and product managers to streamline development and release cycles
  • Support a fully distributed team operating across multiple timezones

Requirements:

  • 5+ years in a DevOps or SRE role
  • Proficiency with hybrid-cloud infrastructure environments
  • Proficiency with Git version control and CI/CD configuration
  • Deep understanding of monitoring and alerting systems, preferably Prometheus and Grafana
  • Ability to debug complex issues in distributed systems, networks, and Linux operating systems
  • Containerization and orchestration experience (Docker, Nomad, Kubernetes a plus)
  • Strong scripting skills (Bash, Python, or Go)
  • Self-starter capable of thriving independently and remotely in fast-paced environments

Benefits:

  • Offers Equity
  • Offers Bonus