CD Operations Engineer
Posted 53mins ago
Job Description
As a Site Reliability Engineer, you will manage and scale a production Kubernetes platform for innovative companies, with a focus on automation, CI/CD pipelines, and operational excellence.
Responsibilities:
- Maintain and optimise CI/CD pipelines to ensure deployment readiness and validate all deployment artifacts from an operational perspective.
- Define and enforce quality assurance measures, including standard operating procedures and reporting of test results.
- Implement rollback strategies and comprehensive operational monitoring for all production deployments.
- Manage monitoring, incident, problem, and change management within a multi-tenant managed Kubernetes environment.
- Monitor system health, performance metrics, and service availability, resolving incidents to minimise service disruption.
- Perform root cause analysis and implement corrective and preventive actions to enhance platform stability.
- Automate recurring operational tasks and critical processes to reduce toil and improve service reliability.
- Validate automated procedures through the full software development lifecycle, including staging and testing.
- Implement logging and monitoring strategies to adhere to security and audit compliance standards.
- Conduct routine security scans and remediate vulnerabilities across the platform.
Requirements:
- Professional proficiency in both English and German (C1 level minimum)
- At least 3 years of hands-on operational experience with self-managed Kubernetes clusters and production applications in on-premises environments
- Deep understanding of networking concepts, including protocols, load balancing, and security
- Extensive experience with CI/CD processes and tooling, such as GitLab, Jenkins, Tekton, or ArgoCD
- Fundamental understanding of core operations processes including incident, change, and problem management (ITSM) alongside SRE concepts
- Experience gathering operational insights from monitoring and observability tools, including managing SLIs, SLOs, and SLAs
- Proven ability to document procedures and enforce clear runbooks or playbooks
- Practical experience with monitoring and logging stacks such as Prometheus, Grafana, Mimir, or Loki
Benefits:
- Flexible working hours
- Freedom to choose your own projects
- Access to exciting projects in various industries
- Competitive pay
- Dedicated team support