Senior Cloud Platform Engineer
Posted 1 day ago
Job Description
Senior Cloud Platform Engineer at Mapbox, delivering cloud-native infrastructure solutions. Responsible for managing AWS services, maintaining CI/CD tooling, and ensuring system reliability.
Responsibilities:
- Actively onboard AWS resources to the declarative, GitOps-based framework built on Terraform and Terragrunt.
- Maintain and troubleshoot legacy AWS infrastructure that is deployed with CloudFormation/CDK and uses ECS, Lambda, EMR, etc.
- Architect and promote Kubernetes deployments for new services.
- Lead the migration of deployment pipelines from ECS and CloudFormation to EKS and ArgoCD.
- Architect a centralized CI pipeline framework utilizing GitHub Actions and Runs-on.
- Broadly influence and lead the Mapbox Cloud Platform strategy around AWS architecture and open-source tools and frameworks.
- Configure and maintain a comprehensive observability platform, such as Datadog or Observe, to enable real-time monitoring, alerting, and analytics.
- Promote a culture of operational excellence by testing and monitoring our systems and code, and providing on-call support for the platform services.
- Document your work and decision-making processes, and lead presentations and discussions in a way that is easy for others to understand.
- Uphold a culture of collaboration, transparency, creativity, inclusion, and data-driven decisions.
Requirements:
- 5+ years of experience leveraging infrastructure-as-code frameworks to manage AWS infrastructure using Terraform, Terragrunt, Atlantis, CDK
- 4+ years of experience orchestrating containerized workloads at scale using EKS, ECS
- 4+ years of experience managing scalable CI/CD frameworks in a distributed engineering organization using GitHub Actions
- Strong expertise with Kubernetes, ArgoCD, Istio
- Proven ability to design and develop cost-efficient, secure, and durable solutions on AWS using EKS, ECS, EC2, Lambda, Fargate, CloudFront, IAM, Route53, DynamoDB
- Proficient in at least one programming language, such as Python, Node.js, or Go
- Experience configuring and managing observability systems in a distributed large-scale environment using Datadog, CloudWatch, or similar
- Experience with incident response practices including blameless post-mortems and resilience engineering concepts
- A desire to share your expertise through documentation, mentorship, and both written and verbal discussion
- Ability to work asynchronously and independently with minimal supervision, lead by example, and make decisions based on priorities and business goals
Benefits:
- Health insurance
- Retirement plans
- Paid time off
- Flexible work arrangements
- Professional development opportunities