Staff Software Engineer – Cloud Platform, Kafka


Job Description

We are seeking a Staff Cloud Platform Engineer to lead cloud-based Kafka deployments on GCP and AWS, collaborating with teams across the company to enhance event-streaming infrastructure at a leading cloud service provider.

Responsibilities:

  • Design, provision, and manage Apache Kafka clusters (self-managed on GCP/AWS or via Confluent Platform / MSK).
  • Configure and tune brokers, ZooKeeper/KRaft, topics, partitions, replication factors, and retention policies for high throughput and low latency.
  • Perform cluster upgrades, rolling restarts, and broker replacements with zero downtime.
  • Implement and manage Kafka Connect pipelines for data ingestion and egress across heterogeneous systems.
  • Administer Kafka Streams and ksqlDB deployments for real-time stream processing workloads.
  • Maintain Schema Registry and enforce schema governance standards across teams.
  • Define and track SLIs/SLOs for consumer lag, throughput, end-to-end latency, and broker health.
  • Design and implement cloud infrastructure using Infrastructure as Code (Terraform).
  • Build automated deployment pipelines for Kafka configuration changes using GitOps workflows (ArgoCD, Flux).
  • Create self-service tooling and runbooks to reduce toil for development teams.
  • Automate topic provisioning, ACL management, and schema registration via APIs and CLI tooling.
  • Integrate tools such as GitLab CI/CD or Cloud Build for automated testing and deployment.
  • Ensure seamless integration of data pipelines with other GCP services such as BigQuery and Cloud Storage.
  • Monitor and optimize the performance, reliability, and cost of Kafka and streaming pipelines.
  • Implement security best practices for GCP resources, including IAM policies, encryption, and network security.
  • Ensure observability is an integral part of the infrastructure platforms, providing adequate visibility into their health, utilization, and cost.
  • Collaborate extensively with cross-functional teams to understand their requirements; educate them through documentation and training, and improve adoption of the platforms and tools.
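As a concrete illustration of one responsibility above, defining SLIs for consumer lag: per-partition lag is the log-end offset minus the consumer group's committed offset. The sketch below is a minimal, self-contained example; in practice the offsets would come from the Kafka admin API, and the topic names and SLO threshold here are hypothetical.

```python
def consumer_lag(end_offsets, committed_offsets):
    """Per-partition lag = log-end offset minus committed consumer offset.

    Keys are (topic, partition) tuples. A partition with no committed
    offset is treated as fully lagging from offset 0.
    """
    lag = {}
    for tp, end in end_offsets.items():
        committed = committed_offsets.get(tp, 0)
        lag[tp] = max(end - committed, 0)
    return lag


def breaches_slo(lag, max_lag_per_partition=10_000):
    """Return the partitions whose lag exceeds the (illustrative) SLO threshold."""
    return {tp: n for tp, n in lag.items() if n > max_lag_per_partition}


if __name__ == "__main__":
    end = {("orders", 0): 120_500, ("orders", 1): 98_000}
    committed = {("orders", 0): 120_400, ("orders", 1): 80_000}
    lag = consumer_lag(end, committed)
    print(lag)                # per-partition lag
    print(breaches_slo(lag))  # partitions over the SLO threshold
```

A check like this would typically run as an exported metric (e.g. scraped by Prometheus) rather than a script, with alerting tied to the SLO burn rate.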

Requirements:

  • 10+ years of overall experience in DevOps, cloud engineering, or data engineering.
  • 5+ years of experience running Kafka at production scale.
  • Deep expertise in Kafka internals: replication protocol, log compaction, consumer group coordination, partition leadership, and KRaft mode.
  • Proficiency with container orchestration (Kubernetes/Helm) and deploying Kafka via Strimzi, Confluent Operator, or equivalent.
  • Strong understanding of networking (VPC, peering, private endpoints, DNS, load balancing) in cloud environments.
  • Hands-on experience with Kafka Connect, Schema Registry, and at least one stream processing framework (Kafka Streams, Flink, Spark Structured Streaming).
  • Proficiency in Google Cloud Platform (GCP) services, including Dataflow, Pub/Sub, Kafka, Dataproc, BigQuery, and Cloud Storage.
  • Expertise in Infrastructure as Code (IaC) tools like Terraform or Cloud Deployment Manager.
  • Familiarity with data orchestration tools like Apache Airflow or Cloud Composer.
  • Experience with CI/CD tools like Jenkins, GitLab CI/CD, or Cloud Build.
  • Knowledge of containerization and orchestration tools like Docker and Kubernetes.
  • Strong scripting skills for automation (e.g., Bash, Python).
  • Experience with monitoring tools like Cloud Monitoring, Prometheus, and Grafana.
  • Familiarity with logging tools like Cloud Logging or ELK Stack.
  • Strong problem-solving and analytical skills.
  • Excellent communication and collaboration abilities.
  • Ability to work in a fast-paced, agile environment.

Benefits:

  • Health insurance
  • 401(k) matching
  • Flexible work hours
  • Paid time off