Senior Infra Engineer – Observability
Posted 3 days ago
Job Description
Build observability systems that handle telemetry data streams at Railway. Ensure system resilience and scalability while collaborating across teams.
Responsibilities:
- Build ingestion pipelines to consume 1M+ RPS streams of logs, metrics, and other telemetry
- Build scalable, fault-tolerant alerting engines that notify users of threshold breaches in real time
- Craft rich backend observability APIs, working with product to build amazing experiences
- Provide APIs for accessing real-time log and metrics streams, to be consumed by the Dashboard and Product Teams
- Build Golang/Rust gRPC services from scratch capable of supporting tens of thousands of users
- Define infrastructure that can be torn down, failed over, and reconstituted from scratch, following the principles of immutable infrastructure and using Terraform and Ansible
- Write Engineering Requirement Documents to take something from idea to success
- Interface with our TypeScript and GraphQL edge to expose your microservice APIs for both internal and potentially external consumption
Requirements:
- A strong understanding of distributed systems
- Interest in VictoriaMetrics, ClickHouse, and other systems for building observability stacks from the ground up
- A solid intuition about how long your solutions will last
- The tact to implement your solution, create monitors for its error boundaries, and document any requirements for when you’re not around
- A great sense of direction and prioritization when it comes to dealing with the ambiguity of an early-stage startup
- A sense of grit to dive into a problem, implement a solution, scale that solution, and replace it when needed
- Strong communication skills for getting your point across, getting your solution implemented, and beyond
Benefits:
- Full health benefits, including coverage for dependents
- Strong equity grants
- Equipment stipend