Senior Site Reliability Engineer, Observability

Posted 115 days ago

Job Description

Senior Site Reliability Engineer at Chainlink, focusing on observability and reliability for decentralized finance solutions. The role supports engineering teams and enhances self-service capabilities for development.

Responsibilities:

Build and orchestrate a modern OpenTelemetry (OTel)-based observability platform.
Support multiple telemetry types, such as metrics, logs, and traces.
Define and support modern governance for observability and for problems that arise at scale.
Ensure reliability, security, and performance exceed our defined SLAs.
Work with engineers from across the company to help troubleshoot issues, deploy new products and services, and increase velocity while decreasing cognitive load.
Lead the design and deployment of monitoring and observability services that detect issues and alert the team when action is needed.
Ingest, aggregate, transform, and use data from a multitude of sources in our real-time data pipeline.
Oversee the availability, performance, and supportability of our observability infrastructure.
Create processes around alert response operations and support the team to ensure the reliable delivery of oracle data.
Make recommendations to ensure sufficient metrics are collected to create alerts with every new feature release.
Champion reliability and security by taking the time to do your work right the first time.

Requirements:

7+ years of relevant professional experience. You have probably worked on a DevOps, infrastructure, SRE, and/or platform team before.
Ability to develop software outside the scope of typical infrastructure requirements and configurations.
Experience programming in C, C++, Java, Python, Go, Perl, or Ruby.
Expert knowledge in all aspects of designing, developing, and managing large real-time systems.
Experience with monitoring and logging. You know how to export metrics using Prometheus, have built a Grafana dashboard or two, and have experience with a centralized logging solution such as the ELK Stack, Splunk, or the Grafana stack.
Experience with distributed systems and container orchestration. You have maintained or even built Kubernetes clusters before and feel comfortable deploying completely new services on them.
Strong communication skills. You can give and receive constructive feedback, and you do not shy away from planning meetings and code reviews.

Benefits:

Health insurance
401(k) matching
Flexible work hours
Paid time off
Remote work options

Chainlink Labs
