SRE Systems Engineer

Posted 13hrs ago

Job Description

Join Salesforce's Site Reliability Engineering team to keep cloud services available and to manage incidents. Collaborate with R&D to improve system resilience and automation.

Responsibilities:

  • Monitor customer-facing services and respond to Severity 0 (Sev0) and Severity 1 (Sev1) incidents, leading technical reviews and contributing to Root Cause Analyses (RCAs) handed off to the Global Solutions team.
  • Automate the detection and resolution of recurring production issues to reduce engineering and operations toil.
  • Contribute to compliance, resiliency, and self-healing initiatives including destructive testing and game day exercises.
  • Partner with and mentor team members to stay current on industry technology and drive team development.

Requirements:

  • Bachelor's degree in Computer Science, Information Systems, or a related technical field, or equivalent work experience.
  • Experience in enterprise-scale internet service engineering or support, with strong Command Line Interface (CLI) knowledge of Unix variants including Red Hat Enterprise Linux.
  • Expertise in Transmission Control Protocol/Internet Protocol (TCP/IP) networking technologies and protocols.
  • Demonstrated experience with incident management and a solid understanding of IT Infrastructure Library (ITIL) service operations in a 24/7 environment.
  • Proficiency writing scripts in Python, Go, or similar languages, with experience provisioning and operating Amazon Web Services (AWS) infrastructure.

Benefits:

  • time off programs
  • medical
  • dental
  • vision
  • mental health support
  • paid parental leave
  • life and disability insurance
  • 401(k)
  • employee stock purchase program