SRE Systems Engineer
Posted 13hrs ago
Job Description
Join Salesforce's Site Reliability Engineering team to keep cloud services available and to manage incidents. Collaborate with R&D to improve system resilience and automation.
Responsibilities:
- Monitor customer-facing services and respond to Severity 0 (Sev0) and Severity 1 (Sev1) incidents, leading technical reviews and contributing to Root Cause Analyses (RCAs) handed off to the Global Solutions team.
- Automate the detection and resolution of recurring production issues to reduce engineering and operations toil.
- Contribute to compliance, resiliency, and self-healing initiatives including destructive testing and game day exercises.
- Partner with and mentor team members to stay current on industry technology and drive team development.
Requirements:
- Bachelor's degree in Computer Science, Information Systems, or a related technical field, or equivalent work experience.
- Experience in enterprise-scale internet service engineering or support, with strong Command Line Interface (CLI) knowledge of Unix variants including Red Hat Enterprise Linux.
- Expertise in Transmission Control Protocol/Internet Protocol (TCP/IP) networking technologies and protocols.
- Demonstrated experience with incident management and a solid understanding of IT Infrastructure Library (ITIL) service operations in a 24/7 environment.
- Proficiency writing scripts in Python, Go, or similar languages, with experience provisioning and operating Amazon Web Services (AWS) infrastructure.
Benefits:
- time off programs
- medical
- dental
- vision
- mental health support
- paid parental leave
- life and disability insurance
- 401(k)
- employee stock purchase program