Senior Site Reliability Engineer
Posted 40 days ago
Job Description
The Senior Site Reliability Engineer at SS&C Technologies manages resilient infrastructure platforms and services, collaborating across teams to enhance application reliability and automation in financial services.
Responsibilities:
- Collaborate with Technology Infrastructure teams to build and operate reusable, cloud-native platforms that abstract complexity and accelerate delivery while incorporating reliability from design through operations.
- Work with business units and technical teams to improve application availability, observability, and reliability as our business applications are migrated to the Private Cloud.
- Enhance platform reliability through automatic problem detection, self-healing systems, and well-architected notification and escalation protocols.
- Use SLOs, SLIs, and KPIs to guide prioritization, measure impact, and drive continuous improvement.
- Eliminate toil using intelligent automation and agentic workflows.
- Conduct blameless retrospectives and share learnings across the organization.
- Foster a culture of ownership, positive thinking, and continuous learning while remaining grounded in practicality, experimentation, and engineering excellence.
- Integrate DevSecOps, zero-trust principles, and policy-as-code into every pipeline.
- Produce and promote Architecture Decision Records (ADRs) and Cloud Well-Architected Frameworks that our business units can adopt to improve our technology standardization.
- Maintain 24x5 active coverage with seamless regional handoffs and weekend escalation protocols.
Requirements:
- 5+ years of professional experience in an SRE role, with 3+ years in financial services or other regulated industries preferred.
- Bachelor’s degree or higher in Computer Science, Engineering, or a related field.
- Proven expertise in architecting, designing, and operating private cloud environments (e.g., VMware, OpenStack, OpenShift Virtualization) and Kubernetes clusters at scales ranging from small to global.
- Hands-on experience with building, deploying, and operating infrastructure as code platforms, CI/CD pipelines, and observability platforms (e.g., Prometheus, Splunk).
- Strong understanding of modern systems reliability standards and practices, including establishing KPIs, monitoring and reporting on SLAs and SLOs, and filtering out noise to surface actionable insights.
- Familiarity with various financial services regulatory frameworks and their impact on infrastructure design and operations.
- Familiarity with structured naming conventions and asset management for global infrastructure.
- Experience with financial-grade network segmentation, micro-segmentation, and zero-trust architecture.
- Certifications such as TOGAF, AWS Certified Solutions Architect, VMware VCP, or Red Hat Certified Architect are a plus.
- Familiarity with ISO 27001, NIST 800-53, and other security frameworks is a plus.
Benefits:
- Flexibility: Hybrid Work Model & a Business Casual Dress Code, including jeans
- 401k Matching Program, Professional Development Reimbursement
- Flexible Personal/Vacation Time Off, Sick Leave, Paid Holidays
- Medical, Dental, Vision, Employee Assistance Program, Parental Leave
- Discounts on fitness clubs, travel, and more!