Reliability Engineer
Posted 4hrs ago
Job Description
GDIT is seeking a Reliability Engineer to design and maintain infrastructure for federal government services. The role involves collaborating with teams to enhance system reliability and automate deployment processes.
Responsibilities:
- Design, build, and maintain scalable, reliable infrastructure and services that support Hosting Services, Site Reliability Engineering, virtualization, and data center operations for a federal customer.
- Collaborate closely with software developers, infrastructure engineers, and IT operations teams to plan and execute deployments, improve system architectures, and enhance service reliability.
- Use automation and scripting (e.g., Python, Bash) to reduce manual work, streamline deployments, and improve consistency across environments.
- Monitor system performance, availability, and capacity using modern tooling; proactively identify issues and participate in on-call support to restore services quickly when incidents occur.
- Implement and support Continuous Integration/Continuous Delivery (CI/CD) pipelines using tools such as Jenkins, Git, and Terraform to enable reliable and repeatable releases.
- Leverage containerization and orchestration technologies such as Docker and Kubernetes to build resilient, scalable platforms.
- Work with databases (e.g., SQL, MySQL) and application stacks (e.g., Java-based services) to ensure data integrity, performance, and fault tolerance.
- Partner with cross-functional teams, using Jira and other collaboration tools, to track work, communicate status, and drive continuous improvement in reliability and operational excellence.
- Contribute to a culture of teamwork and collaboration by sharing knowledge, participating in post-incident reviews, and helping define best practices for reliability engineering.
Requirements:
- 5+ years of related experience in Site Reliability Engineering, DevOps, systems engineering, or software engineering roles
- Experience with deployments and production operations in Linux-based environments
- Proficiency with scripting/coding (e.g., Python, Java, shell scripting)
- Hands-on experience with AWS or other cloud platforms
- Strong Linux administration skills
- Experience with SQL/MySQL and database concepts
- Experience with containerization and orchestration (Docker, Kubernetes)
- Experience with CI/CD and automation tools (Jenkins, Git, Terraform, Ansible)
- Experience with Infrastructure as Code (IaC) and automated configuration management
- Must have a BA/BS or equivalent
Benefits:
- Comprehensive benefits and wellness packages
- 401(k) with company match
- Paid time off
- Full flex work weeks where possible
- Variety of paid time off plans, including vacation, sick and personal time, holidays, and paid parental, military, bereavement, and jury duty leave
- 15 days of paid leave per calendar year
- 10 additional paid holidays per year
- Short and long-term disability benefits
- Life, accidental death and dismemberment, personal accident, critical illness, and business travel and accident insurance