Principal Systems Engineer – HPC/AI System Administrator, Multi-discipline Expert

Posted 1 day ago

Job Description

GDIT is seeking a Principal Systems Engineer to sustain NASA’s HPC systems for operational weather and climate forecasts. The role collaborates with interdisciplinary teams to enhance forecasting capabilities.

Responsibilities:

  • Lead, manage, and support daily HPC system operations and the reliability of scheduling and system software
  • Collaborate with GDIT HPC engineers, system administrators, developers, and NWS operational staff
  • Drive improvements in system efficiency, scheduler reliability, and operational resiliency
  • Apply Linux system administration, HPC scheduler expertise, scripting languages, and performance monitoring tools

Requirements:

  • 15+ years of related experience
  • US Citizenship Required
  • Bachelor of Arts or Bachelor of Science degree
  • Linux system administration (Rocky/SLES preferred)
  • Experience with HPC batch schedulers (PBS Pro, Slurm, or similar)
  • Scripting abilities (Bash, Python, Perl)
  • Understanding of HPC architectures, distributed computing, and MPI-based workloads
  • Troubleshooting skills across multi-node HPC environments

Benefits:

  • Health insurance
  • 401(k) with company match
  • Comprehensive benefits and wellness packages
  • Paid time off including vacation, sick and personal time
  • 15 days of paid leave plus 10 paid holidays per year
  • Paid parental, military, bereavement, and jury duty leave
  • Short- and long-term disability benefits
  • Life and accident insurance