Senior GenAI, High Performance Computing Delivery Engineer

Posted 1 day ago

Job Description

Dell Technologies is seeking a Senior GenAI and HPC Engineer to deploy GPU-accelerated compute clusters for AI and ML. The role involves collaborating with team members and customers to deliver cutting-edge HPC solutions.

Responsibilities:

  • Deploy, configure, and validate GPU-accelerated compute clusters for AI, ML, and HPC with NVIDIA Base Command Manager (knowledge of Warewulf and OpenHPC is a plus)
  • Perform benchmarking with HPL (GPU), HPL-MxP, STREAM, NCCL, RCCL, OSU Micro-Benchmarks, and related tools
  • Produce as-built documentation and performance reports, and share best practices with the team
  • Configure and secure RHEL, Ubuntu, and Rocky Linux for GenAI or HPC workloads
  • Work directly with customers onsite (travel both regionally and across the U.S.)

Requirements:

  • 7+ years of experience with HPC or GenAI clusters, GPU-based systems, AI infrastructure, or related fields
  • Deep hands-on experience with GPU deployment, configuration, and multi-node testing using NVIDIA Base Command Manager
  • Proficiency with benchmarking tools: HPL, HPL-MxP, STREAM, NCCL, RCCL, OSU Micro-Benchmarks
  • Red Hat certification (RHCSA/RHCE) or 7+ years of relevant experience with Red Hat-based distributions
  • Experience with GenAI/HPC networking (InfiniBand and/or RoCE)
  • Experience working in Linux-based parallel computing environments at scale
  • Experience with containers, orchestration, and workload scheduling (Docker, Singularity/Apptainer, Kubernetes, Slurm)
  • Ability to travel up to 70% of the time across the U.S. as needed for projects
  • Strong customer-facing and communication skills

Benefits:

  • Your life. Your health. Supported by your benefits. You can explore the overall benefits experience that awaits you as a Dell Technologies team member at MyWellatDell.com