Senior Solutions Architect – Infiniband, Networking, Ethernet
Posted 1 hour ago
Job Description
Senior Solutions Architect specializing in building AI/HPC infrastructure for NVIDIA. Focused on operational reliability and lifecycle improvement for large-scale networking projects.
Responsibilities:
- Build AI/HPC infrastructure for new and existing customers.
- Support operational and reliability aspects of large-scale AI clusters, focusing on performance at scale, real-time monitoring, logging, and alerting.
- Engage in and improve the whole lifecycle of services—from inception and design through deployment, operation, and refinement.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
- Provide feedback to internal teams by opening bugs, documenting workarounds, and suggesting improvements.
Requirements:
- BS/MS/PhD or equivalent experience in Computer Science, Electrical/Computer Engineering, Physics, Mathematics, or related fields.
- At least 8 years of professional experience in networking fundamentals, the TCP/IP stack, and data center architecture.
- Proficiency in configuring, testing, validating, and resolving issues in LAN and InfiniBand networks, especially in medium to large-scale HPC/AI environments.
- Advanced knowledge of EVPN, BGP, OSPF, and VXLAN.
- Hands-on experience with network switch/router platforms like Cumulus Linux, SONiC, IOS, Junos OS, and EOS.
- Extensive experience delivering automated network provisioning solutions using tools like Ansible, Salt, and Python.
- Ability to develop CI/CD pipelines for network operations.
- Strong focus on customer needs and satisfaction.
- Strong written, verbal, and listening skills in English and Japanese are essential.
Benefits:
- Health insurance
- Professional development opportunities