NVIDIA is seeking a hands-on Senior Network Performance Engineer to join our Networking Insights team. This role is for an investigative engineer who will thrive in our diagnostics lab while also solving the most complex performance challenges in AI data centers.
We're looking for an engineer to investigate hardware behaviors, particularly the subtle phenomena that emerge when running demanding AI workloads at scale. If your passion is deep investigative work to understand complex behaviors, this role will place you at the forefront of the AI revolution.
What You'll Be Doing
Experimental Root Cause Analysis: Design and build targeted experiments from the ground up to replicate complex hardware behaviors observed in AI data centers. You will then analyze the results to hunt down the root cause of these behaviors, whether in silicon, firmware, or software.
Hands-On Lab Investigation: Spend your time in the lab, working directly with the most advanced networking ASICs and systems to profile performance and characterize behavior under stress.
Test Automation and Development: Write and debug advanced automation scripts (Python) to programmatically control traffic generators (e.g., IXIA) and manipulate the test environment to expose corner-case issues.
Lab Environment Management: Maintain and support our lab environment, including equipment setup and racking, procurement, inventory, and coordinating maintenance and upgrades to support ongoing investigations.
What We Need to See
B.Sc. in Engineering/Computer Science or equivalent experience, with a strong foundation in hardware-software interaction.
5+ years of deep hands-on lab experience focused on hardware validation, testing, and performance tuning.
Strong proficiency in Python for test automation, hardware diagnostics, and data analysis.
Familiarity with basic networking concepts (Ethernet, routing) and large-scale network design.
Proven ability to collaborate effectively with multi-functional teams, including hardware, software, and architecture groups.
Curiosity and a problem-solving approach, driven to understand how things work at the fundamental level.
Ways to Stand Out From the Crowd
Expertise in Ethernet protocols, L2/L3 routing, and large-scale data center network topologies.
Proficiency in scripting traffic generation tools (e.g., IXIA, Spirent) to compose intricate traffic scenarios, rather than simply running pre-existing scripts.
Expertise in validating and stress-testing network systems at the component level (e.g., NICs, switches), with a focus on hardware diagnostics beyond standard protocol testing.
Familiarity with the unique network architectures and operational challenges of large-scale AI, HPC, or hyperscale data center environments (e.g., RDMA/RoCE, congestion control, high-radix fabrics).
Hands-on experience configuring and managing data center network equipment.
NVIDIA has some of the most forward-thinking and hardworking people in the world working for us, and due to unprecedented growth, our world-class engineering teams are growing fast. If you're a creative and autonomous engineer with a real passion for technology, we want to hear from you.
We are committed to fostering a diverse work environment and are proud to be an equal-opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, perform essential job functions, and receive other benefits and privileges of employment. Please contact us to request accommodation.