NVIDIA is now looking for a Principal Software Architect in our federated AI group. An AI model is only as good as the data it was trained on, so large datasets are necessary to extract complex and predictive patterns. In contrast to the classical centralized training approach, federated AI is a privacy-preserving, distributed learning paradigm that tackles the challenges of learning from data in a decentralized way.
We believe federated AI will drive transformative changes across industries like healthcare, financial services, scientific computing, and government. Imagine developing AI solutions that enable hospitals to collaborate on life-saving research without compromising patient privacy, or financial institutions to enhance fraud detection while safeguarding sensitive data. Envision edge AI applications that revolutionize autonomous driving, creating vehicles that learn and adapt in real time. As a Principal Architect on the NVIDIA FLARE team (https://developer.nvidia.com/flare), you will advance federated AI with innovative decentralized AI solutions. Your work will drive the next wave of innovation, empowering organizations to unlock the full potential of their data while maintaining the highest standards of security and privacy.
What you’ll be doing:
Collaborate with AI researchers and industry leaders to architect future-proof federated computing infrastructure.
Lead the design and development of cutting-edge federated computing systems that enable secure, large-scale distributed AI.
Optimize platform performance, scalability, security, privacy, reliability, and ease of use for enterprise deployments.
Drive architectural innovations that accelerate federated learning adoption across research communities and industries worldwide.
Mentor junior engineers and establish engineering excellence through best practices, fostering a culture of continuous innovation.
What we need to see:
BS, MS, or PhD in Computer Science, Electrical Engineering, or a related field (or equivalent experience).
15+ years of experience in distributed systems development and architecture.
Proven track record of designing, developing, and evangelizing distributed software platforms.
Deep knowledge of distributed systems: design patterns, workflows, networking, and implementation.
Expert-level programming skills in Python and C++.
Strong expertise in networking and enterprise security.
Excellent system and API design capabilities.
Strategic technical leadership with the ability to drive long-term vision beyond tactical execution.
Outstanding communication and collaboration skills.
Self-motivated, with a proven ability to deliver high-quality solutions independently.
Experience contributing to major open-source projects is a plus.
Experience with federated learning frameworks (FLARE, Flower, OpenFL, FedML, PySyft) is a plus.
Knowledge of ML/DL frameworks (PyTorch, TensorFlow) is a plus.
Ways to stand out from the crowd:
Proven architecture experience in major open-source projects.
Track record of designing, developing, and evangelizing distributed ML infrastructure.
Strong sense of user-friendly API design and developer experience.
Deep expertise in distributed PyTorch architecture, design principles, and implementation tradeoffs is a strong plus.
With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most brilliant and talented people in the world working for us. If you're creative, autonomous and love a challenge, we want to hear from you. Come, join our group and help build the real-time, cost-effective computing platform driving our success in the exciting and quickly growing field of AI.
You will also be eligible for equity and benefits.