Remote Principal Software Engineer - AI Inference Systems
About the Role
We are seeking a Remote Principal Software Engineer to join our innovative NVIDIA Dynamo team, focused on building scalable AI inference systems. This role offers an exciting opportunity to work on cutting-edge technology in distributed GPU environments, enhancing the performance of large language models (LLMs) and reasoning systems.
What You'll Do
- Collaborate on the design and development of the Dynamo Kubernetes stack, enhancing the deployment of AI inference systems.
- Introduce new features to the Dynamo Python SDK and Rust Runtime Core Library, ensuring robust performance.
- Design, implement, and optimize distributed inference components using Rust and Python, contributing to high-performance AI workloads.
- Architect and optimize the separation of prefill and decode phases across GPU clusters to improve throughput.
- Develop dynamic GPU scheduling algorithms to ensure efficient resource allocation based on workload demands.
- Enhance intelligent routing systems to minimize latency and improve request handling for complex reasoning tasks.
- Contribute to open-source repositories, participate in code reviews, and assist with issue triage on GitHub.
- Write clear documentation and contribute to user and developer guides, ensuring a smooth experience for the community.
Requirements
- BS/MS or higher in computer engineering, computer science, or a related field (or equivalent experience).
- 15+ years of proven experience in software engineering, particularly in systems programming.
- Strong proficiency in Rust and/or C++, with experience in Python for workflow and API development.
- Experience with Kubernetes and cloud-native deployment practices.
- Deep understanding of distributed systems, parallel computing, and GPU architectures.
- Familiarity with large-scale inference serving and high-performance AI workloads.
- Excellent problem-solving and communication skills.
- Prior contributions to open-source AI frameworks are a plus.
Nice to Have
- Experience with GPU resource scheduling and cache management.
- Understanding of LLM-specific inference challenges, such as context window scaling.
What We Offer
- Highly competitive salaries with a base salary range of $272,000 - $431,250.
- Eligibility for equity and a comprehensive benefits package.
- Opportunities to work with some of the most forward-thinking and hardworking people in the technology industry.
- A diverse work environment that values inclusion and equal opportunity.
- Remote work flexibility and a chance to contribute to groundbreaking AI technology.
This Remote Principal Software Engineer position at NVIDIA offers a unique opportunity to work on cutting-edge AI inference systems with a competitive salary and equity options.
Who Will Succeed Here
- Proficient in Rust and Python, with hands-on experience building high-performance, scalable distributed systems that use GPU architectures for AI inference.
- Self-motivated and adaptable, thriving in a remote work environment by managing time and priorities effectively while collaborating with cross-functional teams across time zones.
- Strong background in Kubernetes and Docker for orchestrating containerized applications, combined with a deep understanding of AI inference systems and large language models, and able to apply both to complex technical challenges.