NVIDIA · 07.03.26
AI SCORE 9.0

Remote Principal Software Engineer - AI Infrastructure

$272K–$431K/year

About the Role

We are seeking a Remote Principal Software Engineer to join our innovative team at NVIDIA Dynamo. This role focuses on building scalable AI infrastructure for large language models and reasoning systems. You will be part of a dynamic team dedicated to addressing the most challenging issues in distributed AI infrastructure, ensuring high-performance AI inference for demanding applications.

What You'll Do

  • Collaborate on the design and development of the Dynamo Kubernetes stack.
  • Introduce new features to the Dynamo Python SDK and Rust Runtime Core Library.
  • Design, implement, and optimize distributed inference components in Rust and Python.
  • Contribute to the development of disaggregated serving for various inference engines.
  • Improve intelligent routing and KV-cache management subsystems.
  • Engage with the open-source community, participate in code reviews, and assist with issue triage on GitHub.
  • Write clear documentation and contribute to user and developer guides.

Requirements

  • BS/MS or higher in computer engineering, computer science, or related field.
  • 15+ years of proven experience in software engineering, particularly in systems programming.
  • Strong proficiency in Rust and/or C++, with experience in Python for workflow and API development.
  • Experience using Go to develop Kubernetes controllers and operators.
  • Deep understanding of distributed systems, parallel computing, and GPU architectures.
  • Experience with cloud-native deployment and container orchestration (Kubernetes, Docker).
  • Familiarity with open-source development workflows (GitHub, CI/CD).
  • Excellent problem-solving and communication skills.

Nice to Have

  • Prior contributions to open-source AI inference frameworks.
  • Experience with GPU resource scheduling, cache management, or high-performance networking.
  • Understanding of LLM-specific inference challenges.

What We Offer

  • Highly competitive salary ranging from $272,000 to $431,250 based on experience and location.
  • Equity options available.
  • Comprehensive benefits package including health, wellness, and retirement plans.
  • Remote work flexibility with a focus on work-life balance.
  • Opportunities for professional development and growth within a leading tech company.

Why This Job: 9.0 of 10

This Remote Principal Software Engineer position at NVIDIA offers a chance to lead innovative AI infrastructure projects with a competitive salary and equity options.

Who Will Succeed Here

Deep expertise in Rust, C++, and Python for developing high-performance AI infrastructure, with a focus on efficient memory management and concurrency in distributed systems.

Strong self-motivation and proactive problem-solving skills, essential for thriving in a fully remote environment while collaborating with cross-functional teams on complex projects.

Extensive experience in Kubernetes and Docker for container orchestration and deployment of AI applications, along with a solid understanding of GPU architecture to optimize performance.

Learning Resources

The Rust Programming Language (guide)

Career Path

  • Remote Principal Software Engineer - AI Infrastructure (Now)
  • Lead Software Architect in AI Systems (1-2 years)
  • Director of AI Infrastructure Engineering (3-5 years)

Market Overview

  • Rust Market Size 2024: $1.5B
  • Annual Growth: 15.2%
  • AI Adoption in Software Development: 45%
  • Investment in AI Infrastructure: +120%
  • Labour Demand for AI Engineers: +30%
  • Avg Salary for Principal Software Engineers: $180K

Skills & Requirements

  • Required: Rust, C++, Python
  • Growing in Demand: Machine Learning, Cloud-native Development, Data Engineering
  • Declining: Java, PHP

Domain Trends

Increased Adoption of Rust in AI
Rust is being increasingly adopted for AI infrastructure due to its performance and safety features, with a reported 30% increase in usage among AI projects in 2024.
Shift to Cloud-native Architectures
Over 50% of organizations are moving towards cloud-native architectures, emphasizing the need for skills in Kubernetes and Docker for scalable AI solutions.
Focus on GPU Optimization
With the rise of AI workloads, 70% of companies are investing in GPU architecture optimization, making expertise in GPU programming essential for software engineers.

All job postings are gathered automatically by algorithms. We do not review or verify listings. Be careful when applying, and do not sign in with iCloud or Google services.