Oracle, 10.03.26
AI SCORE 8.5

Senior Principal Software Engineer - AI Infrastructure Remote

$97K–$252K/year

About the Role

We are seeking a highly skilled Senior Principal Software Engineer, AI Infrastructure (remote), to join the GPU Availability and Monitoring team at Oracle. In this role you will design and develop architectural changes for GPU delivery, health monitoring, triage automation, and diagnostic services, all of which are essential for running distributed AI/ML/HPC workloads across thousands of GPUs. You will architect scalable monitoring and repair solutions for AI infrastructure components, ensuring peak performance for customer workloads.

What You'll Do

  • Architect solutions to scale and optimize Monitoring and Repair for components like GPU, CPU, Network, and Storage.
  • Develop best-in-class AI compute infrastructure, ensuring services are modularized, secure, reliable, and actively monitored.
  • Collaborate with cross-functional teams to understand requirements and design solutions that meet them.
  • Optimize software development processes to improve developer efficiency.
  • Mentor junior developers and drive modern software engineering practices.
  • Develop benchmark metrics and automation to track performance and reliability across customer workloads.
  • Stay updated with industry trends and emerging technologies in distributed systems and AI infrastructure management.

Requirements

  • BS in Computer Science, Engineering, or related field.
  • 10 years of software development experience in languages such as C, C++, C#, Java, Go, or Rust.
  • 5 years of experience designing large-scale distributed systems.
  • 3 years of experience providing technical leadership to cross-functional teams.
  • Strong communication skills and a systematic problem-solving approach.
  • Experience with cloud infrastructures such as OCI, AWS, Azure, and GCP.
  • Familiarity with containerization technologies such as Docker, and with API design.

Nice to Have

  • Experience with AI-powered tools and platforms.
  • Familiarity with data management practices.
  • Knowledge of Agile development methodologies.

What We Offer

  • Comprehensive benefits package including medical, dental, and vision insurance.
  • 401(k) with company match and flexible spending accounts.
  • Paid time off and sick leave policies.
  • Employee Stock Purchase Plan and financial planning assistance.
  • Opportunities for professional growth and development.

Language Requirements

English: C1

Why This Job (8.5 of 10)

This role offers a unique opportunity to lead AI infrastructure projects at Oracle, with a competitive salary and comprehensive benefits.

Who Will Succeed Here

Expertise in C, C++, and Java with a strong understanding of performance optimization in distributed systems, particularly in AI/ML workloads.

Proficiency in Docker and cloud infrastructure services (e.g., AWS, Azure) to efficiently manage containerized applications and deployment pipelines in a remote environment.

A mindset geared towards continuous improvement and scalability, with a focus on architecting robust API designs that support high availability and fault tolerance.

Learning Resources

C Programming Language (guide)

Career Path

  • Senior Principal Software Engineer - AI Infrastructure Remote (now)
  • Technical Architect - AI Solutions (1-2 years)
  • Director of Engineering - AI Infrastructure (3-5 years)

Market Overview

  • Market Size 2024: $15.4B
  • Annual Growth: 8.7%
  • AI Adoption: 78%
  • Investment in AI Infrastructure: +45%
  • Labour Demand for C/C++ Developers: +25%
  • Avg Salary for Senior Software Engineers: $150K

Skills & Requirements

  • Required: C, C++, Java
  • Growing in Demand: Kubernetes, Microservices Architecture, Machine Learning Frameworks
  • Declining: Perl, VB.NET

Domain Trends

Increased Demand for Cloud-Native Applications
With 70% of enterprises planning to migrate to cloud-native architectures, skills in distributed systems and cloud infrastructure are increasingly critical.
Rise of AI-Driven Development Tools
AI tools are expected to automate 35% of coding tasks by 2025, highlighting the need for engineers to integrate AI solutions into their workflows.
Focus on Secure Software Development
Cybersecurity incidents have risen by 50% in the past year, prompting a shift towards secure coding practices and infrastructure security in software development.
