SR2, 31.01.26

Senior Site Reliability Engineer (SRE) - Cloud-Native Platforms

$90K–$120K/year

About the Role

We are looking for a Senior Site Reliability Engineer (SRE) to join our team at a leading German technology company. This fully remote role focuses on cloud-native platforms that serve both internal engineering teams and external customers at scale. As a Senior SRE, you will own stability, observability, and performance, and ensure that reliability is treated as a key product feature.

What You'll Do

  • Own and improve system reliability, uptime, and performance.
  • Design and operate observability stacks including metrics, logs, and traces.
  • Define and implement Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets.
  • Conduct load testing, performance tuning, and capacity planning.
  • Reduce operational toil through automation and tooling.
  • Lead or contribute to incident response and post-incident reviews.
  • Collaborate closely with engineers to embed reliability-by-design into the development process.

Requirements

  • Strong experience as a Site Reliability Engineer, Platform Engineer, or senior DevOps engineer.
  • Hands-on production experience with Kubernetes.
  • Solid understanding of observability and incident management.
  • Automation mindset and comfort writing production-quality code.
  • Calm, methodical approach to problem-solving in live environments.
  • Professional proficiency in the German language (spoken and written).

Nice to Have

  • Experience with Docker and Helm for packaging and deployment.
  • Familiarity with Python or TypeScript for automation and tooling.
  • Knowledge of modern monitoring and observability platforms.
  • Experience in cloud-native and container-first architecture.

What We Offer

  • Fully remote work opportunity within Germany.
  • High ownership and technical influence in your role.
  • Clear commitment to SRE best practices and an engineering-driven culture.
  • Engagement with real-world scale and meaningful reliability challenges.
  • Minimal bureaucracy, allowing for efficient decision-making and innovation.

Language Requirements

German: C1

Why This Job (8.5 of 10)

This Senior Site Reliability Engineer role is fully remote within Germany and focuses on cloud-native platforms, with a strong emphasis on reliability and observability.

About SR2

Industry: Tech
Location: Remote

Who Will Succeed Here

Proficiency in Kubernetes and Docker for managing container orchestration and deployment, with hands-on experience in setting up Helm charts for application management.

Strong automation mindset with experience in Python or TypeScript to create scripts and tools that enhance observability and incident management processes.

Experience in a remote work environment with a proactive approach to problem-solving and a strong sense of ownership over system reliability and performance.

Learning Resources

Kubernetes Documentation (guide)

Career Path

  • Now: Senior Site Reliability Engineer (SRE) - Cloud-Native Platforms
  • 1-2 years: Lead Site Reliability Engineer
  • 3-5 years: Director of Site Reliability Engineering or Cloud Infrastructure Architect

Market Overview

  • Kubernetes Market Size 2024: $13.4B
  • Annual Growth: 26.8%
  • AI Adoption in SRE: 45%
  • Investment in Cloud-Native Technologies: +150%
  • Labour Demand for SRE Roles: +30%
  • Avg Salary for Senior SRE: $140K

Skills & Requirements

  • Required: Kubernetes, Docker, Helm
  • Growing in Demand: GitOps, Service Mesh, Prometheus
  • Declining: Traditional Virtualization, Manual Deployment Processes

Domain Trends

  • Increased Adoption of GitOps: Organizations are increasingly adopting GitOps practices, with 60% of companies reporting improved deployment frequency and reduced lead time.
  • Rise of Service Mesh Technologies: Service mesh usage has grown by 70% among enterprises, enhancing microservices communication and observability.
  • Focus on AI-Driven Incident Management: 45% of SRE teams are integrating AI tools for incident management, leading to a 40% reduction in mean time to recovery (MTTR).
