Remote Systems Reliability Engineer (SRE) - Cloud Infrastructure
About the Role
We are seeking a Remote Systems Reliability Engineer (SRE) to join our team. In this role, you will ensure the reliability and performance of our cloud infrastructure, drawing on your expertise in AWS, Terraform, and Kubernetes. You will play a central part in maintaining system uptime and efficiency at a mission-driven company.
What You'll Do
- Design, implement, and manage cloud infrastructure using Infrastructure-as-Code (IaC) principles.
- Monitor system performance and troubleshoot issues to ensure optimal reliability and availability.
- Collaborate with development teams to enhance CI/CD pipelines and streamline deployment processes.
- Utilize scripting languages such as Python and Go to automate tasks and improve operational efficiency.
- Implement security best practices in compliance with NIST SP 800-53 and FISMA requirements.
- Participate in on-call rotations to provide support for critical incidents.
- Work closely with cross-functional teams to drive improvements in system architecture and performance tuning.
- Contribute to the development of observability platforms and monitoring solutions.
Requirements
- 3+ years of experience as a Systems Reliability Engineer or in a similar role.
- Proficiency in cloud platforms such as AWS and Azure.
- Strong knowledge of container orchestration tools like Kubernetes.
- Experience with Infrastructure-as-Code tools such as Terraform and CloudFormation.
- Familiarity with CI/CD practices and tools.
- Solid understanding of Linux system administration and database scripting.
- Excellent troubleshooting skills and a proactive approach to problem-solving.
- Strong communication skills and ability to work effectively in a remote team environment.
Nice to Have
- Experience with observability and monitoring tools.
- Knowledge of robotics and automation frameworks.
- Familiarity with compliance frameworks and security best practices.
What We Offer
- Make an impact working with a mission-driven company.
- Work with skilled and committed teammates in a sustainable remote-first environment.
- Rich medical, dental, vision, and EAP benefits.
- Caregiver leave for new parents.
- Reimbursements for wellness and for learning and development.
- Company-provided laptop and home-office stipend.
- 401(k) program.
- LTSE equity options for full-time employees.
This Remote Systems Reliability Engineer position is an opportunity to work on cloud infrastructure at a mission-driven company, with a competitive salary and comprehensive benefits.
Who Will Succeed Here
Proficient in AWS services such as EC2, S3, and RDS, with hands-on experience deploying and managing cloud infrastructure using Infrastructure-as-Code tools like Terraform and container orchestration with Kubernetes.
Self-motivated with strong problem-solving skills, capable of working independently in a remote environment while managing time effectively and meeting project deadlines.
A growth mindset with a solid understanding of CI/CD practices, actively seeking to implement automation through scripting in Python or Go for enhanced system reliability.