Remote Site Reliability Engineer - Innovative Tech Solutions
About the Role
Join Point72 as a Remote Site Reliability Engineer and be part of a cutting-edge technology team that is transforming the future of investing. In this role, you will enhance our IT infrastructure, ensure system reliability, and implement innovative solutions that drive our business forward. You will also collaborate with various teams to improve application performance and reliability, making a significant impact on our operations.
What You'll Do
- Design and implement automated operational workflows to improve system reliability and reduce manual intervention.
- Build and maintain observability solutions using tools such as Datadog to deliver metrics, monitoring, alerting, and dashboards.
- Partner with development teams to enhance application reliability, deployment safety, and performance through SRE best practices.
- Develop and maintain CI/CD pipelines and deployment automation using Bitbucket, Jenkins, GitHub Actions, and related tooling.
- Engineer scalable solutions for production environments across Linux and Windows systems.
- Automate infrastructure and operational tasks using Python, PowerShell, Bash, or similar scripting languages.
- Support and enhance the reliability of database platforms such as SQL Server and MongoDB from an SRE perspective.
- Participate in incident response, drive root cause analysis, and implement long-term reliability improvements.
- Define and enforce SLOs, SLIs, and error budgets in partnership with application teams.
- Collaborate with Networking, Platform, and Security teams to ensure end-to-end system reliability.
- Enable self-service and standardized operational patterns for development teams.
Requirements
- Strong hands-on experience with Linux and Windows operating systems.
- Proven experience building automation and tooling using Python or similar languages.
- Deep understanding of observability and monitoring, preferably with Datadog.
- Experience with CI/CD pipelines and deployment automation (Bitbucket, GitHub Actions, Jenkins, etc.).
- Operational and performance knowledge of SQL Server and MongoDB.
- Familiarity with cloud platforms (AWS or similar) and hybrid architectures.
- Solid understanding of networking concepts such as DNS, load balancing, and TCP/IP.
- Experience working closely with application development teams in an SRE or DevOps role.
- Experience with Kubernetes, OpenShift, and containerized workloads.
- Knowledge of infrastructure-as-code tools (Terraform, CloudFormation, etc.).
Nice to Have
- Experience with monitoring tools beyond Datadog.
- Familiarity with Agile methodologies.
- Certifications in cloud platforms or DevOps practices.
What We Offer
- Competitive salary and performance bonuses.
- Remote work flexibility with a focus on work-life balance.
- Opportunities for professional development and continuous learning.
- Collaborative and innovative work environment.
- Health and wellness benefits.
- Access to cutting-edge technology and tools.
This Remote Site Reliability Engineer position at Point72 offers an opportunity to work with cutting-edge technology in a collaborative environment. With a competitive salary and remote flexibility, it is an attractive role for tech professionals.
Who Will Succeed Here
- Proficient in managing and automating Linux and Windows environments, with a strong command of Python scripting to streamline operations and improve system reliability.
- Adaptable and self-motivated, thriving in a remote work setting by managing time and priorities effectively while collaborating with cross-functional teams using tools like Bitbucket and GitHub Actions.
- Hands-on experience with CI/CD pipelines and monitoring tools such as Datadog, combined with a proactive approach to troubleshooting and optimizing the performance of SQL Server and MongoDB.