Senior Site Reliability Engineer
Mission Statement & Summary
As a Senior Site Reliability Engineer, you'll sit at the intersection of software engineering and operations, driving reliability, performance, automation, and resilience across our technology estate.
This is an opportunity to shape the future of our platform rather than simply maintain it. You'll work alongside talented engineers, influence technical direction, and champion modern reliability practices that enable teams to move faster with confidence. If you're passionate about solving complex problems, eliminating toil through automation, and creating systems that are resilient by design, we'd love to hear from you.
How you'll contribute
- You'll lead initiatives that improve platform reliability, scalability, and operational excellence.
- You'll design and deliver automation solutions that reduce manual effort and accelerate engineering teams.
- You'll develop observability capabilities, enabling proactive monitoring and faster incident resolution.
- You'll facilitate incident management, driving root cause analysis and continuous improvement.
- You'll collaborate with engineering teams to embed reliability, resilience, and performance into every stage of delivery.
- Software engineering and design experience (preferably .NET/C#) to build and improve production systems, apply solid design principles, and contribute directly to codebases to deliver reliable, scalable, and maintainable services.
- The ability to automate infrastructure, operational processes, and deployments using modern engineering practices.
- Experience building effective observability solutions, including monitoring, logging, alerting, and tracing.
- Strong problem-solving skills with the ability to diagnose and resolve complex production issues.
- The ability to influence technical decisions and collaborate effectively across engineering and business teams.
- Experience operating Kubernetes-based platforms at scale.
- Knowledge of Infrastructure as Code tools and cloud platform services.
- Experience implementing Site Reliability Engineering principles, including SLOs, SLIs, and error budgets.
- Familiarity with security, compliance, and resilience best practices within cloud environments.
- Experience mentoring engineers and helping teams adopt modern operational and reliability practices.
Tell us if you need accommodations: We’ll put reasonable adjustments in place to support you.
We work with Textio to make our job design and hiring inclusive.
Permanent
This listing is from Indeed.