
Senior Site Reliability Engineer
- Cardiff
- £60,000-70,000 per year
- Permanent
- Full-time
📍 Remote
💰 Up to £70,000 + annual share scheme + excellent benefits

What You'll Do:
You'll take a lead role in driving operational excellence, ensuring the resilience, observability, and performance of web-based systems across a growing digital platform. Working within a collaborative, cross-functional environment, you'll design scalable infrastructure, automate operations, and embed SRE principles to improve reliability and reduce toil.

This is a highly influential role where you'll guide engineering standards, support incident management, and mentor others in building robust, cloud-native systems using modern DevOps practices.

What You'll Bring:
- Strong experience supporting complex web applications and distributed systems, including Micro Frontends and BFFs
- Hands-on expertise in React and TypeScript development with an eye for performance and resilience
- Proven ability to implement observability practices using tools like Prometheus, Grafana, or Azure Monitor
- Proficiency in containerisation and orchestration (Docker, Kubernetes - ideally AKS or GKE)
- Experience building and maintaining CI/CD pipelines for frontend applications (e.g. Azure DevOps, GitHub Actions)
- Solid grasp of cloud infrastructure (Azure or GCP), networking, and security best practices for web platforms
- Knowledge of SRE frameworks including SLOs, SLIs, error budgets, and incident response
- Familiarity with testing tools such as Playwright, Vitest, and Jest
- Understanding of infrastructure-as-code (Terraform) and DevSecOps is a plus
Know someone great for the job? We offer a referral; just get in touch!
Note: We do our best to respond to every application, but due to volume, we can't always guarantee it. If you haven't heard back within 7 days, unfortunately you haven't been successful this time. Keep an eye on our site for new opportunities!