
Site Reliability Engineer - Storage Engineer
- United Kingdom
- Permanent
- Full-time
- Automate and maintain day-to-day operations of storage systems to support application demands.
- Develop and maintain tools and automation scripts to streamline storage operations and improve efficiency.
- Monitor system performance, identify issues, and implement solutions to ensure high availability and reliability.
- Participate in agile practices such as daily stand-up meetings, task tracking boards, design and code reviews, automated testing, and continuous integration and deployment.
- Continuously improve system reliability, performance, and capacity through proactive monitoring, automation, and optimization.
- 2+ years of experience in site reliability engineering or a similar role.
- Proficiency in working with Ceph, including deployment, configuration, and management of Ceph clusters and systems.
- 1+ years of professional experience with Ceph.
- Experience working on Linux/Unix systems, with a focus on automation and operating at scale.
- Proficiency in Python or Bash.
- Experience with Ansible, Terraform, or SaltStack.
- Experience with Nagios-based monitoring tools, such as Icinga2.
- Experience with observability tooling, such as Prometheus, Grafana, Mimir, and Loki.
- Solid understanding of core networking concepts and protocols, particularly in relation to Linux/Unix systems.
- Experience with Agile concepts and methodologies, including participation in Scrum or Kanban teams, and familiarity with Agile tools and practices.
- Demonstrates solid analytical and troubleshooting skills, with the ability to resolve moderately complex issues in distributed systems with guidance when needed.
- Communicates clearly and works well within a team environment, contributing to collaboration and knowledge sharing with guidance when needed.
- Experience with containerization and orchestration tools (e.g., Docker, Kubernetes).
- Exposure to, and experience working with, compute platforms (e.g., OpenStack, AWS).
- Familiarity with, and ability to contribute to, CI/CD pipelines and automation workflows.