
System Reliability Engineer
- Birmingham
- Permanent
- Full-time
- Being responsible for the ongoing stability and availability of systems.
- Monitoring, maintaining and supporting systems and software to ensure stability and compliance with technology standards.
- Performing maintenance, including installation of patches and upgrades with an understanding of the production environment and continuous availability targets.
- Full accountability for allocated project actions, technical problems and incidents.
- Event detection, resolution and/or escalation – conducts root cause analysis and remediation as required.
- Monitors systems availability, performance, capacity and trend against baseline metrics.
- Provides system insights using monitoring and event tools to support stabilisation and lifecycle management.
- Develops in-depth knowledge of supported systems and applications and transfers knowledge to more junior staff.
- Vulnerability management by assessing security status and applying mitigation solutions
- Liaising with internal and external partners, including user groups, technical IT consultants and project managers.
- Demonstrable subject matter technical expertise across Microsoft technologies, virtualisation, Cloud technologies, networking and storage.
- Ensure technical documents and policies are up to date and accurate and adhered to.
- Work with and support the architecture and project delivery teams to deliver short, medium, and long-term continuous improvement initiatives across our infrastructure and delivered services, in alignment with our firm’s vision and ambition.
- Be actively involved in the firm's security posture and ensure that everything introduced is implemented securely.
- Through the firm's comprehensive training programme, continuously develop your skills and support the development of more junior team members by providing mentorship and acting as a point of escalation for difficult issues.
- Work to resolve issues and incidents in a timely manner aligned with ITIL best practices.
- Provide technical peer review of proposed changes to be made to the estate.
- Actively participate in the Change Management Board meetings and manage change within the infrastructure team.
- Ensure that all services are appropriately licensed and assist with tracking and costing licence and certificate renewals.
- Review logs, monitoring solutions, alerts, and other items to ensure that IT systems and services are available, performant and secure at all times
- Experienced in change management/scheduling
- Understanding of Infrastructure technology and protocols
- Microsoft solutions (Exchange, AD, SQL, VPN, DNS, DHCP) including cloud-based architecture (Azure, Exchange Online, Arc, Intune, Endpoint)
- Microsoft 365
- Experience in virtualisation technologies (VMware and Nutanix preferred)
- Windows Server Infrastructure
- HPE Server and Blade Infrastructure
- Networking (HPE ProCurve and FlexFabric)
- SAN technology (ideally Nimble, Alletra and 3PAR)
- Citrix Virtual Apps and Desktops experience with a good understanding of Citrix ADC.
- A good understanding of firewalls (ideally FortiGate)
- 5+ years of hands-on experience working as part of an infrastructure delivery team.
- Change management experience
- Strong communication skills. Able to explain complex technical concepts and issues to non-technical stakeholders.
- Ability to quickly arbitrate, prioritise and filter change/deployment issues to identify solutions.
- Management of cloud-based solutions
- Management of server infrastructure solutions including storage.
- Exposure to Nutanix AHV and DR
- Rubrik backup solution