
Principal Data Engineer
- Staines, Surrey
- Permanent
- Full-time
- Spotting high-value data opportunities within our IFS offerings, translating raw data into powerful features and reusable data assets.
- Serving as our data expert, guiding us towards the latest and greatest data technology and platform trends. You'll be the guru driving our data platform evolution and providing data project estimates.
- Leading the Data Engineering team in crafting and integrating data projects from the ground up, from framing problems and experimenting with new data sources and tools to the grand finale of data pipeline implementation and deployment. You will ensure scalability and top-tier performance.
- Locking arms with ML Engineers, Data Scientists, Architects, and Product/Program Managers. Together, you'll define, create, deploy, monitor, and document data pipelines to power advanced AI solutions.
- Becoming our data technology evangelist. Get ready to shine on the conference stage, host webinars, and pen compelling white papers and blogs. Share your discoveries with clients and internal stakeholders, offering actionable insights that drive change.
- Proficient in building data pipelines across cloud and on-premises environments, using Azure and other technologies.
- Experienced in orchestrating data workflows on AKS Kubernetes clusters using Airflow, Kubeflow, Argo, Dagster, or similar.
- Skilled with data ingestion tools such as Airbyte and Fivetran for diverse data sources.
- Expert in large-scale data processing with Spark or Dask.
- Strong in Python, Scala, C#, or Java, as well as cloud SDKs and APIs.
- AI/ML expertise applied to pipeline efficiency, with familiarity in TensorFlow, PyTorch, AutoML, Python/R, and MLOps tooling (MLflow, Kubeflow).
- Solid in DevOps and CI/CD automation with Bitbucket Pipelines, Azure DevOps, or GitHub.
- Able to automate deployment of data pipelines and applications using Bash, PowerShell, Azure CLI, Terraform, Helm charts, etc.
- Experienced in leveraging Azure AI Search, Elasticsearch, MongoDB or other hybrid/vector stores for content analysis and indexing, with a focus on creating advanced RAG (Retrieval Augmented Generation) applications.
- Proficient in building IoT data pipelines, encompassing real-time data ingestion, transformation, security, scalability, and seamless integration with IoT platforms.
- Experienced in designing, developing, and monitoring streaming data applications using Kafka and related technologies.