
Staff AI Engineer
- London
- Permanent
- Full-time
- Your title will be Staff AI Engineer
- This role pays £170k+ a year, plus generous stock options
- Matt, Cofounder and CTO, is the hiring manager
- We work Monday to Thursday in our office in Chancery Lane, London, and Fridays from anywhere
- Context: To predict the next email someone will send, you need to deeply understand their work. To do this, we've built battle-tested, production-grade integrations with our users' emails, meetings, documentation and messaging (Slack etc.), and we've fine-tuned models that read these and decide what makes it into our knowledge base.
- Feedback loop: No thumbs up or thumbs down. We've developed an incredibly strong feedback loop that tells us whether a change we've made to our system works, down to the word and even the character.
- Tools: There's lots of hype around generative AI, with different models and new techniques coming out every day. We've sorted the signal from the noise by experimenting with tools in production to find what actually works. As a result, we know which base models to use, when and how to fine-tune, when to use workflows vs agents, how to build the best RAG pipeline, etc., and we're refining our learning every day.
- Jumping on calls with customers and getting them to narrate their thoughts as they work through their email, giving you ideas for what we should build into the system to improve our performance
- Using the above to define what models we need to build, and how they should fit within the wider system
- Designing strategies, whether using LLM evals or deterministic code, to validate and troubleshoot the accuracy of each model step you design, as well as the accuracy of the whole system, and feeding the learnings back to the product engineering team and human data division
- Working with product reliability engineers to improve the quality of our raw email, meeting, document and messaging data
- Architecting the structure of our knowledge base, how we write to it, and how we retrieve from it
- Working with our human data team to get them to produce training and validation sets for those models, and checking data quality
- Working out where we need human data vs synthetic data
- You're truly obsessed with harnessing LLMs, agentic systems, and retrieval to achieve real-world results, and have spent disproportionate time improving your understanding over the last 3 years, both in your day job and in side projects.
- You have an expert-level understanding of TypeScript or Python, so that you can implement your ideas
- You have experience architecting and evaluating generative AI systems, working with some of the following:
- LLM evals
- Deterministic code-based evals
- Vector database indexing and retrieval
- Fine-tuning (supervised and direct preference optimisation)
- Interaction with human data teams
- Experimentation frameworks
- Urgency and intensity in your work. This is a very hard problem, and will require disproportionate effort from you.
- Proactivity. This isn't a role where you'll be handed tickets. You'll be deciding how our AI system for predicting email replies develops alongside our CTO, with scope to change pretty much anything. You'll only be successful if you gain an understanding of the problem at a deep level by talking to customers and looking at individual data points, then continually update your knowledge of cutting-edge approaches to applied generative AI.
- TypeScript for all production code
- Pinecone as our vector database
- APIs into the models of all major providers (OpenAI, Gemini, Llama, Anthropic) for both fine-tuning and inference
- Our custom human data and ML ops platform
- Vercel's agents SDK
- Firestore as our production database
- Firebase Auth as our auth system
- Backend deployed on Firebase Functions, making use of Pub/Sub and Cloud Storage
- BigQuery as our data warehouse
- Sentry and Google Cloud Logging for monitoring
- GitHub Actions for CI/CD
- An initial call with someone from the Fyxer AI team to review your experience and motivation for joining (15 mins)
- A call with our CTO to discuss your approach and experience (45 mins)
- Live coding, remote (60 mins)
- Meet the team in the office (60 mins)