
GPU Kernel Developer - AI Models
- Cambridge
- Permanent
- Full-time
- Develop high-performance GPU kernels for key AI operators on AMD GPUs
- Optimize GPU code using a structured, disciplined methodology: profile to identify performance gaps, perform roofline analysis on hardware, identify the key set of optimizations, establish the expected uplift and line of sight, then prototype and develop the optimizations
- Support mission-critical workloads in NLP/LLM, Recommendation, Vision and Audio
- Collaborate and interact with system level performance architects, GPU hardware specialists, power/clock tuning teams, performance validation teams, and performance marketing teams to analyze and optimize training and inference for AI
- Work with open-source framework maintainers to understand their requirements and have your code changes integrated upstream
- Debug, maintain, and optimize GPU kernels; understand and drive AI operator performance (GEMM, attention, distributed scale-up/scale-out communication, etc.)
- Apply your knowledge of software engineering best practices
- Knowledge of GPU computing (HIP, CUDA, OpenCL, Triton)
- Knowledge and experience in optimizing GPU kernels
- Expertise in using profiling and debugging tools
- Core understanding of GPU hardware
- Excellent C/C++/Python programming and software design skills, including debugging, performance analysis, and test design
- Master's or PhD, or equivalent experience, in Computer Science, Computer Engineering, or a related field