Senior ML Engineer (Token Factory)
jobgether
UK
Full-time
Accountabilities:
- Drive inference optimization efforts by identifying bottlenecks and implementing performance improvements across diverse LLM architectures, improving throughput and reducing latency and cost per token.
- Contribute to the design and evolution of inference engines, including techniques such as speculative decoding, KV-cache optimization, and support for dense and MoE models.
- Develop and productionize low-precision training and inference pipelines (e.g., FP8, MXFP4) to maximize efficiency on large GPU clusters.
- Profile and analyze GPU workloads using modern tooling to identify performance constraints and guide architectural improvements.
- Collaborate on scalable distributed training and inference systems, including sharding strategies, custom kernels, and hardware-aware optimizations.
- Contribute to engineering best practices including testing, CI/CD, and maintainable production-grade ML systems.
Requirements:
- Strong understanding of machine learning fundamentals, particularly transformer architectures and large language models.
- Hands-on experience profiling and optimizing GPU workloads using tools such as Nsight or PyTorch Profiler.
- Deep knowledge of GPU architecture, including memory hierarchy and compute vs. memory trade-offs.
- Familiarity with key LLM concepts such as attention mechanisms, RoPE, KV-cache, Flash Attention, and quantization techniques.
- Experience with large-scale deep learning training, including distributed systems, sharding strategies, and custom kernel development.
- Strong software engineering skills, with advanced proficiency in Python and modern ML frameworks.
- Solid understanding of software engineering practices such as version control, CI/CD pipelines, and unit testing.
- Strong communication skills with the ability to collaborate effectively in highly technical, cross-functional teams.
Benefits:
- Competitive compensation package
- Strong career development and continuous learning opportunities
- Flexible work environment with high autonomy and ownership
- Collaborative, innovation-driven engineering culture
- Opportunity to work on frontier AI systems at massive scale
- International, highly skilled, and diverse team environment