Research Engineer, Frontier Speculative Decoding

Together AI
New York, NY
Remote
Job Description
We're building the Inference Platform that powers the world's most advanced generative AI models. This role focuses on translating our internal model training research into production-ready deployments for customers, fine-tuning models on customer data, and creating highly efficient, specialized models. We're looking for a Research Engineer to design and iterate on novel speculative decoding algorithms, combining architectural innovations with curated data to push the frontier of accuracy–efficiency tradeoffs.
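For context on the core technique this role centers on, here is a minimal illustrative sketch of greedy speculative decoding, not Together AI's implementation: a cheap draft model proposes `k` tokens, the target model verifies them, and the longest agreeing prefix is accepted. The toy `target` and `draft` callables below are hypothetical stand-ins for real models.

```python
def speculative_step(target, draft, prefix, k):
    """One speculative decoding step (greedy acceptance sketch).

    `target` and `draft` are placeholder next-token functions
    (context -> token id); in practice these would be LLM forward passes.
    """
    # Draft model proposes k tokens autoregressively (cheap).
    proposed = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # Target model verifies the proposals; in a real system this is a
    # single batched forward pass over all k positions.
    accepted = []
    ctx = list(prefix)
    for tok in proposed:
        if target(ctx) == tok:  # greedy acceptance test
            accepted.append(tok)
            ctx.append(tok)
        else:
            break

    # Always emit one token from the target after the accepted prefix,
    # so every step makes progress even if no proposals are accepted.
    accepted.append(target(ctx))
    return accepted
```

The efficiency win comes from verifying `k` draft tokens with one target-model pass instead of `k` sequential passes; the accuracy–efficiency tradeoff mentioned above is governed by how often the draft model's proposals are accepted.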

Responsibilities

  • Design and iterate on novel speculative decoding algorithms
  • Serve as the critical link between raw data and a production-ready model
  • Collaborate with customers to understand their needs, and work closely with the core inference and Applied ML research teams

Requirements
  • Experience with Python and PyTorch
  • Familiarity with SLURM and/or Kubernetes clusters
  • Familiarity with modern LLMs and generative models
  • Basic understanding of distributed training frameworks
  • Bachelor’s, Master’s degree, or Ph.D. in Computer Science, Computer Engineering, or a related field

Benefits

  • Competitive compensation
  • Startup equity
  • Health insurance
  • Other competitive benefits