Research Engineer, Frontier Speculative Decoding

Together AI
New York, NY
Remote
Job Description
We're building the Inference Platform that powers the world's most advanced generative AI models. This role focuses on translating our internal model training research into production-ready deployments for customers, fine-tuning models on customer data, and creating highly efficient, specialized models. We're looking for a Research Engineer to design and iterate on novel speculative decoding algorithms, combining architectural innovations with curated data to push the frontier of accuracy–efficiency tradeoffs.
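For context on the core technique this role centers on, here is a minimal illustrative sketch of greedy speculative decoding, not Together AI's implementation: a cheap draft model proposes `k` tokens, the target model verifies them, and the longest agreeing prefix is accepted. The toy `target` and `draft` callables below are hypothetical stand-ins for real models.

```python
def speculative_step(target, draft, prefix, k):
    """One speculative decoding step (greedy acceptance sketch).

    `target` and `draft` are placeholder next-token functions
    (context -> token id); in practice these would be LLM forward passes.
    """
    # Draft model proposes k tokens autoregressively (cheap).
    proposed = []
    ctx = list(prefix)
    for _ in range(k):
        tok = draft(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # Target model verifies the proposals; in a real system this is a
    # single batched forward pass over all k positions.
    accepted = []
    ctx = list(prefix)
    for tok in proposed:
        if target(ctx) == tok:  # greedy acceptance test
            accepted.append(tok)
            ctx.append(tok)
        else:
            break

    # Always emit one token from the target after the accepted prefix,
    # so every step makes progress even if no proposals are accepted.
    accepted.append(target(ctx))
    return accepted
```

The efficiency win comes from verifying `k` draft tokens with one target-model pass instead of `k` sequential passes; the accuracy–efficiency tradeoff mentioned above is governed by how often the draft model's proposals are accepted.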

Responsibilities

  • Design and iterate on novel speculative decoding algorithms
  • Serve as the critical link between raw data and a production-ready model
  • Collaborate with customers to understand their needs, and work closely with the core inference and Applied ML research teams

Requirements
  • Experience with Python and PyTorch
  • Familiarity with SLURM and/or Kubernetes clusters
  • Familiarity with modern LLMs and generative models
  • Basic understanding of distributed training frameworks
  • Bachelor’s, Master’s degree, or Ph.D. in Computer Science, Computer Engineering, or a related field

Benefits

  • Competitive compensation
  • Startup equity
  • Health insurance
  • Other competitive benefits