Inference Runtime, Engineering Manager

OpenAI
San Francisco, CA
Category: Research
Job Description
We are looking for an engineering leader to build and lead a team of AI systems and modeling engineers who take the world's largest and most capable AI models and optimize them for use in high-volume, low-latency, high-availability production and research environments. As an Engineering Manager, you will lead a team of engineers who are experts in distributed systems, with a deep understanding of model architecture and of system co-design with research and production teams, and you will work alongside partners in machine learning research, engineering, and product management to bring our latest technologies into production.

Responsibilities

  • Lead a team of engineers who are experts in distributed systems, with a deep understanding of model architecture and of system co-design with research and production teams
  • Work alongside partner machine learning researchers, engineers, and product managers to bring our latest technologies into production
  • Introduce new techniques, tools, and architectures that improve the performance, latency, throughput, and efficiency of our model inference stack
  • Build tools that give us visibility into our bottlenecks and sources of instability, then design and implement solutions to address the highest-priority issues
  • Optimize our code and fleet of GPUs to utilize every FLOP and every GB of GPU RAM of our hardware

Benefits

  • Competitive salary
  • Stock options
  • Generous paid time off
  • Retirement plan
  • Flexible work hours