Inference Runtime, Engineering Manager

OpenAI
San Francisco, CA
Category: Research
Job Description
We are looking for an engineering leader to build and lead a team of AI systems and modeling engineers who take the world's largest and most capable AI models and optimize them for use in high-volume, low-latency, high-availability production and research environments. As an Engineering Manager, you will lead a team of engineers who are experts in distributed systems, with a deep understanding of model architecture and of system co-design with research and production teams, and you will work alongside partners in machine learning research, engineering, and product management to bring our latest technologies into production.

Responsibilities

  • Lead a team of engineers who are experts in distributed systems, with a deep understanding of model architecture and of system co-design with research and production teams
  • Work alongside partner machine learning researchers, engineers, and product managers to bring our latest technologies into production
  • Introduce new techniques, tools, and architectures that improve the performance, latency, throughput, and efficiency of our model inference stack
  • Build tools that give us visibility into our bottlenecks and sources of instability, then design and implement solutions to address the highest-priority issues
  • Optimize our code and fleet of GPUs to utilize every FLOP and every GB of GPU RAM of our hardware

Benefits

  • Competitive salary
  • Stock options
  • Generous paid time off
  • Retirement plan
  • Flexible work hours