Software Engineer, Fleet Management

OpenAI
San Francisco, CA

Job Description
The Fleet team at OpenAI supports the computing environment that powers our cutting-edge research and product development. We prioritize safety, reliability, and responsible AI deployment over unchecked growth.

Responsibilities

  • Design and build systems to manage both cloud and bare-metal fleets at scale.
  • Develop tools that integrate low-level hardware metrics with high-level job scheduling and cluster management algorithms.
  • Leverage LLMs to coordinate vendor operations and optimize infrastructure workflows.
  • Automate infrastructure processes, reducing repetitive toil and improving system reliability.
  • Collaborate with hardware, infrastructure, and research teams to ensure seamless integration across the stack.
  • Continuously improve tools, automation, processes, and documentation to enhance operational efficiency.

Benefits

  • Paid Time Off
  • 401(k) Matching
  • Retirement Plan
  • Relocation Assistance