Software Engineer, Infrastructure Reliability

OpenAI
San Francisco, CA

Job Description
Join our team of Software Engineers to build and operate the reliable, performant systems used across engineering. You'll design these systems to scale, identify and fix performance bottlenecks, and improve automation to reduce manual work.

Requirements

  • 4+ years of relevant industry experience
  • 2+ years leading large-scale, complex projects or teams as an engineer or tech lead
  • Strong proficiency in cloud infrastructure (such as AWS, GCP, or Azure) and IaC tools such as Terraform
  • Proficiency in programming and scripting languages
  • Experience with containerization technologies and container orchestration platforms like Kubernetes
  • Experience with observability tools such as Datadog, Prometheus, Grafana, Splunk, and the ELK stack
  • Experience with microservices architecture and service mesh technologies
  • Knowledge of security best practices in cloud environments
  • Strong understanding of distributed systems, networking, and database technologies
  • Excellent problem-solving skills and the ability to work in a fast-paced environment

Benefits

  • Competitive salary
  • Equity in the company
  • Opportunities for professional growth and development
  • Collaborative and dynamic work environment