Lead Infrastructure and Reliability Engineer (Systems & Scale)
Luma AI
Palo Alto, CA
Category
Engineering
Job Description
We are hiring a Lead Infrastructure and Reliability Engineer to define the direction of our Infrastructure Engineering team, a systems engineering group with company-level responsibility. The ideal candidate will have deep expertise in Linux and distributed systems, experience operating GPU / accelerator clusters in real production environments, and strong fluency in Kubernetes and modern open-source infrastructure.
Requirements
Deep expertise in Linux and distributed systems
Experience operating GPU / accelerator clusters in real production environments
Strong fluency in Kubernetes and modern open-source infrastructure
Comfortable debugging across hardware → kernel → runtime → orchestration
Ability to think in bottlenecks, failure modes, and tradeoffs
Benefits
Competitive salary
Equity
Stock options
Generous Paid Time Off
401k Matching
Retirement Plan