Data Center Operations Systems Engineer - Atlanta

Lambda
Atlanta, GA
Job Description
Lambda, The Superintelligence Cloud, builds Gigawatt-scale AI Factories for Training and Inference. The Operations team is responsible for the seamless end-to-end execution of our AI-IaaS infrastructure and hardware, including sourcing infrastructure, managing day-to-day data center operations, and driving cross-company coordination.

Responsibilities

  • Ensure new server, storage and network infrastructure is properly racked, labeled, cabled, and configured
  • Document data center layout and network topology in DCIM software
  • Work with supply chain & manufacturing teams to ensure timely deployment of systems and project plans for large-scale deployments
  • Participate in data center capacity and roadmap planning with sales and customer success teams to allocate floorspace
  • Assess current and future state data center requirements based on growth plans and technology trends
  • Manage a parts depot inventory and track equipment through the delivery-store-stage-deploy-handoff process in each of our data centers
  • Work closely with HW Support team to ensure data center infrastructure-related support tickets are resolved
  • Work with RMA team to ensure faulty parts are returned and replacements are ordered
  • Create installation standards and documentation for placement, labeling, and cabling to drive consistency and discoverability across all data centers
  • Serve as a subject-matter expert on data center deployments as part of sales engagement for large-scale deployments in our data centers and at customer sites

Benefits

  • Health, dental, and vision coverage for you and your dependents
  • Wellness and Commuter stipends for select roles
  • 401(k) plan with 2% company match (USA employees)
  • Flexible Paid Time Off Plan