Site Reliability Engineer, Search Infrastructure - USDS

TikTok
Seattle, WA
Job Description
The USDS TikTok Search Infra SRE team works with engineering and product teams to build and run large-scale, globally distributed, observable, fault-tolerant systems. SREs on this team will deliver on production ownership and be responsible for observability and automation across complex, large-scale service mesh architectures.

Responsibilities

  • Engage in and improve the whole lifecycle of Search systems — from system design consulting through to launch reviews, deployment, operation and refinement
  • Deliver tools/software to improve the reliability and scalability of services, automate operations and improve R&D efficiency
  • Build and maintain high availability for large-scale services deployed across global data centers
  • Plan, manage and optimize cloud resource utilization, ensuring that large-scale clusters meet their SLAs
  • Measure and monitor availability, latency and overall service health
  • Practice sustainable incident response and postmortems

Benefits

  • Medical, dental, and vision insurance
  • 401(k) savings plan with company match
  • Paid parental leave
  • Short-term and long-term disability coverage
  • Life insurance
  • Wellbeing benefits
  • 10 paid holidays per year
  • 10 paid sick days per year
  • 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure)