Site Reliability Engineer, Global E-commerce

TikTok
San Jose, CA
Job Description
We are seeking a Site Reliability Engineer (SRE) to advance the stability and resilience of TikTok Global E-commerce services in the U.S. In this role, you will strengthen disaster recovery readiness, optimize infrastructure capacity, and elevate service stability.

Responsibilities

  • Data Center Disaster Recovery: Keep services ready for disaster recovery during normal operations through contingency planning and drills, capacity assurance, and effective response when a disaster occurs.
  • Resource Management & Capacity Planning: Manage and plan server and compute resources, including resource restructuring, overall capacity planning, and dynamic scaling, to support reliable business deployment and operations.
  • Service Stability Improvement: Establish and enhance service monitoring systems to enable timely alerting on failures and rapid issue identification and resolution.

Benefits

  • Day one access to medical, dental, and vision insurance
  • 401(k) savings plan with company match
  • Paid parental leave
  • Short-term and long-term disability coverage
  • Life insurance
  • Wellbeing benefits
  • 10 paid holidays per year
  • 10 paid sick days per year
  • 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure)