Incident Response Manager - Data Center

TikTok
San Jose, CA
Job Description
We are seeking a technically skilled and detail-oriented professional to serve as a front-line responder for incident detection, triage, and response across infrastructure, facilities, and security operations. The ideal candidate will possess a solid foundation in IT, infrastructure, or engineering disciplines, with experience in critical environments and the ability to analyze incidents, identify patterns, and drive long-term improvements.

Requirements

  • Serve as the first responder in the IRC Operation Center, detecting and responding to events across infrastructure and facilities using tools such as Server Automation, Data Center Infrastructure Management, network monitoring, Grafana, and related systems.
  • Respond promptly to events including but not limited to:
  • Conduct detailed investigations to diagnose the root cause of events, assess their impact, and determine appropriate response actions.
  • Monitor and analyze detected events, accurately classify incidents based on potential or actual customer impact, and proactively communicate risks.
  • Draft detailed incident reports and conduct post-mortem reviews to document lessons learned.
  • Generate regular reports to deliver comprehensive insights into the effectiveness of incident response and recovery processes.
  • Analyze trends and patterns in events to identify opportunities for improvement and optimization.
  • Own and drive the Incident, Problem, and Change Management processes in alignment with ITIL or internal ITSM frameworks.
  • Develop and maintain a comprehensive library of Standard Operating Procedures (SOPs), Methods of Procedure (MOPs), runbooks, and operational guides to ensure consistency and readiness across teams.
  • Lead or support continuous improvement projects aimed at enhancing incident response capabilities, operational security, system reliability, and overall infrastructure performance.
  • Provide technical and operational leadership to the incident response center team, ensuring consistent performance and adherence to best practices.

Benefits

  • Medical, dental, and vision insurance
  • 401(k) matching
  • Paid time off
  • Holiday pay
  • Life insurance
  • Disability insurance