Applied Safety Research Engineer, Safeguards

Anthropic
New York, CA
Category: Engineering
Job Description
Anthropic is looking for a research-oriented engineer to develop methods for safety evaluations of AI models. The role sits at the intersection of applied ML research and engineering: you will design experiments to improve evaluation quality, research model safety behavior, and collaborate with policy experts to translate real-world harm patterns into measurable evaluations.

Responsibilities

  • Design and run experiments to improve evaluation quality
  • Research how different factors impact model safety behavior
  • Analyze evaluation coverage to identify gaps and inform where we need better measurement
  • Productionize successful research into evaluation pipelines
  • Collaborate with Policy and Enforcement to translate real-world harm patterns into measurable evaluations
  • Build tooling that enables policy experts to create and iterate on evaluations
  • Surface findings to research and training teams to drive upstream model improvements

Benefits

  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours
  • Lovely office space