Staff AI Engineer, Model Post-Training and Alignment

OKX

San Jose, CA

Category Research

Job Description

We are seeking a highly skilled, hands-on Machine Learning Engineer specializing in large-model post-training and alignment. In this role, you will design, execute, and optimize post-training pipelines that improve model performance, controllability, domain adaptation, and reasoning capabilities.

Requirements

Bachelor's degree in Computer Science, AI, Machine Learning, or a related field, with at least 8 years of industry experience.
Strong hands-on experience across the full post-training pipeline for large models.
Deep familiarity with preference learning and alignment techniques, including DPO, GRPO, and RL-based post-training methodologies.
Proven experience designing domain-specific data strategies and training methodologies.
Experience training specialized small models from scratch and post-training them.
Solid understanding of reinforcement learning fundamentals and their application to model alignment.
Experience deploying models in low-latency production environments using frameworks such as vLLM, SGLang, or similar.

Benefits

Competitive total compensation package
Learning and development (L&D) programs and education subsidies to support employees' growth
Various team building programs and company events
Wellness and meal allowances
Comprehensive healthcare schemes for employees and dependants
