Member of Technical Staff, Inference & Serving Infra
Inception
Palo Alto, CA
Category: Engineering
Job Description
We are looking for engineers and scientists to design, optimize, and scale the systems that power our diffusion LLMs. Your work will make inference faster, more cost-effective, and more reliable.
Responsibilities
Build and optimize high-performance model serving systems for low-latency diffusion LLM inference.
Extend orchestration frameworks (e.g., Kubernetes, Ray, SLURM) for distributed inference, evaluation, and large-batch serving.
Collaborate with ML researchers to translate theoretical requirements into practical system designs.
Benefits
Competitive salary
Equity in a rapidly growing startup
Access to the latest GPU hardware and cloud resources
Flexible vacation and paid time off (PTO)
Health, dental, and vision insurance
A collaborative and inclusive culture