Member of Technical Staff, Backend Systems
Inception
Palo Alto, CA
Category
Information Technology
Job Description
Inception creates the world’s fastest, most efficient AI models. We are seeking experienced engineers to own the systems that serve our diffusion LLMs in production, optimizing for latency, throughput, cost, and reliability.
Responsibilities
Design, build, and operate scalable model serving infrastructure for our diffusion LLMs
Optimize inference pipelines for latency, throughput, and cost efficiency across GPU hardware
Implement and manage load balancing, autoscaling, and traffic routing for model endpoints
Build systems for model versioning, canary deployments, and zero-downtime rollouts
Develop monitoring, alerting, and observability tooling to ensure SLA compliance and rapid incident response
Collaborate with ML researchers to translate model advances into production-ready serving improvements
Benchmark and evaluate serving frameworks and hardware configurations to inform infrastructure decisions
Benefits
Competitive salary
Equity in a rapidly growing startup
Access to the latest GPU hardware and cloud resources
Flexible vacation and paid time off (PTO)
Health, dental, and vision insurance
A collaborative and inclusive culture