CUDA Kernel Engineer
PRAGMATIKE
Cambridge, NY
Category
Research
Apply for Job
Job Description
We are searching for a CUDA Kernel Engineer who has hands-on experience developing and optimizing NVIDIA CUDA kernels from scratch. You will work on the GPU performance layer powering large-scale, high-throughput AI systems used by Fortune 500 customers.
Requirements
Design, implement, and optimize custom CUDA kernels for NVIDIA GPUs, with a focus on maximizing occupancy, memory throughput, and warp efficiency.
Profile GPU workloads using tools such as Nsight Compute, Nsight Systems, nvprof, and CUDA-MEMCHECK.
Analyze and eliminate performance bottlenecks including warp divergence, uncoalesced memory access, register pressure, and PCIe transfer overhead.
Improve GPU memory pipelines (global, shared, L2, texture memory) and ensure proper memory coalescing.
Collaborate closely with AI systems, model acceleration, and backend distributed systems teams.
Contribute to GPU architecture decisions, kernel libraries, and internal performance-engineering best practices.
Benefits
Competitive salary & equity options
Sign-on bonus
Health, Dental, and Vision
401k