Senior Machine Learning Engineer

TetraMem

Fremont, CA

Category: Software Engineering

Job Description

Role Overview

Join TetraMem's global team to shape the future of AI computations and sustainable technology solutions while working at the forefront of innovation. As a Senior Machine Learning Engineer, you will develop, optimize, and deploy lightweight machine learning models for edge AI applications, particularly for audio processing.

What You Will Do

Develop, optimize, and deploy lightweight machine learning models for edge AI applications, particularly for audio processing
Implement and optimize ML models on embedded platforms, including FPGA and custom ASIC solutions
Work closely with hardware and software teams to integrate ML models into production systems

Why It Might Be a Fit

You will have the opportunity to work with a talented team of engineers and industry-leading executives, drive innovation, and contribute to the overall system architecture. You will also have the chance to publish research findings, present at conferences, and contribute to open-source projects when applicable.

Requirements

5+ years of relevant industry experience (or a PhD) in Computer Science, Electrical Engineering, Machine Learning, or related fields
Prior experience managing a team, serving in a Team Lead role, or demonstrating strong technical leadership and cross-functional coordination capabilities
Strong hands-on experience in machine learning, with a focus on edge AI, on-device inference, and deploying lightweight models on resource-constrained devices
Expertise in modern ML frameworks such as PyTorch, TensorFlow (including TensorFlow Lite), and JAX
Proficiency in Python and C/C++, with practical experience in ML model optimization and production deployment
Deep experience with model quantization (PTQ/QAT), pruning, knowledge distillation, sparsity, and other compression techniques for efficient edge inference
Hands-on experience developing for or integrating with AI chip SDKs, neural accelerators (NPUs/DSPs), or hardware-specific toolchains (e.g., NVIDIA TensorRT, Qualcomm Neural Processing SDK, ARM Ethos, or similar)
Familiarity with edge inference runtimes (ONNX Runtime, ExecuTorch, TVM) and optimizing models for hardware constraints (latency, memory footprint, power consumption)

Benefits

Competitive salary range: $200,000 - $280,000 / year
Diversity and inclusion
Equal opportunity employer
Reasonable accommodations for qualified applicants with disabilities
