Software Engineer, Data Infrastructure & Acquisition

Speechify

Tucson, AZ

Remote

Job Description

Role Overview

We're hiring for the data side of the AI team at Speechify. This role is responsible for all aspects of data collection that support our model training operations. We build high-quality datasets at petabyte scale and low cost through tight integration of infrastructure, engineering, and research work.

What You Will Do

Be scrappy in finding new sources of audio data and bringing them into our ingestion pipeline
Operate and extend the cloud infrastructure for our ingestion pipeline
Collaborate with scientists to shift the cost/throughput/quality frontier
Collaborate with others on the AI Team and Speechify Leadership

Why It Might Be a Fit

An ideal candidate should have a BS/MS/PhD in Computer Science or a related field, 5+ years of industry experience in software development, proficiency with bash/Python scripting in Linux environments, and proficiency in Docker and Infrastructure-as-Code concepts.

Requirements

BS/MS/PhD in Computer Science or a related field
5+ years of industry experience in software development
Proficiency with bash/Python scripting in Linux environments
Proficiency in Docker and Infrastructure-as-Code concepts
Professional experience with at least one major Cloud Provider (we use GCP)
Experience with web crawlers and large-scale data processing workflows
Ability to handle multiple tasks and adapt to changing priorities
Strong communication skills, both written and verbal

Benefits

Competitive salaries
Friendly and laid-back atmosphere
Commitment to building a great asynchronous culture
Opportunity to work on a life-changing product
Build products that directly impact and support people with learning differences
Work in one of the fastest-growing sectors of tech
