Senior Site Reliability Engineer (SRE) - (Dublin, CA)
Articul8
Dublin, CA
Category: Engineering
Job Description
Articul8 AI is seeking a Senior Site Reliability Engineer (SRE) to join their team and help ensure the reliability, performance, and scalability of their GenAI SaaS platform.
Responsibilities
Architect and maintain scalable, highly available infrastructure for our GenAI platform.
Design and implement robust monitoring, alerting, and observability solutions to proactively ensure system health and performance.
Automate deployment, scaling, and management of our cloud-native infrastructure, reducing toil and improving efficiency.
Define, measure, and improve Service Level Objectives (SLOs) and Service Level Indicators (SLIs) to deliver outstanding service quality.
Participate in on-call rotations and provide rapid response to production incidents, minimizing downtime and user impact.
Collaborate closely with development teams to build reliable, scalable, and efficient systems for complex AI workloads.
Lead incident response efforts, conduct thorough post-mortems, and champion continuous improvement initiatives.
Optimize infrastructure for performance, scalability, and cost-effectiveness, especially for high-demand AI workloads.
Implement and enforce security best practices across all systems and environments.
Create and maintain comprehensive documentation, including runbooks and knowledge base articles, to foster a culture of shared knowledge.