Ceph Cluster Development Engineer (C++ Focus)

Fortinet
Santa Clara, CA
Job Description
We are seeking a highly skilled Ceph Cluster Development & Operations Engineer with strong expertise in C++ systems programming to design, extend, and maintain enterprise-scale Ceph distributed storage clusters.

Requirements

  • Design, build, and operate large-scale Ceph clusters, including RADOS, RGW, and RBD
  • Contribute to or extend Ceph core components written in C++ (e.g., OSD, RGW, librados, BlueStore, MGR modules)
  • Profile and optimize performance across network, disk I/O, and replication layers (PG placement, CRUSH rules, BlueStore tuning)
  • Develop automation and tooling for cluster lifecycle management (deployment, upgrades, scaling, failover, and recovery)
  • Integrate Ceph with Kubernetes (via Rook-Ceph, CSI drivers) and CI/CD pipelines for continuous delivery
  • Implement and validate multi-site replication and disaster recovery architectures for high availability
  • Develop and maintain secure storage solutions using dm-crypt, KMS integration, and CephX authentication
  • Build observability pipelines using Prometheus, Grafana, and custom exporters for metrics and health analytics
  • Write and maintain SOPs, automation scripts, and system documentation to support production-grade operations
  • Collaborate with the upstream Ceph community or maintain in-house forks for feature development and bug fixes

Benefits

  • Medical, dental, vision, life, and disability insurance
  • 401(k)
  • 11 paid holidays
  • Vacation time
  • Sick time
  • Comprehensive leave program