Observability Engineer (Prometheus / Grafana / Datadog)

Bright Vision Technologies
Naperville, IL
Remote
Job Description
Bright Vision Technologies is seeking an Observability Engineer to design and operate metrics, logging, tracing, and alerting platforms for engineering teams. The ideal candidate has experience with Prometheus, Grafana, and Datadog, and can translate noisy telemetry into actionable insight for engineers and business stakeholders.

Responsibilities

  • Design and operate enterprise-grade observability platforms
  • Architect Prometheus / Thanos / Mimir, Grafana, Loki, Tempo, OpenTelemetry, and Datadog deployments for high availability and scale
  • Develop standards for service instrumentation
  • Define and enforce SLOs, SLIs, and error budgets
  • Build alerting strategies that minimize noise and surface actionable signals
  • Operate large-scale time-series and log storage platforms
  • Design distributed tracing pipelines
  • Develop self-service tooling and paved-road libraries
  • Drive cost management and label-cardinality discipline across the observability estate
  • Lead incident response readiness improvements
  • Partner with SRE and platform teams to integrate observability into deployment pipelines
  • Evaluate and recommend observability vendors and open-source tools
  • Mentor engineering teams on observability fundamentals
  • Maintain documentation, onboarding guides, and runbooks for the observability platform

Benefits

  • Competitive base salary commensurate with experience
  • Benefits