By Sahil Kapoor - 07 Sep 2025

SLO

A Service Level Objective (SLO) is an internal target for a service's reliability, expressed as a percentage over a window (for example, 99.9% of requests succeed within 300 ms over a rolling 28-day window). SLOs anchor the practice of Site Reliability Engineering by making reliability concrete, measurable, and tradable against feature velocity.

Related terms in the SLI/SLO/SLA family

SLI (Service Level Indicator). The actual measurement, for example the ratio of successful requests to total requests over a window.
SLO (Service Level Objective). The target the SLI must meet, set by the team responsible for the service.
SLA (Service Level Agreement). A contractual commitment to customers, usually looser than the internal SLO so that the team has a safety margin before the contract is breached.
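
To make the SLI/SLO relationship concrete, here is a minimal Python sketch that computes a request-based SLI and compares it to a target. The `Request` type, the function names, and the choice to treat "within 300 ms" as inclusive are assumptions made for illustration; real tooling usually derives these counts from metrics rather than from individual request records.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Request:
    """Hypothetical record of one request observed in the window."""
    ok: bool            # True if the request succeeded
    latency_ms: float   # observed latency in milliseconds


def sli(requests: Sequence[Request], latency_threshold_ms: float = 300.0) -> float:
    """Return the fraction of requests that succeeded within the threshold."""
    if not requests:
        raise ValueError("cannot compute an SLI over an empty window")
    good = sum(
        1 for r in requests
        if r.ok and r.latency_ms <= latency_threshold_ms  # assumption: inclusive bound
    )
    return good / len(requests)


def meets_slo(sli_value: float, slo_target: float = 0.999) -> bool:
    """Return True if the measured SLI is at or above the SLO target."""
    return sli_value >= slo_target


# Illustrative window: 998 good requests, 1 failure, 1 slow success.
window = (
    [Request(ok=True, latency_ms=120.0)] * 998
    + [Request(ok=False, latency_ms=80.0)]
    + [Request(ok=True, latency_ms=950.0)]
)
measured = sli(window)
print(measured, meets_slo(measured))
```

In this made-up window the SLI is 998 good requests out of 1,000, which is below a 99.9% target, so the SLO is missed even though only two requests went wrong.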

Error budgets

The complement of an SLO is the error budget: the amount of failure the SLO permits. A 99.9% availability SLO leaves a 0.1% error budget, roughly 43 minutes of full downtime in a 30-day month (about 40 minutes over a 28-day window). The error budget is a quota: while budget remains, the team can ship faster and take more risk; once it is depleted, the team prioritises reliability work over features. This frames reliability as a continuous business decision rather than a binary up/down state.
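
The arithmetic behind an error budget can be sketched in a few lines of Python. The function names below are hypothetical, and the "ship while budget remains" rule is only one possible policy, shown here as an assumption rather than a standard.

```python
from datetime import timedelta


def error_budget(slo_target: float, window: timedelta) -> timedelta:
    """Time-based budget: the share of the window the SLO allows to be bad."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    return window * (1.0 - slo_target)


def budget_remaining(slo_target: float, total_events: int, bad_events: int) -> float:
    """Event-based budget: fraction of the budget still unspent.

    A negative result means the budget is overspent.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    allowed_bad = (1.0 - slo_target) * total_events
    return 1.0 - bad_events / allowed_bad


def can_take_risk(remaining_fraction: float) -> bool:
    """Assumed policy: keep shipping features only while budget remains."""
    return remaining_fraction > 0.0


budget = error_budget(0.999, timedelta(days=30))
print(budget.total_seconds() / 60)  # budget in minutes for a 30-day window

remaining = budget_remaining(0.999, total_events=1_000_000, bad_events=250)
print(remaining, can_take_risk(remaining))
```

With a 99.9% target and one million requests, 1,000 may fail; 250 failures spend a quarter of the budget and leave about three quarters for further risk.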

Common tooling

Nobl9, Cortex SLO, Sloth (open source), Datadog SLOs, New Relic SLM, Grafana SLO

Related Terms
SLI, Observability, Metrics, Prometheus, Grafana.
