Use when defining SLOs, selecting SLIs, or implementing error budget policies. Covers reliability targets, SLI selection, and error budget management.
January 21, 2026
npx add-skill https://github.com/melodic-software/claude-code-plugins/blob/main/plugins/systems-design/skills/slo-sli-error-budget/SKILL.md -a claude-code --skill slo-sli-error-budget

Installation paths: .claude/skills/slo-sli-error-budget/

# SLOs, SLIs, and Error Budgets
Patterns and practices for defining service level objectives, selecting meaningful indicators, and managing reliability through error budgets.
## When to Use This Skill
- Defining SLOs for services
- Selecting appropriate SLIs
- Implementing error budget policies
- Balancing reliability and velocity
- Setting up SLO-based alerting
## Core Concepts
### SLI (Service Level Indicator)
```text
SLI = Quantitative measure of service level
What to measure:
- Availability: % of successful requests
- Latency: % of requests faster than threshold
- Throughput: Requests per second
- Error rate: % of failed requests
Formula:
SLI = (good events / total events) × 100%
Example:
Availability SLI = (successful requests / total requests) × 100%
= (99,500 / 100,000) × 100%
= 99.5%
```
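The ratio above can be sketched in a few lines of Python. The function names `compute_sli` and `latency_sli` are hypothetical helpers introduced for illustration, and treating a window with no traffic as 100% is an assumption, not something the skill prescribes.

```python
from typing import Iterable


def compute_sli(good_events: int, total_events: int) -> float:
    """Return the SLI as a percentage: good events out of total events."""
    if good_events < 0 or total_events < 0:
        raise ValueError("event counts must be non-negative")
    if good_events > total_events:
        raise ValueError("good_events cannot exceed total_events")
    if total_events == 0:
        # Assumption: a window with no traffic counts as fully good.
        return 100.0
    return 100.0 * good_events / total_events


def latency_sli(latencies_ms: Iterable[float], threshold_ms: float) -> float:
    """Return the percentage of requests that finished faster than the threshold."""
    samples = list(latencies_ms)
    fast = sum(1 for latency in samples if latency < threshold_ms)
    return compute_sli(fast, len(samples))


# Availability example from the text: 99,500 successes out of 100,000 requests.
availability = compute_sli(99_500, 100_000)
print(f"{availability:.1f}%")  # prints "99.5%"
```

Both helpers reduce to the same good/total ratio. Only the definition of a "good" event changes between availability and latency.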
### SLO (Service Level Objective)
```text
SLO = Target value for an SLI
Format: SLI >= Target over Time Window
Examples:
- 99.9% of requests successful over 30 days
- 95% of requests complete in <200ms over 7 days
- 99.95% availability measured monthly
Components:
┌─────────────────────────────────────────────────────┐
│ SLO = SLI + Target + Time Window                    │
│                                                     │
│ "99.9% of HTTP requests return non-5xx              │
│  over a rolling 30-day window"                      │
└─────────────────────────────────────────────────────┘
```
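One way to make the three components explicit is a small record type with a compliance check. This is a minimal sketch: the `SLO` class, its field names, and the `is_met` method are hypothetical and do not come from any particular SLO tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """An SLO as described above: an SLI, a target, and a time window."""

    sli_name: str          # which indicator is measured, e.g. "availability"
    target_percent: float  # e.g. 99.9
    window_days: int       # length of the rolling window, e.g. 30

    def is_met(self, measured_sli_percent: float) -> bool:
        """Return True when the measured SLI is at or above the target."""
        return measured_sli_percent >= self.target_percent

    def describe(self) -> str:
        """Render the SLO in the 'SLI >= Target over Time Window' format."""
        return (
            f"{self.sli_name} >= {self.target_percent}% "
            f"over {self.window_days} days"
        )


availability_slo = SLO(sli_name="availability", target_percent=99.9, window_days=30)
print(availability_slo.describe())    # prints "availability >= 99.9% over 30 days"
print(availability_slo.is_met(99.5))  # prints "False"
```

Keeping the window in the record matters because the same target means something different over 7 days than over 30.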
### Error Budget
```text
Error Budget = Allowed unreliability
If SLO = 99.9% availability:
Error Budget = 100% - 99.9% = 0.1%
Over 30 days:
Total minutes = 30 × 24 × 60 = 43,200
Error budget = 43,200 × 0.001 = 43.2 minutes
Or in requests (assuming 1M requests/month):
Error budget = 1,000,000 × 0.001 = 1,000 failed requests
Budget consumption:
┌────────────────────────────────────────────────────┐
│ Error Budget Remaining: 65% │
│ ████████████████████░░░░░░░░░░ │
│ Consumed: 35% (15 min of 43.2 m