Create structured incident response runbooks with step-by-step procedures, escalation paths, and recovery actions. Use when building runbooks, responding to incidents, or establishing incident response procedures.
January 19, 2026
npx add-skill https://github.com/wshobson/agents/blob/main/plugins/incident-response/skills/incident-runbook-templates/SKILL.md -a claude-code --skill incident-runbook-templates

Installation paths:
.claude/skills/incident-runbook-templates/

# Incident Runbook Templates

Production-ready templates for incident response runbooks covering detection, triage, mitigation, resolution, and communication.

## When to Use This Skill

- Creating incident response procedures
- Building service-specific runbooks
- Establishing escalation paths
- Documenting recovery procedures
- Responding to active incidents
- Onboarding on-call engineers

## Core Concepts

### 1. Incident Severity Levels

| Severity | Impact                     | Response Time     | Example                 |
| -------- | -------------------------- | ----------------- | ----------------------- |
| **SEV1** | Complete outage, data loss | 15 min            | Production down         |
| **SEV2** | Major degradation          | 30 min            | Critical feature broken |
| **SEV3** | Minor impact               | 2 hours           | Non-critical bug        |
| **SEV4** | Minimal impact             | Next business day | Cosmetic issue          |

### 2. Runbook Structure

```
1. Overview & Impact
2. Detection & Alerts
3. Initial Triage
4. Mitigation Steps
5. Root Cause Investigation
6. Resolution Procedures
7. Verification & Rollback
8. Communication Templates
9. Escalation Matrix
```

## Runbook Templates

### Template 1: Service Outage Runbook

````markdown
# [Service Name] Outage Runbook

## Overview

**Service**: Payment Processing Service
**Owner**: Platform Team
**Slack**: #payments-incidents
**PagerDuty**: payments-oncall

## Impact Assessment

- [ ] Which customers are affected?
- [ ] What percentage of traffic is impacted?
- [ ] Are there financial implications?
- [ ] What's the blast radius?

## Detection

### Alerts

- `payment_error_rate > 5%` (PagerDuty)
- `payment_latency_p99 > 2s` (Slack)
- `payment_success_rate < 95%` (PagerDuty)

### Dashboards

- [Payment Service Dashboard](https://grafana/d/payments)
- [Error Tracking](https://sentry.io/payments)
- [Dependency Status](https://status.stripe.com)

## Initial Triage (First 5 Minutes)

#