Use when creating structured operational runbooks for human operators. Covers runbook organization, documentation patterns, and best practices for clear operational procedures.
February 5, 2026
npx add-skill https://github.com/TheBushidoCollective/han/blob/main/plugins/specialized/runbooks/skills/runbook-structure/SKILL.md -a claude-code --skill runbooks-structure
Installation paths:
.claude/skills/runbooks-structure/

# Runbooks - Structure

Creating clear, actionable runbooks for operational tasks, maintenance, and troubleshooting.

## What is a Runbook?

A runbook is step-by-step documentation for operational tasks:

- **Troubleshooting** - Diagnosing and fixing issues
- **Incident Response** - Handling production incidents
- **Maintenance** - Routine operational tasks
- **On-Call** - Reference for on-call engineers

## Basic Runbook Structure

### Minimum Viable Runbook

```markdown
# Service Name: Task/Issue

## Overview
Brief description of what this runbook covers.

## Prerequisites
- Required access/permissions
- Tools needed
- Knowledge required

## Steps

### 1. First Step
Detailed instructions for first action.

### 2. Second Step
Detailed instructions for second action.

## Validation
How to verify the task was completed successfully.

## Rollback (if applicable)
How to undo the changes if needed.
```

## Comprehensive Runbook Template

````markdown
# [Service]: [Task/Issue Title]

**Last Updated:** 2025-01-15
**Owner:** Platform Team
**Severity:** High/Medium/Low
**Estimated Time:** 15 minutes

## Overview
Brief description of the problem or task this runbook addresses.

## When to Use This Runbook
- Alert fired: `high_cpu_usage`
- Customer report: slow response times
- Scheduled maintenance window

## Prerequisites
- [ ] VPN access to production network
- [ ] AWS console access (read/write)
- [ ] kubectl configured for production cluster
- [ ] Slack access to #incidents channel

## Context

### Architecture Overview
Brief explanation of relevant system architecture.

### Common Causes
- Database connection pool exhaustion
- Memory leaks in worker processes
- Third-party API rate limiting

## Diagnosis Steps

### 1. Check System Health

```bash
# Check pod status
kubectl get pods -n production

# Expected output: All pods Running
```

**Decision Point:** If pods are CrashLooping, proceed to step 2. Otherwise, skip to step 3.

### 2. Check Application Logs

```bash
#
```
````
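The Minimum Viable Runbook layout above can also be checked mechanically. The following is a minimal sketch, not part of the skill itself: the helper name `missing_sections` and the choice of required sections (Overview, Prerequisites, Steps, Validation, with Rollback treated as optional) are assumptions made for illustration.

```python
import re

# Section names taken from the Minimum Viable Runbook template.
# Rollback is marked "(if applicable)" there, so it is treated as optional here.
REQUIRED_SECTIONS = ["Overview", "Prerequisites", "Steps", "Validation"]


def missing_sections(runbook_text, required=REQUIRED_SECTIONS):
    """Return the required section names that have no matching '## ' heading."""
    # Collect level-2 headings only; '### 1. First Step' style headings are skipped.
    headings = re.findall(r"^## +(.+?)\s*$", runbook_text, flags=re.MULTILINE)
    found = [h.lower() for h in headings]
    # Match on the start of the heading so a suffix such as '(if applicable)'
    # does not break the comparison.
    return [name for name in required
            if not any(h.startswith(name.lower()) for h in found)]


sample = """# Service Name: Task/Issue

## Overview
Brief description of what this runbook covers.

## Prerequisites
- Required access/permissions

## Steps
### 1. First Step
Detailed instructions for first action.
"""

print(missing_sections(sample))  # prints ['Validation']
```

A check like this only confirms that the headings exist, not that the steps under them are usable. It also follows the minimum template's names, so a runbook built on the comprehensive template (which uses `## Diagnosis Steps`) would need its own list of required sections.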