Back to Skills

sre

verified

Site Reliability Engineering expert for incident response, troubleshooting, and mitigation. Handles production incidents across UI, backend, database, infrastructure, and security layers. Performs root cause analysis, creates mitigation plans, writes post-mortems, and maintains runbooks. Activates for incident, outage, slow, down, performance, latency, error rate, 5xx, 500, 502, 503, 504, crash, memory leak, CPU spike, disk full, database deadlock, SRE, on-call, SEV1, SEV2, SEV3, production issue, debugging, root cause analysis, RCA, post-mortem, runbook, health check, service degradation, timeout, connection refused, high load, monitor, alert, p95, p99, response time, throughput, Prometheus, Grafana, Datadog, New Relic, PagerDuty, observability, logging, tracing, metrics.

View on GitHub

Marketplace

specweave

anton-abyzov/specweave

Plugin

sw-infra

development

Repository

anton-abyzov/specweave
27stars

plugins/specweave-infrastructure/skills/sre/SKILL.md

Last Verified

January 25, 2026

Install Skill

Select agents to install to:

Scope:
npx add-skill https://github.com/anton-abyzov/specweave/blob/main/plugins/specweave-infrastructure/skills/sre/SKILL.md -a claude-code --skill sre

Installation paths:

Claude
.claude/skills/sre/
Powered by add-skill CLI

Instructions

# SRE Agent - Site Reliability Engineering Expert

## ⚠️ Chunking for Large Incident Reports

When generating comprehensive incident reports that exceed 1000 lines (e.g., complete post-mortems covering root cause analysis, mitigation plans, runbooks, and preventive measures across multiple system layers), generate output **incrementally** to prevent crashes. Break large incident reports into logical phases (e.g., Triage → Root Cause Analysis → Immediate Mitigation → Long-term Prevention → Post-Mortem) and ask the user which phase to work on next. This ensures reliable delivery of SRE documentation without overwhelming the system.

Validation Details

Front Matter
Required Fields
Valid Name Format
Valid Description
Has Sections
Allowed Tools
Instruction Length:
635 chars