Execute Clay incident response procedures with triage, mitigation, and postmortem. Use when responding to Clay-related outages, investigating errors, or running post-incident reviews for Clay integration failures. Trigger with phrases like "clay incident", "clay outage", "clay down", "clay on-call", "clay emergency", "clay broken".
View on GitHub: jeremylongshore/claude-code-plugins-plus-skills
clay-pack
plugins/saas-packs/clay-pack/skills/clay-incident-runbook/SKILL.md
February 1, 2026
# Clay Incident Runbook
## Overview
Rapid incident response procedures for Clay-related outages.
## Prerequisites
- Access to Clay dashboard and status page
- kubectl access to production cluster
- Prometheus/Grafana access
- Communication channels (Slack, PagerDuty)
## Severity Levels
| Level | Definition | Response Time | Examples |
|-------|------------|---------------|----------|
| P1 | Complete outage | < 15 min | Clay API unreachable |
| P2 | Degraded service | < 1 hour | High latency, partial failures |
| P3 | Minor impact | < 4 hours | Webhook delays, non-critical errors |
| P4 | No user impact | Next business day | Monitoring gaps |
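The table above can be encoded as a small lookup so that paging scripts or chat bots quote the same response targets. This is a minimal sketch; the function name `response_target` is hypothetical and not part of any Clay tooling.

```bash
# Map a severity level from the table above to its response-time target.
# response_target is an illustrative helper name, not an existing tool.
response_target() {
  case "$1" in
    P1) echo "15 min" ;;
    P2) echo "1 hour" ;;
    P3) echo "4 hours" ;;
    P4) echo "next business day" ;;
    *)  echo "unknown severity: $1" >&2; return 1 ;;
  esac
}
```

For example, `echo "Respond within: $(response_target P2)"` could be dropped into an alert message template.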
## Quick Triage
```bash
# 1. Check Clay status (the page is HTML, so jq cannot parse it; confirm it responds, then read it in a browser)
curl -sI https://status.clay.com | head -n 1
# 2. Check our integration health
curl -s https://api.yourapp.com/health | jq '.services.clay'
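# 3. Check error rate (last 5 min); -G with --data-urlencode keeps the shell and curl from mangling ( ) and [ ]
curl -sG localhost:9090/api/v1/query --data-urlencode 'query=rate(clay_errors_total[5m])' | jq '.data.result'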
# 4. Recent error logs (--tail=-1 lifts the 10-lines-per-pod default that applies when -l is used)
kubectl logs -l app=clay-integration --since=5m --tail=-1 | grep -i error | tail -20
```
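During an incident it helps if one failing or hanging check does not stop the rest of the triage. The wrapper below is a sketch of that idea, not part of the original runbook; `run_check` and its output format are assumptions chosen for illustration.

```bash
# Run one triage check, print a labelled pass/fail line, and hand back the
# check's exit code so the caller can decide whether to continue.
# run_check is a hypothetical helper name.
run_check() {
  check_label=$1
  shift
  printf '== %s ==\n' "$check_label"
  check_rc=0
  "$@" || check_rc=$?
  if [ "$check_rc" -eq 0 ]; then
    printf '[ok] %s\n' "$check_label"
  else
    printf '[FAIL] %s (exit %s)\n' "$check_label" "$check_rc"
  fi
  return "$check_rc"
}
```

A possible use is `run_check "integration health" curl -sf --max-time 5 https://api.yourapp.com/health`, where `--max-time` caps how long a stalled endpoint can hold up the remaining checks.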
## Decision Tree
```
Clay API returning errors?
├─ YES: Is status.clay.com showing an incident?
│   ├─ YES → Wait for Clay to resolve. Enable fallback.
│   └─ NO → Our integration issue. Check credentials, config.
└─ NO: Is our service healthy?
    ├─ YES → Likely resolved or intermittent. Monitor.
    └─ NO → Our infrastructure issue. Check pods, memory, network.
```
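The tree above can also be written as a function, which makes the branching explicit and easy to call from a triage script. This is a sketch: `triage_decision` is a hypothetical name, and the three yes/no answers are assumed to be supplied by whoever ran the quick triage checks.

```bash
# Encode the decision tree. Each argument is "yes" or "no":
#   $1 = Clay API returning errors?
#   $2 = status.clay.com showing an incident?
#   $3 = our service healthy?
# triage_decision is an illustrative helper name.
triage_decision() {
  if [ "$1" = yes ]; then
    if [ "$2" = yes ]; then
      echo "Wait for Clay to resolve. Enable fallback."
    else
      echo "Our integration issue. Check credentials, config."
    fi
  elif [ "$3" = yes ]; then
    echo "Likely resolved or intermittent. Monitor."
  else
    echo "Our infrastructure issue. Check pods, memory, network."
  fi
}
```

For instance, `triage_decision yes no yes` prints the integration-issue branch, which points to the authentication checks in the next section.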
## Immediate Actions by Error Type
### 401/403 - Authentication
```bash
# Verify API key is set (prints the byte count only, so the key never lands in the terminal or scrollback)
kubectl get secret clay-secrets -o jsonpath='{.data.api-key}' | base64 -d | wc -c
# Check if key was rotated
# → Verify in Clay dashboard
# Remediation: Update secret and restart pods
kubectl create secret generic clay-secrets --from-literal=api-key=NEW_KEY --dry-run=client -o yaml | kubectl apply -f -
kubectl rollout restart deployment/clay-integration
```
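To check whether the key was rotated without printing it in full, one option is to compare only its length and last few characters against what the Clay dashboard shows. The helper below is a sketch; `key_fingerprint` is a hypothetical name, and it assumes the dashboard exposes enough of the key to make that comparison.

```bash
# Read a secret on stdin and print only its length and last 4 characters.
# key_fingerprint is an illustrative helper name.
key_fingerprint() {
  fp_key=$(cat)   # command substitution strips any trailing newline
  if [ -z "$fp_key" ]; then
    echo "empty"
    return 1
  fi
  fp_suffix=$(printf '%s' "$fp_key" | tail -c 4)
  printf 'length=%s suffix=%s\n' "${#fp_key}" "$fp_suffix"
}
```

It could be fed from the secret directly: `kubectl get secret clay-secrets -o jsonpath='{.data.api-key}' | base64 -d | key_fingerprint`.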
### 429 - Rate Limited
```bash
# Check rate limit headers (also match Retry-After, which may accompany a 429)
curl -sv -o /dev/null https://api.clay.com 2>&1 | grep -iE 'rate|retry-after'
# Enable request queuing