gameday-planning

Use when planning GameDay exercises, designing failure scenarios, or conducting chaos drills. Covers GameDay preparation, execution, and follow-up.

Marketplace

melodic-software

melodic-software/claude-code-plugins

Plugin

systems-design

plugins/systems-design/skills/gameday-planning/SKILL.md

Last Verified

January 21, 2026

Install Skill

npx add-skill https://github.com/melodic-software/claude-code-plugins/blob/main/plugins/systems-design/skills/gameday-planning/SKILL.md -a claude-code --skill gameday-planning

Installation paths:

Claude
.claude/skills/gameday-planning/

Instructions

# GameDay Planning

A comprehensive guide to planning and executing GameDay exercises: organized chaos drills that test system resilience and incident response.

## When to Use This Skill

- Planning GameDay exercises
- Designing failure scenarios
- Preparing teams for chaos experiments
- Running disaster recovery drills
- Improving incident response readiness

## What is a GameDay?

```text
GameDay = Planned chaos exercise for your systems

Like a fire drill, but for infrastructure:
- Scheduled in advance
- Controlled environment
- Practice for real incidents
- Learn and improve

Not the same as chaos engineering:
- GameDay: Scheduled team exercise
- Chaos engineering: Continuous experiments

GameDays include:
- Failure injection
- Incident response practice
- Team coordination
- Runbook validation
```

## GameDay Types

### By Scope

```text
1. Component GameDay
   └── Single service or component
   └── Focused scenarios
   └── 2-4 hours

2. Service GameDay
   └── Multiple related services
   └── Integration scenarios
   └── Half day

3. Full System GameDay
   └── Complete system
   └── Disaster scenarios
   └── Full day

4. Cross-Team GameDay
   └── Multiple teams involved
   └── Complex scenarios
   └── 1-2 days
```
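
The scope tiers above can also be kept as data so that a planning script or checklist tool can look them up. The following is a minimal sketch, not part of the original skill: the `Scope` and `ScopeProfile` names and the `suggest_scope` helper are hypothetical, and the selection rules in `suggest_scope` are illustrative assumptions rather than guidance from this guide. Only the descriptions and durations are copied from the list above.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    COMPONENT = "component"
    SERVICE = "service"
    FULL_SYSTEM = "full_system"
    CROSS_TEAM = "cross_team"


@dataclass(frozen=True)
class ScopeProfile:
    covers: str
    scenarios: str
    duration: str


# Descriptions and durations are taken from the "By Scope" list.
SCOPE_PROFILES = {
    Scope.COMPONENT: ScopeProfile(
        "Single service or component", "Focused scenarios", "2-4 hours"),
    Scope.SERVICE: ScopeProfile(
        "Multiple related services", "Integration scenarios", "Half day"),
    Scope.FULL_SYSTEM: ScopeProfile(
        "Complete system", "Disaster scenarios", "Full day"),
    Scope.CROSS_TEAM: ScopeProfile(
        "Multiple teams involved", "Complex scenarios", "1-2 days"),
}


def suggest_scope(service_count: int, team_count: int,
                  whole_system: bool = False) -> Scope:
    """Pick a starting scope.

    Hypothetical heuristic: the ordering and thresholds here are
    assumptions for illustration, not rules from the guide.
    """
    if team_count > 1:
        return Scope.CROSS_TEAM
    if whole_system:
        return Scope.FULL_SYSTEM
    if service_count > 1:
        return Scope.SERVICE
    return Scope.COMPONENT


if __name__ == "__main__":
    scope = suggest_scope(service_count=3, team_count=1)
    profile = SCOPE_PROFILES[scope]
    print(f"{scope.value}: {profile.covers} ({profile.duration})")
```

Keeping the durations next to the scope makes it easier to check early whether the calendar slot being booked matches the kind of exercise being planned.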

### By Objective

```text
1. Resilience validation
   └── Does the system handle failures?

2. Recovery practice
   └── Can we restore from backup?

3. Incident response training
   └── How well do we coordinate?

4. Runbook validation
   └── Do our runbooks work?

5. Capacity testing
   └── What happens under load?
```
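
Each objective above is paired with one guiding question, which makes it easy to turn the chosen objectives into a debrief agenda. The sketch below assumes a hypothetical `Objective` enum and `debrief_questions` helper (neither comes from the original skill). The questions themselves are copied verbatim from the list.

```python
from enum import Enum
from typing import Iterable, List


class Objective(Enum):
    # Each value is the guiding question from the "By Objective" list.
    RESILIENCE_VALIDATION = "Does the system handle failures?"
    RECOVERY_PRACTICE = "Can we restore from backup?"
    INCIDENT_RESPONSE_TRAINING = "How well do we coordinate?"
    RUNBOOK_VALIDATION = "Do our runbooks work?"
    CAPACITY_TESTING = "What happens under load?"


def debrief_questions(objectives: Iterable[Objective]) -> List[str]:
    """Return one agenda line per selected objective, in the order given."""
    return [
        f"{obj.name.replace('_', ' ').capitalize()}: {obj.value}"
        for obj in objectives
    ]


if __name__ == "__main__":
    selected = [Objective.RECOVERY_PRACTICE, Objective.RUNBOOK_VALIDATION]
    for line in debrief_questions(selected):
        print("-", line)
```

Picking one or two objectives per GameDay keeps the agenda short enough that every question can actually be answered during the follow-up.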

## Planning Phase

### Timeline Overview

```text
Week -4: Initial planning
├── Define objectives
├── Identify stakeholders
└── Draft scenario ideas

Week -3: Scenario design
├── Detail failure scenarios
├── Define success criteria
└── Identify risks

Week -2: Preparation
├── Review with stakeholders
├── Prepare monitoring
├── Update runbooks
└── Brief participants

Week -1: Final prep
├── Confirm participants
├── Test monitoring
├── Walkthroug
