Verify test quality by injecting mutations into code and measuring catch rate. Calculate mutation score (0.00-1.00) showing test effectiveness. Auto-generate missing tests to improve coverage. Integrate with Serena for continuous mutation tracking. Use when: improving test quality, validating test effectiveness, generating missing test cases, measuring code coverage gaps.
January 21, 2026
`npx add-skill https://github.com/krzemienski/shannon-framework/blob/main/skills/mutation-testing/SKILL.md -a claude-code --skill mutation-testing`
Installation paths:
`.claude/skills/mutation-testing/`
# Mutation Testing - Quantified Test Quality
## Purpose
Measure test effectiveness by injecting mutations (intentional bugs) into code and tracking how many your tests catch. Calculates mutation score (0.00-1.00) to quantify test coverage gaps. Auto-generates additional tests targeting uncaught mutations. Integrates with Serena MCP for continuous tracking and trend analysis.
## When to Use
- Measuring test effectiveness beyond code coverage
- Identifying weak test areas (low mutation scores)
- Auto-generating tests for code where mutations go uncaught
- Validating test quality gates (require 0.80+ mutation score)
- Tracking mutation improvements over time
- Comparing mutation scores across teams/projects
## Core Metrics
**Mutation Score Calculation:**
```
Score = Killed Mutations / Total Mutations
Range: 0.00 (no tests catch bugs) to 1.00 (all bugs caught)
```
**Score Interpretation:**
- 0.90+ Excellent test suite, catches most bugs
- 0.80-0.89 Good coverage, minor gaps
- 0.70-0.79 Acceptable but needs improvement
- <0.70 Poor coverage, significant blind spots
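The formula and score bands above can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill itself: the function names `mutation_score` and `interpret_score` and the short band labels are assumptions made for the example.

```python
def mutation_score(killed: int, total: int) -> float:
    """Return killed / total, rounded to two decimals (range 0.00 to 1.00)."""
    if total <= 0:
        raise ValueError("total must be a positive number of mutations")
    if not 0 <= killed <= total:
        raise ValueError("killed must be between 0 and total")
    return round(killed / total, 2)


def interpret_score(score: float) -> str:
    """Map a score onto the bands listed above (labels are hypothetical)."""
    if score >= 0.90:
        return "excellent"
    if score >= 0.80:
        return "good"
    if score >= 0.70:
        return "acceptable"
    return "poor"


if __name__ == "__main__":
    # 156 killed out of 179 total mutations
    score = mutation_score(156, 179)
    print(score, interpret_score(score))  # prints: 0.87 good
```

A quality gate that requires 0.80 or higher could simply fail the build when `interpret_score` returns `"acceptable"` or `"poor"`.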
## Workflow
### Phase 1: Mutation Generation & Execution
1. **Inject mutations**: Use a mutation tool such as Stryker (JavaScript/TypeScript), PIT (Java), or mutmut (Python) to inject bugs
2. **Run tests**: Execute full test suite
3. **Track kills**: Count caught vs escaped mutations
4. **Calculate score**: Derive 0.00-1.00 metric
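To make the four steps concrete, here is a toy sketch of the loop using only Python's standard `ast` module. It is not how Stryker, PIT, or mutmut work internally; the sample `SOURCE`, the operator swap tables, and the `run_tests` helper are all assumptions made for illustration.

```python
import ast
import copy

# Hypothetical code under test.
SOURCE = """
def add(a, b):
    return a + b

def is_adult(age):
    return age >= 18
"""

# Illustrative operator swaps; real tools apply many more mutation kinds.
BINOP_SWAPS = {ast.Add: ast.Sub, ast.Sub: ast.Add}
CMP_SWAPS = {ast.GtE: ast.Gt, ast.Gt: ast.GtE, ast.Lt: ast.LtE, ast.LtE: ast.Lt}


def swapped_op(node):
    """Return the replacement operator class for a node, or None."""
    if isinstance(node, ast.BinOp):
        return BINOP_SWAPS.get(type(node.op))
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        return CMP_SWAPS.get(type(node.ops[0]))
    return None


def generate_mutants(source):
    """Step 1: yield one mutated module AST per swappable operator."""
    tree = ast.parse(source)
    targets = [i for i, n in enumerate(ast.walk(tree)) if swapped_op(n)]
    for index in targets:
        mutant = copy.deepcopy(tree)
        node = list(ast.walk(mutant))[index]
        new_op = swapped_op(node)()
        if isinstance(node, ast.BinOp):
            node.op = new_op
        else:
            node.ops = [new_op]
        yield ast.fix_missing_locations(mutant)


def run_tests(namespace):
    """Step 2: a tiny test suite. Note that it never checks age == 18."""
    try:
        assert namespace["add"](2, 3) == 5
        assert namespace["is_adult"](30)
        assert not namespace["is_adult"](10)
        return True
    except Exception:
        return False


def run_mutation_testing(source):
    """Steps 3 and 4: count killed vs escaped mutants and derive the score."""
    killed = escaped = 0
    for mutant in generate_mutants(source):
        namespace = {}
        exec(compile(mutant, "<mutant>", "exec"), namespace)
        if run_tests(namespace):
            escaped += 1  # tests still pass, so the bug went unnoticed
        else:
            killed += 1
    total = killed + escaped
    score = round(killed / total, 2) if total else 0.0
    return {"killed": killed, "escaped": escaped, "score": score}


if __name__ == "__main__":
    print(run_mutation_testing(SOURCE))
```

In this sketch the `+` to `-` mutant is killed, while the `>=` to `>` mutant escapes because no test exercises the boundary value 18, giving a score of 0.50. Adding `assert namespace["is_adult"](18)` to the suite would kill it.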
### Phase 2: Serena Integration
1. **Push metrics**: Send mutation_score, killed_count, escaped_count to Serena
2. **Track history**: Store scores by commit, branch, timestamp
3. **Alert on regression**: Flag if the score drops by more than 0.05
4. **Trend analysis**: Show mutation score trajectory
**Serena Push Example:**
```json
{
  "metric_type": "mutation_score",
  "project": "task-app",
  "value": 0.87,
  "components": {
    "auth": 0.92,
    "api": 0.85,
    "ui": 0.79
  },
  "killed": 156,
  "escaped": 23,
  "timestamp": "2025-11-20T10:30:00Z"
}
```
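A payload of this shape, together with the 0.05 regression rule, could be produced along the following lines. This is a sketch under assumptions: `build_payload` and `is_regression` are hypothetical helper names, and the actual call that sends the data to Serena is deliberately left out because the source does not specify it.

```python
import json
from datetime import datetime, timezone

REGRESSION_THRESHOLD = 0.05  # flag drops larger than this, per the list above


def build_payload(project, killed, escaped, components, timestamp):
    """Assemble a dict matching the example payload (hypothetical helper).

    `timestamp` is assumed to be a timezone-aware datetime.
    """
    total = killed + escaped
    if total == 0:
        raise ValueError("no mutations were executed")
    utc = timestamp.astimezone(timezone.utc)
    return {
        "metric_type": "mutation_score",
        "project": project,
        "value": round(killed / total, 2),
        "components": dict(components),
        "killed": killed,
        "escaped": escaped,
        "timestamp": utc.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def is_regression(previous, current, threshold=REGRESSION_THRESHOLD):
    """Return True if the score fell by more than `threshold`.

    The difference is rounded so float noise (0.87 - 0.82 is slightly
    above 0.05 in binary floating point) does not trigger a false alert.
    """
    return round(previous - current, 4) > threshold


if __name__ == "__main__":
    payload = build_payload(
        "task-app", 156, 23,
        {"auth": 0.92, "api": 0.85, "ui": 0.79},
        datetime(2025, 11, 20, 10, 30, tzinfo=timezone.utc),
    )
    print(json.dumps(payload, indent=2))
    print(is_regression(previous=0.87, current=0.81))  # prints: True
```

Rounding before comparing is a design choice for this sketch; a drop of exactly 0.05 is treated as within tolerance.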
### Phase 3: Gap Analysis & Test Generation
1. **Identify mutations**: List escaped mutations (bugs tests mi