analyzing-test-effectiveness

Use to audit test quality with Google Fellow SRE scrutiny: identifies tautological tests, coverage gaming, weak assertions, and missing corner cases. Creates a bd epic with tasks for improvements, then runs SRE task refinement on each.

Marketplace: withzombies-hyper
Plugin: withzombies-hyper
Repository: withzombies/hyperpowers (25 stars)

skills/analyzing-test-effectiveness/SKILL.md

Last Verified: January 21, 2026

Install Skill

```shell
npx add-skill https://github.com/withzombies/hyperpowers/blob/main/skills/analyzing-test-effectiveness/SKILL.md -a claude-code --skill analyzing-test-effectiveness
```

Installation paths:

Claude: .claude/skills/analyzing-test-effectiveness/

Instructions

<skill_overview>
Audit test suites for real effectiveness, not vanity metrics. Identify tests that provide false confidence (tautological, mock-testing, line hitters) and missing corner cases. Create bd epic with tracked tasks for improvements. Run SRE task refinement on each task before execution.

**CRITICAL MINDSET: Assume tests were written by junior engineers optimizing for coverage metrics.** Default to skeptical—a test is RED or YELLOW until proven GREEN. You MUST read production code before categorizing tests. GREEN is the exception, not the rule.
</skill_overview>
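To make the skeptical default concrete, here is a minimal, hypothetical sketch of a tautological (RED) test versus an independent assertion. The function `add_tax` and both test names are invented for illustration; they are not part of this skill.

```python
# Hypothetical production code under audit (illustrative only).
def add_tax(price: float, rate: float) -> float:
    return price * (1 + rate)

def test_add_tax_tautological():
    # RED: the "expected" value is computed by the function under test,
    # so this assertion passes no matter what add_tax returns.
    assert add_tax(100.0, 0.5) == add_tax(100.0, 0.5)

def test_add_tax_independent():
    # Better: the expected value is derived independently of the code,
    # so a bug in the rate arithmetic would make this test fail.
    assert add_tax(100.0, 0.5) == 150.0
```

The tautological variant survives any mutation of `add_tax`, which is exactly the false confidence this skill is meant to flag.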

<rigidity_level>
MEDIUM FREEDOM - Follow the 8-phase analysis process exactly. Categorization criteria (RED/YELLOW/GREEN) are rigid. Corner case discovery adapts to the specific codebase. Output format is flexible but must include all sections.
</rigidity_level>

<quick_reference>
| Phase | Action | Output |
|-------|--------|--------|
| 1. Inventory | List all test files and functions | Test catalog |
| 2. Read Production Code | Read the actual code each test claims to test | Context for analysis |
| 3. Trace Call Paths | Verify tests exercise production, not mocks/utilities | Call path verification |
| 4. Categorize (Skeptical) | Apply RED/YELLOW/GREEN - default to harsher rating | Categorized tests |
| 5. Self-Review | Challenge every GREEN - would a senior SRE agree? | Validated categories |
| 6. Corner Cases | Identify missing edge cases per module | Gap analysis |
| 7. Prioritize | Rank by business criticality | Priority matrix |
| 8. bd Issues | Create epic + tasks, run SRE refinement | Tracked improvement plan |

**MANDATORY: Read production code BEFORE categorizing tests. You cannot assess a test without understanding what it claims to test.**

**Core Questions for Each Test:**
1. What bug would this catch? (If you can't name one → RED)
2. Does it exercise PRODUCTION code or a mock/test utility? (Mock → RED or YELLOW)
3. Could code break while test passes? (If yes → YELLOW or RED)
4. Mea
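As a hedged sketch of core question 2, here is a hypothetical test that asserts only on a mock's canned behavior, next to one that exercises a production branch. The names `charge`, `gateway`, and `submit` are invented for illustration.

```python
from unittest.mock import Mock

# Hypothetical production code: validates the amount, then delegates.
def charge(gateway, amount):
    if amount <= 0:
        raise ValueError("amount must be positive")
    return gateway.submit(amount)

def test_charge_exercises_only_the_mock():
    # RED/YELLOW: the assertion verifies the mock's configured return
    # value; the validation branch in charge() could be deleted and this
    # test would still pass.
    gateway = Mock()
    gateway.submit.return_value = "ok"
    assert charge(gateway, 10) == "ok"

def test_charge_rejects_nonpositive_amount():
    # Exercises the production branch the mock-only test misses.
    try:
        charge(Mock(), 0)
        assert False, "expected ValueError"
    except ValueError:
        pass
```

Tracing the call path (phase 3) is what separates these two: the first test's meaningful assertion terminates in `Mock`, not in production code.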
