Root cause analysis for complex bugs. Use when initial fix fails, bug keeps returning, or incident is severe. Triggers on "root cause", "why does this keep happening", "trace the bug".
View on GitHubhope/skills/trace/SKILL.md
February 5, 2026
Select agents to install to:
npx add-skill https://github.com/saadshahd/moo.md/blob/main/hope/skills/trace/SKILL.md -a claude-code --skill traceInstallation paths:
.claude/skills/trace/# trace Root cause analysis when surface fixes fail. **Every incident is a learning opportunity.** We're going to systematically understand what happened — no blame, just understanding. Let's trace this together. ## When to Use | Trigger | Action | | ---------------------------------- | --------- | | Bug persists after initial fix | Run trace | | Production incident (SEV 1-2) | Run trace | | Complex failure (multiple sources) | Run trace | | Trivial bug (< 10 min fix) | Skip | ## 1. Timeline ``` HH:MM - [Event] - [System state] - [Action taken] ``` Note: detection time vs. start time (how long hidden?) ## 2. Five Whys ``` Effect: [What users/systems experienced] Why 1: [Immediate cause] (X-Y% confident) Why 2: [Underlying mechanism] (X-Y% confident) Why 3: [System/process gap] (X-Y% confident) Why 4: [Organizational factor] (X-Y% confident) Why 5: [Root cause] (X-Y% confident) ``` **Gates**: < 70% any level → request logs/reproduction Root cause = deepest answer with ≥70% confidence ## 3. Contributing Factors Beyond root cause, what amplified impact? **Process**: Monitoring gaps, testing gaps, review gaps **People**: Unclear ownership, missing runbooks **Technical**: Dependency failures, capacity limits, config drift **Context**: Traffic patterns, deployment timing, external factors List 2-4 factors with specific mechanisms. **Tools:** [Ishikawa](../soul/references/tools/ishikawa.md) for categorization, [Iceberg](../soul/references/tools/iceberg.md) for deep structure analysis. ## 4. Impact **Users**: Count, experience, business cost (quantified) **Systems**: Downstream effects, data integrity, security ## 5. Prevention Hierarchy ### Immediate (< 1 week) ``` 1. [Code/config change] File: [path:line] Verification: [how to confirm] Owner: [who] ``` ### Short-Term (< 1 month) ``` 1. [Alert/test/automation] Trigger: [when it fires] Owner: [who] ``` ### Long-Term (< 1 quarter) ```