Back to Skills

troubleshooting-health-checks

verified

Debugs and troubleshoots Mission Control health checks by analyzing check configurations, reviewing failure patterns, and identifying root causes. Use when users ask about failing health checks, mention specific health check names or IDs, inquire why a health check is failing or unhealthy, or need help understanding health check errors and timeouts.

View on GitHub

Marketplace

flanksource-mission-control

flanksource/claude-code-plugin

Plugin

mission-control-skills

Repository

flanksource/claude-code-plugin

skills/troubleshooting-health-checks/SKILL.md

Last Verified

January 21, 2026

Install Skill

Select agents to install to:

Scope:
npx add-skill https://github.com/flanksource/claude-code-plugin/blob/main/skills/troubleshooting-health-checks/SKILL.md -a claude-code --skill troubleshooting-health-checks

Installation paths:

Claude
.claude/skills/troubleshooting-health-checks/
Powered by add-skill CLI

Instructions

# Health Check Troubleshooting Skill

## Core Purpose

This skill enables Claude to troubleshoot Mission Control health checks by analyzing check configurations, diagnosing failure patterns, identifying timeout and error root causes, and recommending configuration adjustments to improve reliability.

Note: Read @skills/troubleshooting-health-checks/reference/query-syntax.md to for query syntax

## Health check troubleshooting workflow

Copy this checklist and track your progress:

```
Troubleshooting Progress:
- [ ] Step 1: Gather health check information
- [ ] Step 2: Analyze failure patterns
- [ ] Step 3: Cross-reference configuration issues
- [ ] Step 4: Create diagnostic summary
- [ ] Step 5: Verify remediation steps
```

## Gather health check information

To begin with, get the id of the check in question.
Use `search_health_checks` with query syntax to find checks. Read @skills/troubleshooting-health-checks/reference/query-syntax.md to for query syntax
Else, if you could not get the health check Id from the user provided name, use `list_all_checks` to get complete metadata for all health checks .

Then, follow this procedure:

- **Historical Context**: Use `get_check_status` to retrieve execution history
- **Investigate the check specification**: Understand the intention of the check.
- **Investiagte the chagnes to the canray**: Use `search_catalog_changes(<canary_uuid>)` to get the changes on the canary.
  Look for the change details to see any new changes on the specification.

## Analyze failure patterns

Examine the historical data to identify patterns. Look for:

- **Intermittent failures**: Passes sometimes, fails others
  - Suggests: Network instability, load-related issues, race conditions
- **Consistent failures**: Always failing
  - Suggests: Configuration error, endpoint down, authentication issue
- **Recent pattern changes**: Was passing, now failing
  - Suggests: Recent deployment, config change, infrastructure change
- **Timeout patterns**: Fail

Validation Details

Front Matter
Required Fields
Valid Name Format
Valid Description
Has Sections
Allowed Tools
Instruction Length:
3886 chars