Data validation and pipeline testing utilities for ML training projects. Validates datasets, model checkpoints, training pipelines, and dependencies. Use when validating training data, checking model outputs, testing ML pipelines, verifying dependencies, debugging training failures, or ensuring data quality before training.
View on GitHubFebruary 1, 2026
Select agents to install to:
npx add-skill https://github.com/vanman2024/ai-dev-marketplace/blob/main/plugins/ml-training/skills/validation-scripts/SKILL.md -a claude-code --skill validation-scriptsInstallation paths:
.claude/skills/validation-scripts/# ML Training Validation Scripts **Purpose:** Production-ready validation and testing utilities for ML training workflows. Ensures data quality, model integrity, pipeline correctness, and dependency availability before and during training. **Activation Triggers:** - Validating training datasets before fine-tuning - Checking model checkpoints and outputs - Testing end-to-end training pipelines - Verifying system dependencies and GPU availability - Debugging training failures or data issues - Ensuring data format compliance - Validating model configurations - Testing inference pipelines **Key Resources:** - `scripts/validate-data.sh` - Comprehensive dataset validation - `scripts/validate-model.sh` - Model checkpoint and config validation - `scripts/test-pipeline.sh` - End-to-end pipeline testing - `scripts/check-dependencies.sh` - System and package dependency checking - `templates/test-config.yaml` - Pipeline test configuration template - `templates/validation-schema.json` - Data validation schema template - `examples/data-validation-example.md` - Dataset validation workflow - `examples/pipeline-testing-example.md` - Complete pipeline testing guide ## Quick Start ### Validate Training Data ```bash bash scripts/validate-data.sh \ --data-path ./data/train.jsonl \ --format jsonl \ --schema templates/validation-schema.json ``` ### Validate Model Checkpoint ```bash bash scripts/validate-model.sh \ --model-path ./checkpoints/epoch-3 \ --framework pytorch \ --check-weights ``` ### Test Complete Pipeline ```bash bash scripts/test-pipeline.sh \ --config templates/test-config.yaml \ --data ./data/sample.jsonl \ --verbose ``` ### Check System Dependencies ```bash bash scripts/check-dependencies.sh \ --framework pytorch \ --gpu-required \ --min-vram 16 ``` ## Validation Scripts ### 1. Data Validation (validate-data.sh) **Purpose:** Validate training datasets for format compliance, data quality, and schema conformance. **Usage:** ```bash bash s