Design DAG-based MLOps pipeline architectures with Airflow, Dagster, Kubeflow, or Prefect. Activates for DAG orchestration, workflow automation, pipeline design patterns, CI/CD for ML. Use for platform-agnostic MLOps infrastructure - NOT for SpecWeave increment-based ML (use ml-pipeline-orchestrator instead).
View on GitHub: anton-abyzov/specweave
sw-ml
January 25, 2026
npx add-skill https://github.com/anton-abyzov/specweave/blob/main/plugins/specweave-ml/skills/mlops-dag-builder/SKILL.md -a claude-code --skill mlops-dag-builder
Installation paths:
.claude/skills/mlops-dag-builder/
# MLOps DAG Builder
Design and implement DAG-based ML pipeline architectures using production orchestration tools.
## Overview
This skill provides guidance for building **platform-agnostic MLOps pipelines** using DAG orchestrators (Airflow, Dagster, Kubeflow, Prefect). It focuses on workflow architecture, not SpecWeave integration.
**When to use this skill vs ml-pipeline-orchestrator:**
- **Use this skill**: General MLOps architecture, Airflow/Dagster DAGs, cloud ML platforms
- **Use ml-pipeline-orchestrator**: SpecWeave increment-based ML development with experiment tracking
## When to Use This Skill
- Designing DAG-based workflow orchestration (Airflow, Dagster, Kubeflow)
- Implementing platform-agnostic ML pipeline patterns
- Setting up CI/CD automation for ML training jobs
- Creating reusable pipeline templates for teams
- Integrating with cloud ML services (SageMaker, Vertex AI, Azure ML)
## What This Skill Provides
### Core Capabilities
1. **Pipeline Architecture**
   - End-to-end workflow design
   - DAG orchestration patterns (Airflow, Dagster, Kubeflow)
   - Component dependencies and data flow
   - Error handling and retry strategies
2. **Data Preparation**
   - Data validation and quality checks
   - Feature engineering pipelines
   - Data versioning and lineage
   - Train/validation/test splitting strategies
3. **Model Training**
   - Training job orchestration
   - Hyperparameter management
   - Experiment tracking integration
   - Distributed training patterns
4. **Model Validation**
   - Validation frameworks and metrics
   - A/B testing infrastructure
   - Performance regression detection
   - Model comparison workflows
5. **Deployment Automation**
   - Model serving patterns
   - Canary deployments
   - Blue-green deployment strategies
   - Rollback mechanisms
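To make the "component dependencies" and "error handling and retry strategies" items above more concrete, here is a minimal, framework-free sketch using only the Python standard library. It is not the API of Airflow, Dagster, Kubeflow, or Prefect; each of those has its own way to declare dependencies and retries. The stage names `train`, `validate`, and `deploy`, and the helpers `run_with_retry`, `run_pipeline`, and `make_flaky`, are hypothetical names chosen for illustration.

```python
import time
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stage graph: each stage maps to the set of stages it depends on.
DEPENDENCIES: Dict[str, Set[str]] = {
    "data_ingestion": set(),
    "data_validation": {"data_ingestion"},
    "feature_engineering": {"data_validation"},
    "train": {"feature_engineering"},
    "validate": {"train"},
    "deploy": {"validate"},
}


def run_with_retry(task: Callable[[], str], retries: int = 2, delay_s: float = 0.0) -> str:
    """Call `task`, retrying up to `retries` extra times before re-raising."""
    for attempt in range(retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise  # retries exhausted: surface the failure to the caller
            time.sleep(delay_s * (2 ** attempt))  # exponential backoff
    raise AssertionError("unreachable")


def run_pipeline(
    dependencies: Dict[str, Set[str]],
    tasks: Dict[str, Callable[[], str]],
    retries: int = 2,
) -> Tuple[List[str], Dict[str, str]]:
    """Run every stage after its dependencies; return the order and the results."""
    order = list(TopologicalSorter(dependencies).static_order())
    results = {name: run_with_retry(tasks[name], retries=retries) for name in order}
    return order, results


def make_flaky(name: str, fail_times: int) -> Callable[[], str]:
    """Hypothetical task that fails `fail_times` times, then succeeds."""
    state = {"calls": 0}

    def task() -> str:
        state["calls"] += 1
        if state["calls"] <= fail_times:
            raise RuntimeError(f"{name} failed on attempt {state['calls']}")
        return f"{name}:ok"

    return task


TASKS = {name: make_flaky(name, 0) for name in DEPENDENCIES}
TASKS["train"] = make_flaky("train", 1)  # fails once, succeeds on the first retry

ORDER, RESULTS = run_pipeline(DEPENDENCIES, TASKS)
print(ORDER)
```

The graph here is a simple chain, so the execution order is fully determined. With branching graphs, `static_order()` still guarantees that a stage runs after its dependencies, but the relative order of independent stages is not something to rely on. A real orchestrator would additionally persist state, schedule runs, and execute independent stages in parallel.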
## Usage Patterns
### Basic Pipeline Setup
```python
# 1. Define pipeline stages
stages = [
    "data_ingestion",
    "data_validation",
    "feature_engineering",
    "model_t