Strategic guidance for operationalizing machine learning models from experimentation to production. Covers experiment tracking (MLflow, Weights & Biases), model registry and versioning, feature stores (Feast, Tecton), model serving patterns (Seldon, KServe, BentoML), ML pipeline orchestration (Kubeflow, Airflow), and model monitoring (drift detection, observability). Use when designing ML infrastructure, selecting MLOps platforms, implementing continuous training pipelines, or establishing model governance.
Install with `npx add-skill https://github.com/ancoleman/ai-design-components/blob/main/skills/implementing-mlops/SKILL.md -a claude-code --skill implementing-mlops`. Installation path: `.claude/skills/implementing-mlops/`

# MLOps Patterns

Operationalize machine learning models from experimentation to production deployment and monitoring.

## Purpose

Provide strategic guidance for ML engineers and platform teams to build production-grade ML infrastructure. Cover the complete lifecycle: experiment tracking, model registry, feature stores, deployment patterns, pipeline orchestration, and monitoring.

## When to Use This Skill

Use this skill when:

- Designing MLOps infrastructure for production ML systems
- Selecting experiment tracking platforms (MLflow, Weights & Biases, Neptune)
- Implementing feature stores for online/offline feature serving
- Choosing model serving solutions (Seldon Core, KServe, BentoML, TorchServe)
- Building ML pipelines for training, evaluation, and deployment
- Setting up model monitoring and drift detection
- Establishing model governance and compliance frameworks
- Optimizing ML inference costs and performance
- Migrating from notebooks to production ML systems
- Implementing continuous training and automated retraining

## Core Concepts

### 1. Experiment Tracking

Track experiments systematically to ensure reproducibility and collaboration.

**Key Components:**

- Parameters: Hyperparameters logged for each training run
- Metrics: Performance measures tracked over time (accuracy, loss, F1)
- Artifacts: Model weights, plots, datasets, configuration files
- Metadata: Tags, descriptions, Git commit SHA, environment details

(A minimal logging sketch covering these components appears at the end of this section.)

**Platform Comparison:**

**MLflow** (open-source standard):

- Framework-agnostic (PyTorch, TensorFlow, scikit-learn, XGBoost)
- Self-hosted or cloud-agnostic deployment
- Integrated model registry
- Basic UI, adequate for most use cases
- Free, but requires infrastructure management

**Weights & Biases** (SaaS, collaboration-focused):

- Advanced visualization and dashboards
- Integrated hyperparameter optimization (Sweeps)
- Excellent team collaboration features
- SaaS pricing scales with usage
- Best-in-class UI

**Neptune.ai** (Enterprise
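
To make the Key Components above concrete, here is a minimal MLflow sketch that logs parameters, metrics, tags, and a model artifact for one run. The tracking URI, experiment name, run name, tag values, and the scikit-learn model are placeholder assumptions, not part of this skill; Weights & Biases and Neptune expose the same concepts through their own logging APIs.

```python
# Minimal experiment-tracking sketch with MLflow.
# Assumptions: an MLflow tracking server is reachable at TRACKING_URI, and the
# dataset/model below stand in for your own training code.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

TRACKING_URI = "http://localhost:5000"   # assumption: point at your MLflow server
mlflow.set_tracking_uri(TRACKING_URI)
mlflow.set_experiment("churn-model")     # hypothetical experiment name

params = {"n_estimators": 200, "max_depth": 8, "random_state": 42}

X, y = make_classification(n_samples=5_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

with mlflow.start_run(run_name="baseline-rf"):
    # Parameters: hyperparameters logged for this training run
    mlflow.log_params(params)

    model = RandomForestClassifier(**params).fit(X_train, y_train)
    preds = model.predict(X_test)

    # Metrics: performance measures tracked per run
    mlflow.log_metrics({
        "accuracy": accuracy_score(y_test, preds),
        "f1": f1_score(y_test, preds),
    })

    # Metadata: tags such as Git commit SHA and environment details
    mlflow.set_tags({"git_sha": "abc1234", "stage": "experiment"})  # placeholder values

    # Artifacts: the trained model (plots, configs, and datasets are logged the same way)
    mlflow.sklearn.log_model(model, "model")
```

Logging through the run context keeps parameters, metrics, and artifacts attached to a single run ID, which is what makes later comparison and reproduction of experiments possible.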