Prepares ML models for production deployment with containerization, API creation, monitoring setup, and A/B testing. Activates for "deploy model", "production deployment", "model API", "containerize model", "docker ml", "serving ml model", "model monitoring", "A/B test model". Generates deployment artifacts and ensures models are production-ready with monitoring, versioning, and rollback capabilities.
View on GitHub: anton-abyzov/specweave
sw-ml
January 25, 2026
npx add-skill https://github.com/anton-abyzov/specweave/blob/main/plugins/specweave-ml/skills/ml-deployment-helper/SKILL.md -a claude-code --skill ml-deployment-helper
Installation paths:
.claude/skills/ml-deployment-helper/
# ML Deployment Helper
## Overview
Bridges the gap between trained models and production systems. Generates deployment artifacts, APIs, monitoring, and A/B testing infrastructure following MLOps best practices.
## Deployment Checklist
Before deploying any model, this skill ensures:
- ✅ Model versioned and tracked
- ✅ Dependencies documented (requirements.txt/Dockerfile)
- ✅ API endpoint created
- ✅ Input validation implemented
- ✅ Monitoring configured
- ✅ A/B testing ready
- ✅ Rollback plan documented
- ✅ Performance benchmarked
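The "A/B testing ready" and "Rollback plan documented" items can be illustrated with deterministic traffic splitting. The following is a minimal sketch using only the standard library, not the skill's own implementation; the function name `assign_variant`, the `salt` value, and the variant labels are hypothetical. Hashing the user ID keeps each user on the same variant across requests, and setting the candidate fraction to 0 sends all traffic back to the current model, which is one simple way to roll back.

```python
import hashlib


def assign_variant(user_id: int, candidate_fraction: float, salt: str = "model-ab") -> str:
    """Deterministically route a user to 'control' or 'candidate'.

    Hypothetical helper: the names and labels here are illustrative only.
    """
    if not 0.0 <= candidate_fraction <= 1.0:
        raise ValueError("candidate_fraction must be between 0 and 1")
    # Hash the salted user ID so the assignment is stable across requests
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 8 hex digits to a bucket in [0, 1)
    bucket = int(digest[:8], 16) / 0x100000000
    return "candidate" if bucket < candidate_fraction else "control"
```

With `candidate_fraction=0.1`, roughly one user in ten would be served by the new model; changing the salt reshuffles the assignment for a new experiment.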
## Deployment Patterns
### Pattern 1: REST API (FastAPI)
```python
from specweave import create_model_api

# Generates production-ready API
api = create_model_api(
    model_path="models/model-v3.pkl",
    increment="0042",
    framework="fastapi"
)

# Creates:
# - api/
#   ├── main.py (FastAPI app)
#   ├── models.py (Pydantic schemas)
#   ├── predict.py (Prediction logic)
#   ├── Dockerfile
#   ├── requirements.txt
#   └── tests/
```
Generated `main.py`:
```python
from datetime import datetime

import joblib
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="Recommendation Model API", version="0042-v3")

# Load the model once at import time so every request reuses it
model = joblib.load("model-v3.pkl")


class PredictionRequest(BaseModel):
    user_id: int
    context: dict


@app.post("/predict")
async def predict(request: PredictionRequest):
    try:
        # request.dict() is the Pydantic v1 spelling; v2 prefers model_dump()
        prediction = model.predict([request.dict()])
        return {
            "recommendations": prediction.tolist(),
            "model_version": "0042-v3",
            "timestamp": datetime.now(),
        }
    except Exception as e:
        # Surface any prediction failure as a 500 with the error message
        raise HTTPException(status_code=500, detail=str(e))


@app.get("/health")
async def health():
    return {"status": "healthy", "model_loaded": model is not None}
```
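The handler above passes the request dict straight to `model.predict`. Many scikit-learn-style estimators expect a 2D numeric array instead, so unless the pickled model is a pipeline that accepts dicts, a conversion step may be needed. The sketch below shows one possible shape for that step. The feature names in `FEATURE_ORDER`, the helper `to_feature_row`, and the choice to pass `user_id` as the first column are all assumptions made for illustration, not part of the generated code.

```python
from typing import List, Mapping, Sequence

# Hypothetical feature order; the real one must match the order used in training
FEATURE_ORDER = ("hour_of_day", "device_score")


def to_feature_row(
    user_id: int,
    context: Mapping[str, float],
    feature_order: Sequence[str] = FEATURE_ORDER,
) -> List[float]:
    """Turn a validated request into one numeric row for model.predict."""
    # Fail early, naming the missing keys, instead of failing inside the model
    missing = [name for name in feature_order if name not in context]
    if missing:
        raise ValueError(f"missing context features: {missing}")
    # Assumption: user_id is used as the first feature column
    return [float(user_id)] + [float(context[name]) for name in feature_order]
```

Inside the handler, the call would then become something like `model.predict([to_feature_row(request.user_id, request.context)])`, with the `ValueError` mapped to a 4xx response rather than a 500.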
### Pattern 2: Batch Prediction
```python
from specweave import create_batch_predictor

# For offline scoring
batch_predictor = create_batch_predictor(
    model_path="models/model-v3.pkl",
    increment="0042",
    input_path=