model-deployment (verified)

Deploy ML models with FastAPI, Docker, and Kubernetes. Use for serving predictions, containerization, monitoring, and drift detection, or when troubleshooting latency issues, health check failures, or version conflicts.

Marketplace: claude-skills (secondsky/claude-skills)
Plugin: model-deployment (tag: ai)

Repository: secondsky/claude-skills (28 stars)

plugins/model-deployment/skills/model-deployment/SKILL.md

Last Verified: January 24, 2026

Install:
npx add-skill https://github.com/secondsky/claude-skills/blob/main/plugins/model-deployment/skills/model-deployment/SKILL.md -a claude-code --skill model-deployment

Installation path (Claude): .claude/skills/model-deployment/

Instructions

# ML Model Deployment

Deploy trained models to production with proper serving and monitoring.

## Deployment Options

| Method | Use Case | Latency |
|--------|----------|---------|
| REST API | Web services | Medium |
| Batch | Large-scale processing | N/A |
| Streaming | Real-time | Low |
| Edge | On-device | Very low |
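
Batch deployment typically iterates over the dataset in fixed-size chunks rather than serving one request at a time. A minimal chunking helper, sketched with stdlib Python only (the name `batched` is illustrative, not from any framework):

```python
def batched(items, batch_size):
    """Yield successive fixed-size chunks of `items` for batch scoring."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# Usage sketch: score a large dataset chunk by chunk
# for chunk in batched(rows, 512):
#     model.predict(chunk)
```

Chunking bounds memory use and lets each batch amortize model overhead, which is why batch jobs trade latency for throughput.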

## FastAPI Model Server

```python
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

app = FastAPI()
model = joblib.load('model.pkl')

class PredictionRequest(BaseModel):
    features: list[float]

class PredictionResponse(BaseModel):
    prediction: float
    probability: float

@app.get('/health')
def health():
    return {'status': 'healthy'}

@app.post('/predict', response_model=PredictionResponse)
def predict(request: PredictionRequest):
    # scikit-learn estimators expect a 2-D array: one row per sample
    features = np.array(request.features).reshape(1, -1)
    prediction = model.predict(features)[0]
    # assumes a classifier exposing predict_proba; take the top class probability
    probability = model.predict_proba(features)[0].max()
    # cast NumPy scalars to plain floats for JSON serialization
    return PredictionResponse(prediction=float(prediction),
                              probability=float(probability))
```
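
The `reshape(1, -1)` call matters: scikit-learn estimators expect a 2-D array with one row per sample, so the flat feature list from the request must become a single-row matrix before `predict` is called. A quick illustration (the feature values are arbitrary placeholders):

```python
import numpy as np

features = [5.1, 3.5, 1.4, 0.2]        # one sample's features, as a flat list
X = np.array(features).reshape(1, -1)  # shape (1, 4): one row, n feature columns
```

Passing the 1-D array directly would raise an error in most scikit-learn versions, since a single sample is ambiguous with a single feature column.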

## Docker Deployment

```dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY model.pkl .
COPY app.py .

EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```
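
The Dockerfile above copies a `requirements.txt` that the listing does not show; a minimal sketch to pair with it (pin exact versions to whatever the model was trained with — these entries are placeholders):

```text
fastapi
uvicorn[standard]
scikit-learn
joblib
numpy
pydantic
```

Pinning `scikit-learn` to the training-time version is especially important, since pickled models are not guaranteed to load across versions.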

## Model Monitoring

```python
from datetime import datetime

class ModelMonitor:
    def __init__(self):
        self.predictions = []
        self.latencies = []

    def log_prediction(self, input_data, prediction, latency):
        self.latencies.append(latency)
        self.predictions.append({
            'input': input_data,
            'prediction': prediction,
            'latency': latency,
            'timestamp': datetime.now()
        })

    def detect_drift(self, reference_distribution, threshold=0.1):
        # Simple drift check: flag when the mean prediction shifts
        # beyond `threshold` from the reference distribution's mean.
        if not self.predictions:
            return False
        current = [p['prediction'] for p in self.predictions]
        current_mean = sum(current) / len(current)
        reference_mean = sum(reference_distribution) / len(reference_distribution)
        return abs(current_mean - reference_mean) > threshold
```
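
The latencies the monitor collects are usually reported as percentiles (p50/p95/p99) rather than averages, since tail latency is what users notice. A stdlib-only nearest-rank sketch (the helper name is illustrative):

```python
import math

def latency_percentile(latencies, pct):
    """Nearest-rank percentile: sort, then take the ceil(pct/100 * n)-th value."""
    ordered = sorted(latencies)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]
```

For production dashboards, a metrics library with histogram support (e.g. a Prometheus client) would replace this, but the nearest-rank method is what such histograms approximate.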

## Deployment Checklist

- [ ] Model validated on test set
- [ ] API endpoints documented
- [ ] Health check endpoint
- [ ] Authentication configured
- [ ] Logging and monitoring enabled

Validation Details

Front matter, required fields, valid name format, valid description, has sections, allowed tools.
Instruction length: 8609 chars