# Drift Detection

Monitor LLM quality degradation and input/output distribution shifts in production.

## Overview

- Detecting input distribution drift (data drift)
- Monitoring output quality degradation (concept drift)
- Implementing statistical methods (PSI, the Kolmogorov-Smirnov (KS) test, KL divergence)
- Setting up dynamic thresholds with moving averages
- Integrating Langfuse scores with drift analysis

## Quick Reference

### Population Stability Index (PSI)

```python
import numpy as np

def calculate_psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """
    Calculate Population Stability Index.

    Thresholds:
    - PSI < 0.1: No significant drift
    - 0.1 <= PSI < 0.25: Moderate drift, investigate
    - PSI >= 0.25: Significant drift, action needed
    """
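    # Use one set of bin edges for both samples so the bins are comparable;
    # separate np.histogram calls would each pick edges from their own range.
    edges = np.histogram_bin_edges(np.concatenate([expected, actual]), bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)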

    # Avoid division by zero
    expected_pct = np.clip(expected_pct, 0.0001, None)
    actual_pct = np.clip(actual_pct, 0.0001, None)

    psi = np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct))
    return psi

# Usage (baseline_scores, current_scores, and alert() are placeholders defined elsewhere)
psi_score = calculate_psi(baseline_scores, current_scores)
if psi_score >= 0.25:
    alert("Significant quality drift detected!")
```
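The Overview also lists the KS test and KL divergence, which have no snippet here. The following is a minimal, dependency-free sketch of both, not a definitive implementation. The function names `kl_divergence` and `ks_statistic` and the `eps` floor are assumptions made for illustration. The KL function expects binned proportions over the same bin edges, as in the PSI code above.

```python
import math
from bisect import bisect_right


def kl_divergence(p: list[float], q: list[float], eps: float = 1e-4) -> float:
    """KL(p || q) for two binned distributions over the same bins."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    # Floor each proportion at eps (as the PSI code does) to avoid log(0).
    p = [max(x, eps) for x in p]
    q = [max(x, eps) for x in q]
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def ks_statistic(baseline: list[float], current: list[float]) -> float:
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    a, b = sorted(baseline), sorted(current)
    d = 0.0
    for x in a + b:  # the gap can only peak at an observed value
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

PSI is the symmetrized form of KL divergence: `kl_divergence(actual, expected) + kl_divergence(expected, actual)` equals the PSI sum over the same bins. The KS sketch returns only the statistic; if SciPy is available, `scipy.stats.ks_2samp` also reports a p-value.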

### EWMA Dynamic Threshold

```python
import numpy as np


class EWMADriftDetector:
    """Exponentially weighted moving average (EWMA) for drift detection."""

    def __init__(self, lambda_param: float = 0.2, L: float = 3.0):
        self.lambda_param = lambda_param  # Smoothing factor
        self.L = L  # Control limit multiplier
        self.ewma = None

    def update(self, value: float, baseline_mean: float, baseline_std: float) -> dict:
        if self.ewma is None:
            self.ewma = value
        else:
            self.ewma = self.lambda_param * value + (1 - self.lambda_param) * self.ewma

        # Calculate control limits
        factor = np.sqrt(self.lambda_param / (2 - self.lambda_param))
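        # NOTE: the source is truncated after "ucl = base". The rest of this method
        # is an assumed completion using the standard EWMA control-chart limits;
        # the returned keys are illustrative.
        ucl = baseline_mean + self.L * baseline_std * factor
        lcl = baseline_mean - self.L * baseline_std * factor
        return {
            "ewma": self.ewma,
            "ucl": ucl,
            "lcl": lcl,
            "drift_detected": not (lcl <= self.ewma <= ucl),
        }
```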
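To show how the EWMA update rule behaves on a stream of scores, here is a self-contained sketch. The helper `ewma_flags` is hypothetical. It assumes the standard EWMA control limits, `baseline_mean ± L * baseline_std * sqrt(lambda / (2 - lambda))`, which the truncated source does not confirm. The scores are made up for illustration.

```python
import math


def ewma_flags(values, baseline_mean, baseline_std, lam=0.2, L=3.0):
    """Return (ewma, out_of_control) per observation.

    Hypothetical helper: mirrors EWMADriftDetector's update rule, with the
    assumed limits baseline_mean +/- L * baseline_std * sqrt(lam / (2 - lam)).
    """
    half_width = L * baseline_std * math.sqrt(lam / (2 - lam))
    lcl, ucl = baseline_mean - half_width, baseline_mean + half_width
    ewma = None
    results = []
    for value in values:
        # The first observation seeds the average, as in EWMADriftDetector.update.
        ewma = value if ewma is None else lam * value + (1 - lam) * ewma
        results.append((ewma, not (lcl <= ewma <= ucl)))
    return results


# Illustrative (made-up) quality scores: stable at first, then degrading.
scores = [0.80, 0.81, 0.79, 0.80, 0.70, 0.68, 0.66, 0.65]
for ewma, flagged in ewma_flags(scores, baseline_mean=0.80, baseline_std=0.02):
    print(f"ewma={ewma:.3f} drift={flagged}")
```

With these numbers the limits are roughly 0.78 and 0.82, and the fifth observation is the first to be flagged. A smaller `lam` smooths more and reacts more slowly; a larger `L` widens the limits and reduces false alarms at the cost of later detection.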