Detect anomalies in data using statistical and ML methods: Z-score, IQR, Isolation Forest, and time-series techniques.
GitHub: majesticlabs-dev/majestic-marketplace
majestic-data
January 24, 2026
npx add-skill https://github.com/majesticlabs-dev/majestic-marketplace/blob/main/plugins/majestic-data/skills/anomaly-detector/SKILL.md -a claude-code --skill anomaly-detector
Installation paths: `.claude/skills/anomaly-detector/`
# Anomaly Detector
**Audience:** Data engineers and analysts detecting outliers in datasets.
**Goal:** Provide production-ready anomaly detection functions for various data types.
## Scripts
Execute detection functions from `scripts/anomaly_detection.py`:
```python
from scripts.anomaly_detection import (
    detect_anomalies_zscore,
    detect_anomalies_iqr,
    detect_anomalies_modified_zscore,
    detect_anomalies_isolation_forest,
    detect_anomalies_lof,
    detect_anomalies_rolling,
    detect_anomalies_stl,
    detect_anomalies_ensemble,
)
```
## Method Selection
| Method | Best For | Limitations |
|--------|----------|-------------|
| Z-Score | Normal distributions | Mean and standard deviation are inflated by the outliers themselves |
| IQR | Skewed distributions | Less sensitive overall |
| Modified Z-Score | Robust detection | Slower computation |
| Isolation Forest | High-dimensional data | Requires tuning |
| LOF | Local density anomalies | Computationally expensive |
| Rolling | Time-series with trends | Window size sensitive |
| STL | Seasonal time-series | Requires known period |
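
To make the trade-off between the Z-Score and Modified Z-Score rows concrete, here is a minimal standard-library sketch. It is not the code in `scripts/anomaly_detection.py`: the helper names are hypothetical, and the `0.6745` constant and `3.5` cutoff are the conventional values for the median/MAD variant, assumed here for illustration.

```python
import statistics


def zscore_flags(values, threshold=3.0):
    """Flag points whose |z| exceeds threshold, using mean and sample std."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)
    return [abs(v - mean) / std > threshold for v in values]


def modified_zscore_flags(values, threshold=3.5):
    """Flag points using the median and the median absolute deviation (MAD)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # Degenerate case: more than half the values are identical.
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


sample = [10, 11, 9, 10, 12, 10, 11, 9, 10, 95]

# The outlier inflates the standard deviation, so its own z-score is only
# about 2.84 and it slips under the 3.0 threshold.
print(zscore_flags(sample))

# Median and MAD barely move, so 95 is flagged.
print(modified_zscore_flags(sample))
```

With only ten points, a single extreme value cannot reach a z-score of 3 when the sample standard deviation is used, which is one reason the robust variant is often preferred for small samples.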
## Usage Examples
### Single Column Detection
```python
import pandas as pd
from scripts.anomaly_detection import detect_anomalies_zscore, detect_anomalies_iqr
df = pd.read_csv('data.csv')
# Z-score method (good for normal distributions)
anomalies_z = detect_anomalies_zscore(df['value'], threshold=3.0)
# IQR method (robust to skewed data)
anomalies_iqr = detect_anomalies_iqr(df['value'], multiplier=1.5)
print(f"Z-score found {anomalies_z.sum()} anomalies")
print(f"IQR found {anomalies_iqr.sum()} anomalies")
```
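
For readers wondering what `multiplier=1.5` refers to, the sketch below shows the usual IQR fence rule in plain Python. It is an assumption that `detect_anomalies_iqr` follows this convention; the helper name `iqr_fences` is hypothetical, and the quartiles use linear interpolation (`method='inclusive'`).

```python
import statistics


def iqr_fences(values, multiplier=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr


values = [12, 15, 14, 13, 16, 15, 14, 13, 15, 14, 48]
lower, upper = iqr_fences(values)

# Anything outside the fences counts as an anomaly.
flagged = [v for v in values if v < lower or v > upper]
print(lower, upper, flagged)  # 11.25 17.25 [48]
```

A larger multiplier widens the fences and flags fewer points; `3.0` is a common choice when only extreme values should be reported.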
### Multi-Column with Isolation Forest
```python
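from scripts.anomaly_detection import detect_anomalies_isolation_forest

# Reuses `df` loaded in the previous example
numeric_cols = ['revenue', 'quantity', 'price']
# contamination: the fraction of rows expected to be anomalous (here 1%)
anomalies = detect_anomalies_isolation_forest(df, numeric_cols, contamination=0.01)
df_anomalies = df[anomalies]  # assumes the result is a boolean mask aligned with df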
```
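
The internals of `detect_anomalies_isolation_forest` are not shown here. As a rough sketch of how such a wrapper might work, the example below calls scikit-learn's `IsolationForest` directly on synthetic data. The data, the variable names, and the choice of scikit-learn are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data: 200 ordinary rows plus two rows far from the cluster.
rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 3))
outliers = np.array([[50.0, 50.0, 50.0], [-60.0, 40.0, -45.0]])
X = np.vstack([normal, outliers])

# contamination sets the score threshold so that roughly 1% of the
# training rows are labeled as anomalies.
model = IsolationForest(contamination=0.01, random_state=0)
labels = model.fit_predict(X)  # 1 = inlier, -1 = anomaly

# Convert to a boolean mask, matching the style of the examples above.
mask = labels == -1
print(f"Isolation Forest flagged {mask.sum()} rows")
```

Because `contamination` fixes the fraction of flagged rows in advance, the model reports about that share even when the data contains no real anomalies, so the value is worth checking against domain knowledge.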
### Ensemble Approach (Recommended)
```python
from scripts.anomaly_detection import detect_anomalies_ensemble