etl-patterns

verified

Production ETL patterns orchestrator. Routes to core reliability patterns and incremental load strategies.

Marketplace

majestic-marketplace

majesticlabs-dev/majestic-marketplace

Plugin

majestic-data

Repository

majesticlabs-dev/majestic-marketplace
19 stars

plugins/majestic-data/skills/etl-patterns/SKILL.md

Last Verified

January 24, 2026

Install Skill

npx add-skill https://github.com/majesticlabs-dev/majestic-marketplace/blob/main/plugins/majestic-data/skills/etl-patterns/SKILL.md -a claude-code --skill etl-patterns

Installation paths:

Claude
.claude/skills/etl-patterns/

Instructions

# ETL Patterns

Orchestrator for production-grade Extract-Transform-Load patterns.

## Skill Routing

| Need | Skill | Content |
|------|-------|---------|
| Reliability patterns | `etl-core-patterns` | Idempotency, checkpointing, error handling, chunking, retry, logging |
| Load strategies | `etl-incremental-patterns` | Backfill, timestamp-based, CDC, pipeline orchestration |

## Pattern Selection Guide

### By Reliability Need

| Need | Pattern | Skill |
|------|---------|-------|
| Repeatable runs | Idempotency | `etl-core-patterns` |
| Resume after failure | Checkpointing | `etl-core-patterns` |
| Handle bad records | Error handling + DLQ | `etl-core-patterns` |
| Memory management | Chunked processing | `etl-core-patterns` |
| Network resilience | Retry with backoff | `etl-core-patterns` |
| Observability | Structured logging | `etl-core-patterns` |
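
As an illustration of the "Retry with backoff" row, here is a minimal sketch of exponential backoff with jitter. The function name `retry_with_backoff`, its parameters, and the choice of retryable exceptions are assumptions made for this example; they are not taken from `etl-core-patterns`.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=0.5, max_delay=30.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func(); on a retryable error, wait and try again.

    The wait doubles on each attempt (base_delay * 2**attempt), is capped at
    max_delay, and gets up to 10% random jitter so parallel workers do not
    retry in lockstep. The last failure is re-raised unchanged.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter is injectable so the helper can be exercised in tests without real waiting. Only transient error types should be listed in `retry_on`; retrying on validation errors just delays the failure.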

### By Load Strategy

| Scenario | Pattern | Skill |
|----------|---------|-------|
| Small tables (<100K rows) | Full refresh | `etl-incremental-patterns` |
| Large tables | Timestamp incremental | `etl-incremental-patterns` |
| Real-time sync | CDC events | `etl-incremental-patterns` |
| Historical migration | Parallel backfill | `etl-incremental-patterns` |
| Zero-downtime refresh | Swap pattern | `etl-incremental-patterns` |
| Multi-step pipelines | Pipeline orchestration | `etl-incremental-patterns` |
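
The "Timestamp incremental" row can be sketched roughly as follows, using the standard-library `sqlite3` module for both source and target. The `customers` table, its columns, and the `incremental_load` helper are hypothetical names chosen for this example, and timestamps are assumed to be ISO-8601 strings so that string comparison matches chronological order.

```python
import sqlite3

# Hypothetical schema used only for this sketch.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS customers "
    "(id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
)


def incremental_load(source, target, watermark):
    """Copy rows changed after `watermark` and return the new watermark."""
    target.execute(SCHEMA)
    # Extract only rows modified since the last successful run.
    rows = source.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    # Keyed replace makes re-processing the same rows harmless.
    target.executemany(
        "INSERT OR REPLACE INTO customers (id, name, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    target.commit()
    # Advance the watermark to the newest timestamp seen, if any.
    return rows[-1][2] if rows else watermark
```

The returned watermark would normally be persisted between runs. A strict `>` comparison can miss rows that are written later with the same timestamp as the watermark, so some pipelines re-read a small overlap window and rely on the keyed replace to absorb the duplicates.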

## Quick Reference

### Idempotency Options

```python
# Small datasets: Delete-then-insert
# Large datasets: UPSERT on conflict
# Change detection: Row hash comparison
```
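
To make the UPSERT and row-hash options above more concrete, here is one possible sketch using the standard-library `sqlite3` and `hashlib` modules. The `products` table and the helpers `row_hash` and `upsert_rows` are hypothetical, and the `ON CONFLICT ... DO UPDATE` syntax assumes SQLite 3.24 or newer; other databases spell UPSERT differently.

```python
import hashlib
import sqlite3


def row_hash(row):
    """Stable SHA-256 over the non-key columns, used to detect changed rows."""
    payload = "|".join(str(row[key]) for key in sorted(row) if key != "id")
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def upsert_rows(conn, rows):
    """Insert new rows, update changed rows, skip unchanged ones.

    Returns the number of rows written. Running it twice with the same
    input writes nothing the second time, which is what makes it idempotent.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products "
        "(id INTEGER PRIMARY KEY, name TEXT, price REAL, row_hash TEXT)"
    )
    written = 0
    for row in rows:
        digest = row_hash(row)
        existing = conn.execute(
            "SELECT row_hash FROM products WHERE id = ?", (row["id"],)
        ).fetchone()
        if existing and existing[0] == digest:
            continue  # unchanged since the last load
        conn.execute(
            "INSERT INTO products (id, name, price, row_hash) VALUES (?, ?, ?, ?) "
            "ON CONFLICT(id) DO UPDATE SET name = excluded.name, "
            "price = excluded.price, row_hash = excluded.row_hash",
            (row["id"], row["name"], row["price"], digest),
        )
        written += 1
    conn.commit()
    return written
```

Comparing hashes before writing keeps unchanged rows from generating needless updates, which matters when downstream consumers react to every write.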

### Load Strategy Decision

```
Is table < 100K rows?
  → Full refresh

Has reliable timestamp column?
  → Timestamp incremental

Source supports CDC?
  → CDC event processing

Need zero downtime?
  → Swap pattern (temp table → rename)

One-time historical load?
  → Parallel backfill with date ranges
```
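
The checklist above could be encoded as a small helper like the one below. The function name and parameters are hypothetical, and evaluating the questions top to bottom with the first match winning is an assumption of this sketch; the source does not say how to resolve cases where several answers are "yes".

```python
def choose_load_strategy(row_count, has_reliable_timestamp=False,
                         supports_cdc=False, needs_zero_downtime=False,
                         one_time_historical=False):
    """Return the first matching strategy, or None if no rule applies."""
    if row_count < 100_000:
        return "full refresh"
    if has_reliable_timestamp:
        return "timestamp incremental"
    if supports_cdc:
        return "CDC event processing"
    if needs_zero_downtime:
        return "swap pattern"
    if one_time_historical:
        return "parallel backfill"
    return None  # no rule matched; the checklist gives no default


# Example: a 5M-row table with a trustworthy updated_at column.
print(choose_load_strategy(5_000_000, has_reliable_timestamp=True))
```

In practice some of these answers combine, for example a full refresh delivered through a swap, so the single return value is a simplification.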

## Common Pipeline Structure

```python
# 1. Setup
checkpoint = Checkpoint('.etl_checkpoint.json')
processor = ETLProcessor()

# 2. Extract (wi
```
