ClickHouse database patterns, query optimization, analytics, and data engineering best practices for high-performance analytical workloads.
View on GitHubysyecust/everything-claude-code
everything-claude-code
skills/clickhouse-io/SKILL.md
January 25, 2026
Select agents to install to:
npx add-skill https://github.com/ysyecust/everything-claude-code/blob/main/skills/clickhouse-io/SKILL.md -a claude-code --skill clickhouse-ioInstallation paths:
.claude/skills/clickhouse-io/# ClickHouse Analytics Patterns
ClickHouse-specific patterns for high-performance analytics and data engineering.
## Overview
ClickHouse is a column-oriented database management system (DBMS) for online analytical processing (OLAP). It's optimized for fast analytical queries on large datasets.
**Key Features:**
- Column-oriented storage
- Data compression
- Parallel query execution
- Distributed queries
- Real-time analytics
## Table Design Patterns
### MergeTree Engine (Most Common)
```sql
CREATE TABLE markets_analytics (
date Date,
market_id String,
market_name String,
volume UInt64,
trades UInt32,
unique_traders UInt32,
avg_trade_size Float64,
created_at DateTime
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(date)
ORDER BY (date, market_id)
SETTINGS index_granularity = 8192;
```
### ReplacingMergeTree (Deduplication)
```sql
-- For data that may have duplicates (e.g., from multiple sources)
CREATE TABLE user_events (
event_id String,
user_id String,
event_type String,
timestamp DateTime,
properties String
) ENGINE = ReplacingMergeTree()
PARTITION BY toYYYYMM(timestamp)
ORDER BY (user_id, event_id, timestamp)
PRIMARY KEY (user_id, event_id);
```
### AggregatingMergeTree (Pre-aggregation)
```sql
-- For maintaining aggregated metrics
CREATE TABLE market_stats_hourly (
hour DateTime,
market_id String,
total_volume AggregateFunction(sum, UInt64),
total_trades AggregateFunction(count, UInt32),
unique_users AggregateFunction(uniq, String)
) ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(hour)
ORDER BY (hour, market_id);
-- Query aggregated data
SELECT
hour,
market_id,
sumMerge(total_volume) AS volume,
countMerge(total_trades) AS trades,
uniqMerge(unique_users) AS users
FROM market_stats_hourly
WHERE hour >= toStartOfHour(now() - INTERVAL 24 HOUR)
GROUP BY hour, market_id
ORDER BY hour DESC;
```
## Query Optimization Patterns
### Efficient Filtering
```sql
-- ✅