streaming-data

verified

Build event streaming and real-time data pipelines with Kafka, Pulsar, Redpanda, Flink, and Spark. Covers producer/consumer patterns, stream processing, event sourcing, and CDC across TypeScript, Python, Go, and Java. Use when building real-time systems, microservices communication, or data integration pipelines.

Plugin: backend-ai-skills
Repository: ancoleman/ai-design-components (153 stars)
Path: skills/streaming-data/SKILL.md
Last Verified: February 1, 2026

Install Skill

npx add-skill https://github.com/ancoleman/ai-design-components/blob/main/skills/streaming-data/SKILL.md -a claude-code --skill streaming-data

Installs to `.claude/skills/streaming-data/` (Claude Code).

Instructions

# Streaming Data Processing

Build production-ready event streaming systems and real-time data pipelines using modern message brokers and stream processors.

## When to Use This Skill

Use this skill when:
- Building event-driven architectures and microservices communication
- Processing real-time analytics, monitoring, or alerting systems
- Implementing data integration pipelines (CDC, ETL/ELT)
- Creating log or metrics aggregation systems
- Developing IoT platforms or high-frequency trading systems

## Core Concepts

### Message Brokers vs Stream Processors

**Message Brokers** (Kafka, Pulsar, Redpanda):
- Store and distribute event streams
- Provide durability, replay capability, partitioning
- Handle producer/consumer coordination

**Stream Processors** (Flink, Spark, Kafka Streams):
- Transform and aggregate streaming data
- Provide windowing, joins, stateful operations
- Execute complex event processing (CEP)
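To make windowing concrete, here is a minimal pure-Python sketch of the tumbling-window aggregation that engines like Flink or Kafka Streams perform at scale; the event shape and window size are illustrative assumptions, not part of any engine's API:

```python
from collections import defaultdict

def tumbling_window_counts(events, window_ms):
    """Group (timestamp_ms, key) events into fixed, non-overlapping
    windows and count occurrences per key in each window."""
    windows = defaultdict(lambda: defaultdict(int))
    for ts, key in events:
        window_start = (ts // window_ms) * window_ms  # align to window boundary
        windows[window_start][key] += 1
    return {start: dict(counts) for start, counts in windows.items()}

events = [(1000, "click"), (1500, "view"), (2500, "click"), (3100, "click")]
result = tumbling_window_counts(events, window_ms=1000)
# window 1000 holds two events; windows 2000 and 3000 hold one each
```

Real stream processors add what this sketch omits: event-time watermarks for late data, overlapping (sliding/session) windows, and fault-tolerant state.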

### Delivery Guarantees

**At-Most-Once**:
- Messages may be lost, no duplicates
- Lowest overhead
- Use for: Metrics, logs where loss is acceptable

**At-Least-Once**:
- Messages never lost, may have duplicates
- Moderate overhead, requires idempotent consumers
- Use for: Most applications (default choice)

**Exactly-Once**:
- Messages never lost or duplicated
- Highest overhead, requires transactional processing
- Use for: Financial transactions, critical state updates
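Since at-least-once delivery is the usual default, the consumer must tolerate redelivery. A sketch of idempotent consumption via message-ID deduplication; the in-memory set is a stand-in assumption for a durable store (e.g. a database table keyed by message ID):

```python
class IdempotentConsumer:
    """Turn at-least-once delivery into exactly-once *processing* by
    recording which message IDs have already been handled."""

    def __init__(self, handler):
        self.handler = handler
        self._seen = set()  # production: persist this, e.g. a DB keyed by ID

    def consume(self, message_id, payload):
        if message_id in self._seen:
            return False  # duplicate delivery: skip side effects
        self.handler(payload)
        self._seen.add(message_id)  # record only after successful handling
        return True

processed = []
consumer = IdempotentConsumer(processed.append)
consumer.consume("m1", {"amount": 10})
consumer.consume("m1", {"amount": 10})  # redelivery is ignored
consumer.consume("m2", {"amount": 5})
# processed == [{"amount": 10}, {"amount": 5}]
```

Recording the ID only after the handler succeeds means a crash mid-processing leads to reprocessing, not loss, which is exactly the at-least-once contract.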

## Quick Start Guide

### Step 1: Choose a Message Broker

See references/broker-selection.md for detailed comparison.

**Quick decision**:
- **Apache Kafka**: Mature ecosystem, enterprise features, event sourcing
- **Redpanda**: Low latency, Kafka-compatible, simpler operations (no ZooKeeper)
- **Apache Pulsar**: Multi-tenancy, geo-replication, tiered storage
- **RabbitMQ**: Traditional message queues, RPC patterns
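Because Redpanda is wire-compatible with Kafka, one client configuration covers both. A sketch of how producer settings map onto the delivery guarantees above, using librdkafka-style config keys (as used by the confluent-kafka Python client); the broker address and transactional ID are placeholders:

```python
BROKER = "localhost:9092"  # placeholder; works for Kafka or Redpanda

# At-most-once: fire-and-forget, no broker acknowledgement.
at_most_once = {
    "bootstrap.servers": BROKER,
    "acks": "0",
}

# At-least-once: wait for all in-sync replicas; idempotence lets the
# broker deduplicate producer-side retries.
at_least_once = {
    "bootstrap.servers": BROKER,
    "acks": "all",
    "enable.idempotence": True,
}

# Exactly-once: adds transactions so writes (and consumed offsets)
# commit atomically.
exactly_once = {
    "bootstrap.servers": BROKER,
    "acks": "all",
    "enable.idempotence": True,
    "transactional.id": "orders-producer-1",  # placeholder ID
}
```

These dicts would be passed straight to the client's `Producer` constructor; the exactly-once variant additionally requires wrapping sends in begin/commit transaction calls.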

### Step 2: Choose a Stream Processor (if needed)

See references/processor-selection.md for detailed comparison.

**Quick decision**:
- **Apache Flink**: Millisecond latency

Validation Details

Checks: Front Matter, Required Fields, Valid Name Format, Valid Description, Has Sections, Allowed Tools.
Instruction Length: 11005 chars