Build event streaming and real-time data pipelines with Kafka, Pulsar, Redpanda, Flink, and Spark. Covers producer/consumer patterns, stream processing, event sourcing, and CDC across TypeScript, Python, Go, and Java. Use when building real-time systems, microservices communication, or data integration pipelines.
GitHub: ancoleman/ai-design-components
backend-ai-skills
February 1, 2026
npx add-skill https://github.com/ancoleman/ai-design-components/blob/main/skills/streaming-data/SKILL.md -a claude-code --skill streaming-data

Installation paths:
.claude/skills/streaming-data/

# Streaming Data Processing

Build production-ready event streaming systems and real-time data pipelines using modern message brokers and stream processors.

## When to Use This Skill

Use this skill when:

- Building event-driven architectures and microservices communication
- Processing real-time analytics, monitoring, or alerting systems
- Implementing data integration pipelines (CDC, ETL/ELT)
- Creating log or metrics aggregation systems
- Developing IoT platforms or high-frequency trading systems

## Core Concepts

### Message Brokers vs Stream Processors

**Message Brokers** (Kafka, Pulsar, Redpanda):

- Store and distribute event streams
- Provide durability, replay capability, and partitioning
- Handle producer/consumer coordination

**Stream Processors** (Flink, Spark, Kafka Streams):

- Transform and aggregate streaming data
- Provide windowing, joins, and stateful operations
- Execute complex event processing (CEP)

### Delivery Guarantees

**At-Most-Once**:

- Messages may be lost; no duplicates
- Lowest overhead
- Use for: metrics and logs where loss is acceptable

**At-Least-Once**:

- Messages are never lost but may be duplicated
- Moderate overhead; requires idempotent consumers
- Use for: most applications (the default choice)

**Exactly-Once**:

- Messages are never lost or duplicated
- Highest overhead; requires transactional processing
- Use for: financial transactions and critical state updates

## Quick Start Guide

### Step 1: Choose a Message Broker

See references/broker-selection.md for a detailed comparison.

**Quick decision**:

- **Apache Kafka**: Mature ecosystem, enterprise features, event sourcing
- **Redpanda**: Low latency, Kafka-compatible, simpler operations (no ZooKeeper)
- **Apache Pulsar**: Multi-tenancy, geo-replication, tiered storage
- **RabbitMQ**: Traditional message queues, RPC patterns

### Step 2: Choose a Stream Processor (if needed)

See references/processor-selection.md for a detailed comparison.
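The at-least-once guarantee above hinges on idempotent consumers: the broker may redeliver a message, so the consumer must detect and skip duplicates. A minimal Python sketch of the deduplication pattern, independent of any specific broker client (the `Event` and `IdempotentConsumer` names are illustrative, not from a library):

```python
# Sketch of an idempotent consumer for at-least-once delivery.
# Duplicated redeliveries are detected by tracking processed event IDs.
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str  # unique per logical event; reused on redelivery
    payload: int


class IdempotentConsumer:
    def __init__(self) -> None:
        self._seen: set[str] = set()  # in production: a durable store
        self.total = 0

    def handle(self, event: Event) -> bool:
        """Process the event at most once; return False for duplicates."""
        if event.event_id in self._seen:
            return False  # duplicate redelivery: skip side effects
        self._seen.add(event.event_id)
        self.total += event.payload
        return True


# At-least-once delivery may replay event "a":
consumer = IdempotentConsumer()
results = [consumer.handle(e) for e in
           [Event("a", 10), Event("b", 5), Event("a", 10)]]
# consumer.total is 15, not 25, despite the redelivery
```

In a real deployment the seen-ID set would live in a durable store (a database table or a compacted topic) and be written in the same transaction as the side effect, otherwise a crash between the two steps reintroduces duplicates.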
**Quick decision**:

- **Apache Flink**: Millisecond latency