Use when implementing rate limiting, throttling, or API quotas. Covers algorithms like token bucket and sliding window, plus distributed rate limiting patterns.
View on GitHubmelodic-software/claude-code-plugins
systems-design
plugins/systems-design/skills/rate-limiting-patterns/SKILL.md
January 21, 2026
Select agents to install to:
npx add-skill https://github.com/melodic-software/claude-code-plugins/blob/main/plugins/systems-design/skills/rate-limiting-patterns/SKILL.md -a claude-code --skill rate-limiting-patternsInstallation paths:
.claude/skills/rate-limiting-patterns/# Rate Limiting Patterns
Patterns for protecting APIs and services through rate limiting, throttling, and quota management.
## When to Use This Skill
- Implementing API rate limiting
- Choosing rate limiting algorithms
- Designing distributed rate limiting
- Setting up quota management
- Protecting against abuse
## Why Rate Limiting
```text
Protection against:
- DDoS attacks
- Brute force attempts
- Resource exhaustion
- Cost overruns (cloud APIs)
- Cascading failures
Business benefits:
- Fair resource allocation
- Predictable performance
- Cost control
- SLA enforcement
```
## Rate Limiting Algorithms
### Token Bucket
**Concept:** Tokens added at fixed rate, requests consume tokens
```text
Configuration:
- Bucket size (max tokens): 100
- Refill rate: 10 tokens/second
Behavior:
┌─────────────────────────┐
│ Bucket (capacity: 100) │
│ ████████████░░░░░░░░░░ │ 60 tokens available
└─────────────────────────┘
↑ ↓
10 tokens/s Request takes 1 token
Allows bursts up to bucket size, then rate-limited.
```
**Characteristics:**
- Allows controlled bursts
- Simple to implement
- Memory efficient
- Most common algorithm
**Implementation sketch:**
```text
token_bucket:
tokens = min(tokens + (now - last_update) * rate, capacity)
if tokens >= cost:
tokens -= cost
return ALLOW
return DENY
```
### Leaky Bucket
**Concept:** Requests queue and process at fixed rate
```text
┌─────────────────────────┐
│ Queue (capacity: 100) │
│ ██████████████████████ │ Requests waiting
└──────────┬──────────────┘
│
▼ Process at fixed rate (10/sec)
[Processing]
Smooths traffic to constant rate.
```
**Characteristics:**
- Smooth output rate
- No bursts allowed
- Requests may queue
- Good for downstream protection
### Fixed Window
**Concept:** Count requests in fixed time windows
```text
Window: 1 minute, Limit: 100 requests
|-------- Window 1 --------|-------- Window 2 --------|
95 requests