January 22, 2026
Install:
npx add-skill https://github.com/jeremylongshore/claude-code-plugins-plus-skills/blob/main/plugins/saas-packs/langchain-pack/skills/langchain-rate-limits/SKILL.md -a claude-code --skill langchain-rate-limits
Installation path: .claude/skills/langchain-rate-limits/
# LangChain Rate Limits
## Overview
Implement robust rate limiting and retry strategies for LangChain applications to handle API quotas gracefully.
## Prerequisites
- LangChain installed with an LLM provider integration (for example, `langchain-openai`)
- Understanding of provider rate limits
- tenacity package for advanced retry logic
## Instructions
### Step 1: Understand Provider Limits
```python
# Example rate limits by provider. Actual limits depend on your account
# tier and change over time, so check each provider's dashboard.
# rpm = requests per minute, tpm = tokens per minute
RATE_LIMITS = {
    "openai": {
        "gpt-4o": {"rpm": 10000, "tpm": 800000},
        "gpt-4o-mini": {"rpm": 10000, "tpm": 4000000},
    },
    "anthropic": {
        "claude-3-5-sonnet": {"rpm": 4000, "tpm": 400000},
    },
    "google": {
        "gemini-1.5-pro": {"rpm": 360, "tpm": 4000000},
    },
}
```
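As a rough illustration of how these numbers translate into pacing, the sketch below works out which limit binds first for a given request size. The helper names `min_interval_seconds` and `effective_rpm` are hypothetical, and the 2,000-token request size is an assumption chosen for the example.

```python
def min_interval_seconds(rpm: int) -> float:
    """Average spacing between requests needed to stay under an RPM limit."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60.0 / rpm


def effective_rpm(rpm: int, tpm: int, tokens_per_request: int) -> int:
    """Requests per minute allowed once both the RPM and TPM limits apply."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(rpm, tpm // tokens_per_request)


# Example values taken from the "gpt-4o" entry of RATE_LIMITS above.
# The 2,000 tokens per request is an assumed workload, not a measured one.
limits = {"rpm": 10000, "tpm": 800000}
budget = effective_rpm(limits["rpm"], limits["tpm"], tokens_per_request=2000)
spacing = min_interval_seconds(budget)
print(f"effective rpm: {budget}, spacing: {spacing:.3f}s")
```

With these assumed numbers the token limit, not the request limit, is the tighter constraint, which is why both values are worth tracking.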
### Step 2: Built-in Retry Configuration
```python
from langchain_openai import ChatOpenAI

# LangChain has built-in retry with exponential backoff
llm = ChatOpenAI(
    model="gpt-4o-mini",
    max_retries=3,        # Number of retries after the first failed attempt
    request_timeout=30,   # Timeout per request, in seconds
)
```
```
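To make the effect of `max_retries` concrete, here is a minimal standard-library sketch of a retry loop with exponential backoff. It is not LangChain's or the OpenAI client's implementation; `retry_call` and `TransientError` are hypothetical names, and the delays are assumptions chosen to keep the example deterministic (real clients typically add jitter).

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""


def retry_call(
    fn: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying up to max_retries times with doubling delays."""
    retries_used = 0
    while True:
        try:
            return fn()
        except TransientError:
            if retries_used >= max_retries:
                raise  # Retry budget exhausted: surface the error
            sleep(min(base_delay * (2 ** retries_used), max_delay))
            retries_used += 1


# Demo: a function that fails twice, then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("simulated 429")
    return "ok"


delays: list[float] = []
result = retry_call(flaky, max_retries=3, sleep=delays.append)  # record delays instead of sleeping
print(result, calls["n"], delays)
```

Note that in this sketch `max_retries=3` allows up to four calls in total: the initial attempt plus three retries.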
### Step 3: Advanced Retry with Tenacity
```python
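from tenacity import (
    retry,
    stop_after_attempt,
    wait_exponential,
    retry_if_exception_type,
)
from openai import RateLimitError, APIError


# Note: RateLimitError inherits from APIError, so listing APIError also
# retries non-transient failures such as authentication errors.
@retry(
    stop=stop_after_attempt(5),  # At most 5 attempts in total
    wait=wait_exponential(multiplier=1, min=4, max=60),  # 4 to 60 s between attempts
    retry=retry_if_exception_type((RateLimitError, APIError)),
)
def call_with_retry(chain, input_data):
    """Call chain with exponential backoff."""
    return chain.invoke(input_data)


# Usage (assumes `chain` is a Runnable defined elsewhere)
result = call_with_retry(chain, {"input": "Hello"})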
```
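To see roughly how long the settings above can block, the sketch below computes a clamped exponential wait schedule in plain Python. The helper `backoff_schedule` is hypothetical, and it assumes the exponent starts at zero for the first retry; tenacity's exact formula may differ by one step between versions, so treat the numbers as an approximation.

```python
def backoff_schedule(
    max_attempts: int,
    multiplier: float = 1.0,
    min_wait: float = 4.0,
    max_wait: float = 60.0,
) -> list[float]:
    """Waits (in seconds) between attempts for a clamped exponential backoff."""
    waits = []
    # N attempts leave room for N - 1 waits between them.
    for retry_index in range(max_attempts - 1):
        raw = multiplier * (2 ** retry_index)
        waits.append(max(min_wait, min(raw, max_wait)))
    return waits


schedule = backoff_schedule(max_attempts=5)
print(schedule, sum(schedule))
```

Under these assumptions, five attempts spend about 20 seconds waiting in the worst case, on top of the time taken by the requests themselves.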
### Step 4: Rate Limiter Wrapper
```python
import asyncio
import time
from collections import deque
from threading import Lock


class RateLimiter:
    """Token bucket rate limiter for API calls."""

    def __init__(self, requests_per_minute: int = 60):
        self.rpm = requests_per_minute
        self.interval = 60.0 / requests_per_minute
        self.timestamps = deque()
        self.lock = Lock