langchain-rate-limits

Marketplace

claude-code-plugins-plus

jeremylongshore/claude-code-plugins-plus-skills

Plugin

langchain-pack

ai-ml

Repository

jeremylongshore/claude-code-plugins-plus-skills
1.1k stars

plugins/saas-packs/langchain-pack/skills/langchain-rate-limits/SKILL.md

Last Verified

January 22, 2026

Install Skill

npx add-skill https://github.com/jeremylongshore/claude-code-plugins-plus-skills/blob/main/plugins/saas-packs/langchain-pack/skills/langchain-rate-limits/SKILL.md -a claude-code --skill langchain-rate-limits

Installation paths:

Claude
.claude/skills/langchain-rate-limits/

Instructions

# LangChain Rate Limits

## Overview
Implement robust rate limiting and retry strategies for LangChain applications to handle API quotas gracefully.

## Prerequisites
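- LangChain installed together with an LLM provider integration (for example, `langchain-openai`)
- Familiarity with your provider's rate limits (requests and tokens per minute)
- The `tenacity` package, for advanced retry logic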

## Instructions

### Step 1: Understand Provider Limits
```python
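# Illustrative rate limits by provider. Actual limits depend on account tier
# and change over time, so confirm them in each provider's console.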
RATE_LIMITS = {
    "openai": {
        "gpt-4o": {"rpm": 10000, "tpm": 800000},
        "gpt-4o-mini": {"rpm": 10000, "tpm": 4000000},
    },
    "anthropic": {
        "claude-3-5-sonnet": {"rpm": 4000, "tpm": 400000},
    },
    "google": {
        "gemini-1.5-pro": {"rpm": 360, "tpm": 4000000},
    }
}
# rpm = requests per minute, tpm = tokens per minute
```
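
A table like this is only useful once it is turned into a pacing budget. The sketch below shows one way to do that, assuming a hypothetical average request size: the token cap is converted into a request cap, and the tighter of the two bounds decides the minimum gap between calls. The helper names `effective_rpm` and `min_interval_seconds` are illustrative and not part of LangChain.

```python
# Illustrative limits for one model; real values depend on the account tier.
limits = {"rpm": 10000, "tpm": 800000}


def effective_rpm(limits: dict, avg_tokens_per_request: int) -> int:
    """Requests per minute allowed once both the request and token caps apply."""
    if avg_tokens_per_request <= 0:
        raise ValueError("avg_tokens_per_request must be positive")
    # How many requests fit into the per-minute token budget.
    token_bound = limits["tpm"] // avg_tokens_per_request
    return min(limits["rpm"], token_bound)


def min_interval_seconds(rpm: int) -> float:
    """Smallest gap between requests that stays within `rpm`."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60.0 / rpm


# 1000 tokens per request is an assumed average, not a measured value.
budget = effective_rpm(limits, avg_tokens_per_request=1000)
print(budget)                        # 800
print(min_interval_seconds(budget))  # 0.075
```

With large prompts the token cap usually binds long before the request cap, which is why both numbers are worth tracking.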

### Step 2: Built-in Retry Configuration
```python
from langchain_openai import ChatOpenAI

# LangChain has built-in retry with exponential backoff
llm = ChatOpenAI(
    model="gpt-4o-mini",
    max_retries=3,  # Retries attempted after the first failed request
    request_timeout=30,  # Timeout per request, in seconds
)
```
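
To make concrete what exponential backoff means for a setting such as `max_retries=3`, the sketch below computes a delay schedule that doubles on every retry up to a cap. The base delay and cap are assumed values chosen for illustration, not the ones any particular client uses, and real clients typically add random jitter on top. `backoff_delays` is a hypothetical helper.

```python
def backoff_delays(max_retries: int, base: float = 0.5, cap: float = 8.0) -> list[float]:
    """Delay in seconds before each retry: base, 2*base, 4*base, ... capped at `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


print(backoff_delays(3))  # [0.5, 1.0, 2.0]
```

Under these assumptions, three retries add at most 3.5 seconds of waiting on top of the per-request timeouts.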

### Step 3: Advanced Retry with Tenacity
```python
from tenacity import (
    retry,
    stop_after_attempt,
    wait_exponential,
    retry_if_exception_type
)
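from openai import APIConnectionError, InternalServerError, RateLimitError

# Retry only transient failures. APIError is the base class for every OpenAI
# API error, so retrying on it would also retry non-recoverable errors such
# as authentication failures or invalid requests.
@retry(
    stop=stop_after_attempt(5),
    wait=wait_exponential(multiplier=1, min=4, max=60),
    retry=retry_if_exception_type(
        (RateLimitError, APIConnectionError, InternalServerError)
    ),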
)
def call_with_retry(chain, input_data):
    """Call chain with exponential backoff."""
    return chain.invoke(input_data)

# Usage (assumes `chain` is an existing LangChain runnable, e.g. prompt | llm)
result = call_with_retry(chain, {"input": "Hello"})
```
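
The retry behaviour can be checked without calling a real API. The dependency-free sketch below mimics the same idea (stop after a fixed number of attempts, wait exponentially longer between them) and is not how tenacity is implemented. `TransientError`, `retry_call`, and `flaky` are hypothetical names, and the sleep function is injectable so the waits can be recorded instead of actually slept.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""


def retry_call(
    func: Callable[[], T],
    max_attempts: int = 5,
    base_wait: float = 1.0,
    max_wait: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying on TransientError with exponentially growing waits."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            sleep(min(max_wait, base_wait * 2 ** (attempt - 1)))
    raise AssertionError("unreachable")  # The loop always returns or raises.


calls = {"count": 0}
waits: list[float] = []


def flaky() -> str:
    """Fail twice, then succeed, to simulate a temporary rate limit."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("rate limited")
    return "ok"


# Record the waits instead of sleeping so the example runs instantly.
result = retry_call(flaky, sleep=waits.append)
print(result, waits)  # ok [1.0, 2.0]
```

Injecting `sleep` keeps the backoff logic testable; in production code the default `time.sleep` would apply.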

### Step 4: Rate Limiter Wrapper
```python
import asyncio
import time
from collections import deque
from threading import Lock

class RateLimiter:
    """Token bucket rate limiter for API calls."""

    def __init__(self, requests_per_minute: int = 60):
        self.rpm = requests_per_minute
        self.interval = 60.0 / requests_per_minute
        self.timestamps = deque()
        self.lock = Lock()  # Instantiate the lock; the bare class is not a usable mutex
```
