jeremylongshore/claude-code-plugins-plus-skills
plugins/saas-packs/langchain-pack/skills/langchain-cost-tuning/SKILL.md
January 22, 2026
npx add-skill https://github.com/jeremylongshore/claude-code-plugins-plus-skills/blob/main/plugins/saas-packs/langchain-pack/skills/langchain-cost-tuning/SKILL.md -a claude-code --skill langchain-cost-tuning

Installation paths:
.claude/skills/langchain-cost-tuning/

# LangChain Cost Tuning
## Overview
Strategies for reducing LLM API costs while maintaining quality in LangChain applications.
## Prerequisites
- LangChain application in production
- Access to API usage dashboard
- Understanding of token pricing
## Instructions
### Step 1: Understand Token Pricing
```python
# Current approximate pricing (check provider for current rates)
PRICING = {
    "openai": {
        "gpt-4o": {"input": 0.005, "output": 0.015},  # per 1K tokens
        "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
        "gpt-3.5-turbo": {"input": 0.0005, "output": 0.0015},
    },
    "anthropic": {
        "claude-3-5-sonnet": {"input": 0.003, "output": 0.015},
        "claude-3-haiku": {"input": 0.00025, "output": 0.00125},
    },
    "google": {
        "gemini-1.5-pro": {"input": 0.00125, "output": 0.005},
        "gemini-1.5-flash": {"input": 0.000075, "output": 0.0003},
    },
}

def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    model: str = "gpt-4o-mini",
) -> float:
    """Estimate API cost for a request."""
    # Accept "provider/model" strings; bare names default to OpenAI.
    provider, model_name = model.split("/") if "/" in model else ("openai", model)
    rates = PRICING.get(provider, {}).get(model_name, {"input": 0.001, "output": 0.002})
    return (input_tokens / 1000 * rates["input"]) + (output_tokens / 1000 * rates["output"])
```
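As a quick sanity check on the table, the same 1,000-input / 500-output request differs by roughly 28x between gpt-4o and gpt-4o-mini. The sketch below inlines just those two rate entries from `PRICING` so it runs standalone; the `estimate` helper is an illustrative stand-in for `estimate_cost` above, not part of the skill itself:

```python
# Rates copied from the PRICING table above (dollars per 1K tokens).
GPT_4O = {"input": 0.005, "output": 0.015}
GPT_4O_MINI = {"input": 0.00015, "output": 0.0006}

def estimate(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Same arithmetic as estimate_cost: dollars = tokens / 1000 * rate."""
    return input_tokens / 1000 * rates["input"] + output_tokens / 1000 * rates["output"]

# A typical RAG-style request: 1,000 prompt tokens, 500 completion tokens.
big = estimate(1000, 500, GPT_4O)         # 0.005 + 0.0075  = 0.0125
small = estimate(1000, 500, GPT_4O_MINI)  # 0.00015 + 0.0003 = 0.00045
print(f"gpt-4o: ${big:.5f}, gpt-4o-mini: ${small:.5f}, ratio {big / small:.1f}x")
```

Gaps of this size are why model selection is usually the first cost lever worth examining.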
### Step 2: Implement Token Counting
```python
import tiktoken
from langchain_core.callbacks import BaseCallbackHandler

class CostTrackingCallback(BaseCallbackHandler):
    """Track token usage and costs."""

    def __init__(self, model: str = "gpt-4o-mini"):
        self.model = model
        self.total_input_tokens = 0
        self.total_output_tokens = 0
        self.requests = 0

    def on_llm_end(self, response, **kwargs) -> None:
        """Track tokens from LLM response."""
        if response.llm_output and "token_usage" in response.llm_output:
            usage = response.llm_output["token_usage"]
            self.total_input_tokens += usage.get("prompt_tokens", 0)
            self.total_output_tokens += usage.get("completion_tokens", 0)
            self.requests += 1
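
# Illustrative sanity check of the accounting above (assumed numbers, not from
# the original skill): three gpt-4o-mini calls at 1,000 prompt and 200
# completion tokens each should cost about $0.00081 at the Step 1 rates.
input_tokens = 3 * 1000
output_tokens = 3 * 200
cost = input_tokens / 1000 * 0.00015 + output_tokens / 1000 * 0.0006
print(f"~${cost:.5f} for 3 requests")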