
# LangChain Cost Tuning

## Overview
Strategies for reducing LLM API costs while maintaining quality in LangChain applications.

## Prerequisites
- LangChain application in production
- Access to API usage dashboard
- Understanding of token pricing

## Instructions

### Step 1: Understand Token Pricing
```python
# Approximate pricing in USD per 1K tokens (check provider pages for current rates)
PRICING = {
    "openai": {
        "gpt-4o": {"input": 0.005, "output": 0.015},      # per 1K tokens
        "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
        "gpt-3.5-turbo": {"input": 0.0005, "output": 0.0015},
    },
    "anthropic": {
        "claude-3-5-sonnet": {"input": 0.003, "output": 0.015},
        "claude-3-haiku": {"input": 0.00025, "output": 0.00125},
    },
    "google": {
        "gemini-1.5-pro": {"input": 0.00125, "output": 0.005},
        "gemini-1.5-flash": {"input": 0.000075, "output": 0.0003},
    }
}

def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    model: str = "gpt-4o-mini"
) -> float:
    """Estimate API cost for a request."""
    # Accept either "provider/model" or a bare OpenAI model name
    provider, model_name = model.split("/") if "/" in model else ("openai", model)
    # Fall back to a generic mid-tier rate for unknown models
    rates = PRICING.get(provider, {}).get(model_name, {"input": 0.001, "output": 0.002})
    return (input_tokens / 1000 * rates["input"]) + (output_tokens / 1000 * rates["output"])
```
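
To make the spread concrete, here is a quick comparison using the helper above; the token counts are illustrative:

```python
# Same hypothetical request (1,200 input / 300 output tokens) on two tiers
cheap = estimate_cost(1200, 300, "gpt-4o-mini")   # ~$0.00036
premium = estimate_cost(1200, 300, "gpt-4o")      # ~$0.0105
print(f"gpt-4o costs about {premium / cheap:.0f}x more per request")
```

A ~30x gap per request compounds quickly at volume, which is why per-request tracking (Step 2) pays off.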

### Step 2: Implement Token Counting
```python
import tiktoken
from langchain_core.callbacks import BaseCallbackHandler

class CostTrackingCallback(BaseCallbackHandler):
    """Track token usage and costs."""

    def __init__(self, model: str = "gpt-4o-mini"):
        self.model = model
        self.total_input_tokens = 0
        self.total_output_tokens = 0
        self.requests = 0

    def on_llm_end(self, response, **kwargs) -> None:
        """Accumulate token usage reported in the LLM response."""
        if response.llm_output and "token_usage" in response.llm_output:
            usage = response.llm_output["token_usage"]
            self.total_input_tokens += usage.get("prompt_tokens", 0)
            self.total_output_tokens += usage.get("completion_tokens", 0)
            self.requests += 1
```
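
Below is a minimal sketch of wiring the callback into a chat model, assuming `langchain-openai` is installed and `OPENAI_API_KEY` is set (the model name and prompt are illustrative). The `tiktoken` import above also allows a local pre-flight count before any paid call; note that `encoding_for_model` needs a recent tiktoken release to recognize `gpt-4o`-family names:

```python
import tiktoken
from langchain_openai import ChatOpenAI

# Pre-flight: count prompt tokens locally, before paying for the call
enc = tiktoken.encoding_for_model("gpt-4o-mini")
prompt = "Summarize the plot of Hamlet in two sentences."
print(f"Prompt tokens: {len(enc.encode(prompt))}")

# Attach the tracker so every LLM call updates the running totals
tracker = CostTrackingCallback(model="gpt-4o-mini")
llm = ChatOpenAI(model="gpt-4o-mini", callbacks=[tracker])
llm.invoke(prompt)

print(f"Requests: {tracker.requests}")
cost = estimate_cost(tracker.total_input_tokens, tracker.total_output_tokens, tracker.model)
print(f"Estimated spend so far: ${cost:.6f}")
```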
