Provider-native prompt caching for Claude and OpenAI. Use when optimizing LLM costs with cache breakpoints, caching system prompts, or reducing token costs for repeated prefixes.
View on GitHubyonatangross/skillforge-claude-plugin
ork
January 25, 2026
Select agents to install to:
npx add-skill https://github.com/yonatangross/skillforge-claude-plugin/blob/main/plugins/ork/skills/prompt-caching/SKILL.md -a claude-code --skill prompt-cachingInstallation paths:
.claude/skills/prompt-caching/# Prompt Caching
Cache LLM prompt prefixes for 90% token savings.
## Supported Models (2026)
| Provider | Models |
|----------|--------|
| Claude | Opus 4.1, Opus 4, Sonnet 4.5, Sonnet 4, Sonnet 3.7, Haiku 4.5, Haiku 3.5, Haiku 3 |
| OpenAI | gpt-4o, gpt-4o-mini, o1, o1-mini (automatic caching) |
## Claude Prompt Caching
```python
def build_cached_messages(
system_prompt: str,
few_shot_examples: str | None,
user_content: str,
use_extended_cache: bool = False
) -> list[dict]:
"""Build messages with cache breakpoints.
Cache structure (processing order: tools → system → messages):
1. System prompt (cached)
2. Few-shot examples (cached)
─────── CACHE BREAKPOINT ───────
3. User content (NOT cached)
"""
# TTL: "5m" (default, 1.25x write cost) or "1h" (extended, 2x write cost)
ttl = "1h" if use_extended_cache else "5m"
content_parts = []
# Breakpoint 1: System prompt
content_parts.append({
"type": "text",
"text": system_prompt,
"cache_control": {"type": "ephemeral", "ttl": ttl}
})
# Breakpoint 2: Few-shot examples (up to 4 breakpoints allowed)
if few_shot_examples:
content_parts.append({
"type": "text",
"text": few_shot_examples,
"cache_control": {"type": "ephemeral", "ttl": ttl}
})
# Dynamic content (NOT cached)
content_parts.append({
"type": "text",
"text": user_content
})
return [{"role": "user", "content": content_parts}]
```
## Cache Pricing (2026)
```
┌─────────────────────────────────────────────────────────────┐
│ Cache Cost Multipliers (relative to base input price) │
├─────────────────────────────────────────────────────────────┤
│ 5-minute cache write: 1.25x base input price │
│ 1-hour cache write: 2.00x base input price │
│ Cache read: 0.10x base input price (90% off!) │
└─────────────────────────────────────────────