Anthropic's Contextual Retrieval technique for improved RAG. Use when chunks lose context during retrieval, when implementing hybrid BM25+vector search, or when reducing retrieval failures.
# Contextual Retrieval
Prepend situational context to chunks before embedding to preserve document-level meaning.
## The Problem
Traditional chunking loses context:
```
Original document: "ACME Q3 2024 Earnings Report..."
Chunk: "Revenue increased 15% compared to the previous quarter."
Query: "What was ACME's Q3 2024 revenue growth?"
Result: Chunk doesn't mention "ACME" or "Q3 2024" - retrieval fails
```
## The Solution
**Contextual Retrieval** prepends a brief context to each chunk:
```
Contextualized chunk:
"This chunk is from ACME Corp's Q3 2024 earnings report, specifically
the revenue section. Revenue increased 15% compared to the previous quarter."
```
## Implementation
### Context Generation
```python
import anthropic
client = anthropic.Anthropic()
CONTEXT_PROMPT = """
<document>
{document}
</document>
Here is the chunk we want to situate within the document:
<chunk>
{chunk}
</chunk>
Please give a short, succinct context (1-2 sentences) to situate this chunk
within the overall document. Focus on information that would help retrieval.
Answer only with the context, nothing else.
"""
def generate_context(document: str, chunk: str) -> str:
    """Generate context for a single chunk."""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=150,
        messages=[{
            "role": "user",
            "content": CONTEXT_PROMPT.format(document=document, chunk=chunk)
        }]
    )
    return response.content[0].text


def contextualize_chunk(document: str, chunk: str) -> str:
    """Prepend context to chunk."""
    context = generate_context(document, chunk)
    return f"{context}\n\n{chunk}"
```
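A minimal usage sketch, assuming a naive fixed-size splitter. `split_into_chunks` and the input file are hypothetical placeholders; substitute whatever chunking strategy and document source you actually use:

```python
# Hypothetical splitter for illustration only; in practice you would
# split by tokens, sentences, or headings rather than raw characters.
def split_into_chunks(text: str, size: int = 800) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

document = open("acme_q3_2024_earnings.txt").read()  # hypothetical file
contextualized = [
    contextualize_chunk(document, chunk)
    for chunk in split_into_chunks(document)
]
# `contextualized` chunks are now ready for embedding and/or BM25 indexing.
```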
### Batch Processing with Caching
```python
from anthropic import Anthropic
client = Anthropic()
def contextualize_chunks_cached(document: str, chunks: list[str]) -> list[str]:
    """
    Use prompt caching to efficiently process many chunks from the same document.
    The document is cached; only the chunk changes per request.
    """
    contextualized = []
    for chunk in chunks:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=150,
            system=[{
                "type": "text",
                "text": f"<document>\n{document}\n</document>",
                # Mark the document block as cacheable so subsequent calls
                # re-read it from the cache instead of re-processing it.
                "cache_control": {"type": "ephemeral"},
            }],
            messages=[{
                "role": "user",
                "content": (
                    "Here is the chunk we want to situate within the document:\n"
                    f"<chunk>\n{chunk}\n</chunk>\n"
                    "Please give a short, succinct context (1-2 sentences) to situate "
                    "this chunk within the overall document. "
                    "Answer only with the context, nothing else."
                ),
            }],
        )
        contextualized.append(f"{response.content[0].text}\n\n{chunk}")
    return contextualized
```
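Because the cached document prefix is shared across every chunk request, you pay the full input cost once (plus a small cache-write premium), and each subsequent request reads the document tokens at a fraction of the normal price. This is what makes contextualizing every chunk of a long document affordable.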