hyde-retrieval (verified)

HyDE (Hypothetical Document Embeddings) for improved semantic retrieval. Use when queries don't match document vocabulary, retrieval quality is poor, or implementing advanced RAG patterns.

Marketplace: orchestkit
Plugin: ork-rag
Tags: ai
Repository: yonatangross/skillforge-claude-plugin (33 stars)
Path: plugins/ork-rag/skills/hyde-retrieval/SKILL.md
Last Verified: January 25, 2026

Install (via the add-skill CLI):
npx add-skill https://github.com/yonatangross/skillforge-claude-plugin/blob/main/plugins/ork-rag/skills/hyde-retrieval/SKILL.md -a claude-code --skill hyde-retrieval

Installation path (Claude): .claude/skills/hyde-retrieval/

Instructions

# HyDE (Hypothetical Document Embeddings)

Generate hypothetical answer documents to bridge vocabulary gaps in semantic search.

## The Problem

Direct query embedding often fails due to vocabulary mismatch:
```
Query: "scaling async data pipelines"
Docs use: "event-driven messaging", "Apache Kafka", "message brokers"
→ Low similarity scores despite high relevance
```
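
This gap is measurable: cosine similarity between the query embedding and a relevant document's embedding stays low when the vocabularies diverge. A minimal diagnostic sketch (the embedding vectors are assumed to come from whatever embedding model you already use):

```python
import numpy as np

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    a_arr, b_arr = np.asarray(a), np.asarray(b)
    return float(a_arr @ b_arr / (np.linalg.norm(a_arr) * np.linalg.norm(b_arr)))

# Diagnose the mismatch: a low score here for a clearly relevant
# document is the signal that HyDE can help.
# score = cosine_similarity(query_embedding, doc_embedding)
```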

## The Solution

Instead of embedding the query, generate a hypothetical answer document:
```
Query: "scaling async data pipelines"
→ LLM generates: "To scale asynchronous data pipelines, use event-driven
   messaging with Apache Kafka. Message brokers provide backpressure..."
→ Embed the hypothetical document
→ Now matches docs using similar terminology
```

## Implementation

```python
from collections.abc import Awaitable, Callable

from openai import AsyncOpenAI
from pydantic import BaseModel

class HyDEResult(BaseModel):
    """Result of HyDE generation."""
    original_query: str
    hypothetical_doc: str
    embedding: list[float]

async def generate_hyde(
    query: str,
    llm: AsyncOpenAI,
    embed_fn: Callable[[str], Awaitable[list[float]]],
    max_tokens: int = 150,
) -> HyDEResult:
    """Generate hypothetical document and embed it."""

    # Generate hypothetical answer
    response = await llm.chat.completions.create(
        model="gpt-4o-mini",  # Fast, cheap model
        messages=[
            {"role": "system", "content":
                "Write a short paragraph that would answer this query. "
                "Use technical terminology that documentation would use."},
            {"role": "user", "content": query}
        ],
        max_tokens=max_tokens,
        temperature=0.3,  # Low temp for consistency
    )

    hypothetical_doc = response.choices[0].message.content or query  # guard: fall back to the raw query if the model returns no text

    # Embed the hypothetical document (not the query!)
    embedding = await embed_fn(hypothetical_doc)

    return HyDEResult(
        original_query=query,
        hypothetical_doc=hypothetical_doc,
        embedding=embedding,
    )
```
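
To use the result, search your vector store with the HyDE embedding in place of the raw query embedding. A sketch assuming a generic `vector_store.search(embedding, top_k=...)` coroutine; the store interface is an assumption, not part of this skill:

```python
# Uses generate_hyde and the imports from the block above.
async def hyde_search(
    query: str,
    llm: AsyncOpenAI,
    embed_fn: Callable[[str], Awaitable[list[float]]],
    vector_store,  # assumed interface: await .search(embedding, top_k=...)
    top_k: int = 5,
):
    """Retrieve documents by searching with the HyDE embedding."""
    result = await generate_hyde(query, llm, embed_fn)
    return await vector_store.search(result.embedding, top_k=top_k)
```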

## With Caching

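HyDE adds an LLM round-trip per query, so repeated queries are worth memoizing. `functools.lru_cache` does not support coroutines, so this sketch caches results in a plain dict keyed by the query string; the `cache_hyde` decorator is illustrative (bound or evict the cache in production):

```python
import functools

def cache_hyde(fn):
    """Memoize async HyDE generation by query string (in-memory sketch)."""
    cache: dict[str, HyDEResult] = {}

    @functools.wraps(fn)
    async def wrapper(query: str, *args, **kwargs) -> HyDEResult:
        if query not in cache:
            cache[query] = await fn(query, *args, **kwargs)
        return cache[query]

    return wrapper

# Wrap the generator once; repeated queries then skip the LLM call.
cached_generate_hyde = cache_hyde(generate_hyde)
```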
