Build Retrieval-Augmented Generation (RAG) systems for LLM applications with vector databases and semantic search. Use when implementing knowledge-grounded AI, building document Q&A systems, or integrating LLMs with external knowledge bases.
View on GitHub: wshobson/agents
llm-application-dev
January 19, 2026
npx add-skill https://github.com/wshobson/agents/blob/main/plugins/llm-application-dev/skills/rag-implementation/SKILL.md -a claude-code --skill rag-implementation
Installation paths:
.claude/skills/rag-implementation/

# RAG Implementation

Master Retrieval-Augmented Generation (RAG) to build LLM applications that provide accurate, grounded responses using external knowledge sources.

## When to Use This Skill

- Building Q&A systems over proprietary documents
- Creating chatbots with current, factual information
- Implementing semantic search with natural language queries
- Reducing hallucinations with grounded responses
- Enabling LLMs to access domain-specific knowledge
- Building documentation assistants
- Creating research tools with source citation

## Core Components

### 1. Vector Databases

**Purpose**: Store and retrieve document embeddings efficiently

**Options:**

- **Pinecone**: Managed, scalable, serverless
- **Weaviate**: Open-source, hybrid search, GraphQL
- **Milvus**: High performance, on-premise
- **Chroma**: Lightweight, easy to use, local development
- **Qdrant**: Fast, filtered search, Rust-based
- **pgvector**: PostgreSQL extension, SQL integration

### 2. Embeddings

**Purpose**: Convert text to numerical vectors for similarity search

**Models (2026):**

| Model | Dimensions | Best For |
|-------|------------|----------|
| **voyage-3-large** | 1024 | Claude apps (Anthropic recommended) |
| **voyage-code-3** | 1024 | Code search |
| **text-embedding-3-large** | 3072 | OpenAI apps, high accuracy |
| **text-embedding-3-small** | 1536 | OpenAI apps, cost-effective |
| **bge-large-en-v1.5** | 1024 | Open source, local deployment |
| **multilingual-e5-large** | 1024 | Multi-language support |

### 3. Retrieval Strategies

**Approaches:**

- **Dense Retrieval**: Semantic similarity via embeddings
- **Sparse Retrieval**: Keyword matching (BM25, TF-IDF)
- **Hybrid Search**: Combine dense + sparse with weighted fusion
- **Multi-Query**: Generate multiple query variations
- **HyDE**: Generate hypothetical documents for better retrieval

### 4. Reranking

**Purpose**: Improve retrieval quality by reordering results

**Methods:**

- **Cross-Encoders**: BERT-based reran