Retrieval-Augmented Generation patterns for grounded LLM responses. Use when building RAG pipelines, constructing context from retrieved documents, adding citations, or implementing hybrid search.
View on GitHubyonatangross/skillforge-claude-plugin
ork-rag
January 25, 2026
Select agents to install to:
npx add-skill https://github.com/yonatangross/skillforge-claude-plugin/blob/main/plugins/ork-rag/skills/rag-retrieval/SKILL.md -a claude-code --skill rag-retrievalInstallation paths:
.claude/skills/rag-retrieval/# RAG Retrieval
Combine vector search with LLM generation for accurate, grounded responses.
## Basic RAG Pattern
```python
async def rag_query(question: str, top_k: int = 5) -> str:
"""Basic RAG: retrieve then generate."""
# 1. Retrieve relevant documents
docs = await vector_db.search(question, limit=top_k)
# 2. Construct context
context = "\n\n".join([
f"[{i+1}] {doc.text}"
for i, doc in enumerate(docs)
])
# 3. Generate with context
response = await llm.chat([
{"role": "system", "content":
"Answer using ONLY the provided context. "
"If not in context, say 'I don't have that information.'"},
{"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}
])
return response.content
```
## RAG with Citations
```python
async def rag_with_citations(question: str) -> dict:
"""RAG with inline citations [1], [2], etc."""
docs = await vector_db.search(question, limit=5)
context = "\n\n".join([
f"[{i+1}] {doc.text}\nSource: {doc.metadata['source']}"
for i, doc in enumerate(docs)
])
response = await llm.chat([
{"role": "system", "content":
"Answer with inline citations like [1], [2]. "
"End with a Sources section."},
{"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}
])
return {
"answer": response.content,
"sources": [doc.metadata['source'] for doc in docs]
}
```
## Hybrid Search (Semantic + Keyword)
```python
def reciprocal_rank_fusion(
semantic_results: list,
keyword_results: list,
k: int = 60
) -> list:
"""Combine semantic and keyword search with RRF."""
scores = {}
for rank, doc in enumerate(semantic_results):
scores[doc.id] = scores.get(doc.id, 0) + 1 / (k + rank + 1)
for rank, doc in enumerate(keyword_results):
scores[doc.id] = scores.get(doc.id, 0) + 1 / (k + rank + 1)
# So