Implement efficient similarity search with vector databases. Use when building semantic search, implementing nearest neighbor queries, or optimizing retrieval performance.
View on GitHubwshobson/agents
llm-application-dev
January 19, 2026
Select agents to install to:
npx add-skill https://github.com/wshobson/agents/blob/main/plugins/llm-application-dev/skills/similarity-search-patterns/SKILL.md -a claude-code --skill similarity-search-patternsInstallation paths:
.claude/skills/similarity-search-patterns/# Similarity Search Patterns
Patterns for implementing efficient similarity search in production systems.
## When to Use This Skill
- Building semantic search systems
- Implementing RAG retrieval
- Creating recommendation engines
- Optimizing search latency
- Scaling to millions of vectors
- Combining semantic and keyword search
## Core Concepts
### 1. Distance Metrics
| Metric | Formula | Best For |
| ------------------ | ------------------ | --------------------- | --- | -------------- |
| **Cosine** | 1 - (A·B)/(‖A‖‖B‖) | Normalized embeddings |
| **Euclidean (L2)** | √Σ(a-b)² | Raw embeddings |
| **Dot Product** | A·B | Magnitude matters |
| **Manhattan (L1)** | Σ | a-b | | Sparse vectors |
### 2. Index Types
```
┌─────────────────────────────────────────────────┐
│ Index Types │
├─────────────┬───────────────┬───────────────────┤
│ Flat │ HNSW │ IVF+PQ │
│ (Exact) │ (Graph-based) │ (Quantized) │
├─────────────┼───────────────┼───────────────────┤
│ O(n) search │ O(log n) │ O(√n) │
│ 100% recall │ ~95-99% │ ~90-95% │
│ Small data │ Medium-Large │ Very Large │
└─────────────┴───────────────┴───────────────────┘
```
## Templates
### Template 1: Pinecone Implementation
```python
from pinecone import Pinecone, ServerlessSpec
from typing import List, Dict, Optional
import hashlib
class PineconeVectorStore:
def __init__(
self,
api_key: str,
index_name: str,
dimension: int = 1536,
metric: str = "cosine"
):
self.pc = Pinecone(api_key=api_key)
# Create index if not exists
if index_name not in self.pc.list_indexes().names():
self.pc.create_index(
name=index_name,
dimension=dimension,
metric=metri