Optimize vector index performance for latency, recall, and memory. Use when tuning HNSW parameters, selecting quantization strategies, or scaling vector search infrastructure.
February 1, 2026
Install with `npx add-skill https://github.com/wshobson/agents/blob/main/plugins/llm-application-dev/skills/vector-index-tuning/SKILL.md -a claude-code --skill vector-index-tuning`. Installation path: `.claude/skills/vector-index-tuning/`

# Vector Index Tuning
Guide to optimizing vector indexes for production performance.
## When to Use This Skill
- Tuning HNSW parameters
- Implementing quantization
- Optimizing memory usage
- Reducing search latency
- Balancing recall vs speed
- Scaling to billions of vectors
## Core Concepts
### 1. Index Type Selection
```
Data Size              Recommended Index
────────────────────────────────────────
< 10K vectors       →  Flat (exact search)
10K - 1M            →  HNSW
1M - 100M           →  HNSW + Quantization
> 100M              →  IVF + PQ or DiskANN
```
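The size thresholds above can be encoded as a small lookup helper. This is a minimal sketch: `recommend_index` is a hypothetical name, and how the boundary values (exactly 1M or 100M vectors) are assigned is an assumption, since the table's ranges overlap at those points.

```python
def recommend_index(num_vectors: int) -> str:
    """Map a collection size to the index family suggested in the table."""
    if num_vectors < 0:
        raise ValueError("num_vectors must be non-negative")
    if num_vectors < 10_000:
        return "Flat"
    # Assumption: boundary values fall into the smaller-scale bucket.
    if num_vectors <= 1_000_000:
        return "HNSW"
    if num_vectors <= 100_000_000:
        return "HNSW + Quantization"
    return "IVF + PQ or DiskANN"


if __name__ == "__main__":
    for size in (5_000, 250_000, 20_000_000, 2_000_000_000):
        print(f"{size:>13,} vectors -> {recommend_index(size)}")
```

Treat the result as a starting point only; dimensionality, memory budget, and recall targets can move a workload into a neighboring bucket.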
### 2. HNSW Parameters
| Parameter          | Starting value | Effect                                                       |
| ------------------ | -------------- | ------------------------------------------------------------ |
| **M**              | 16             | Connections per node; ↑ = better recall, more memory         |
| **efConstruction** | 100            | Build-time candidate list size; ↑ = better graph, slower build |
| **efSearch**       | 50             | Query-time candidate list size; ↑ = better recall, slower search |
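Because every parameter in this table trades recall against cost, tuning needs a recall measurement to compare against. The sketch below shows one common way to compute recall@k. The name `recall_at_k` is hypothetical, and it assumes each row of `retrieved` and `ground_truth` lists neighbor ids for the same query, best match first.

```python
from typing import Sequence


def recall_at_k(
    retrieved: Sequence[Sequence[int]],
    ground_truth: Sequence[Sequence[int]],
    k: int,
) -> float:
    """Mean fraction of the true top-k neighbors present in the retrieved top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    if len(retrieved) != len(ground_truth):
        raise ValueError("retrieved and ground_truth need one row per query")
    hits = 0
    for found, truth in zip(retrieved, ground_truth):
        # Compare as sets: order within the top-k does not affect recall.
        hits += len(set(found[:k]) & set(truth[:k]))
    return hits / (len(ground_truth) * k)


if __name__ == "__main__":
    approx = [[1, 2, 3], [4, 5, 6]]   # ids returned by the approximate index
    exact = [[1, 2, 9], [4, 5, 6]]    # ids from exact (flat) search
    print(f"recall@3 = {recall_at_k(approx, exact, k=3):.3f}")
```

In practice the ground truth comes from an exact search over the same vectors, and recall is plotted against query latency for each `efSearch` value.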
### 3. Quantization Types
```
Full Precision (FP32):   4 bytes × dimensions
Half Precision (FP16):   2 bytes × dimensions
INT8 Scalar:             1 byte × dimensions
Product Quantization:    ~32-64 bytes per vector
Binary:                  dimensions / 8 bytes
```
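To make these per-vector sizes concrete, the following sketch applies the formulas to a given dimensionality and collection size. The function names and the 64-byte default for the product quantization code are assumptions chosen for illustration. The figures cover raw vector storage only, not graph links, codebooks, or other index overhead.

```python
import math


def bytes_per_vector(dim: int, pq_bytes: int = 64) -> dict:
    """Storage per vector for each quantization scheme listed above."""
    return {
        "fp32": 4 * dim,
        "fp16": 2 * dim,
        "int8": dim,
        "pq": pq_bytes,                  # fixed-size code, independent of dim
        "binary": math.ceil(dim / 8),    # one bit per dimension
    }


def raw_storage_gib(num_vectors: int, dim: int, scheme: str) -> float:
    """Raw vector storage in GiB, excluding any index overhead."""
    return num_vectors * bytes_per_vector(dim)[scheme] / 2**30


if __name__ == "__main__":
    for scheme in ("fp32", "fp16", "int8", "pq", "binary"):
        size = raw_storage_gib(10_000_000, 768, scheme)
        print(f"{scheme:>6}: {size:8.2f} GiB for 10M x 768-d vectors")
```

For 768-dimensional vectors, moving from FP32 to INT8 cuts raw storage by 4x and binary codes cut it by 32x, which is why quantization becomes attractive once a collection outgrows RAM.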
## Templates
### Template 1: HNSW Parameter Tuning
```python
import numpy as np
from typing import List, Tuple
import time


def benchmark_hnsw_parameters(
    vectors: np.ndarray,
    queries: np.ndarray,
    ground_truth: np.ndarray,
    m_values: List[int] = [8, 16, 32, 64],
    ef_construction_values: List[int] = [64, 128, 256],
    ef_search_values: List[int] = [32, 64, 128, 256]
) -> List[dict]:
    """Benchmark different HNSW configurations."""
    import hnswlib

    results = []
    dim = vectors.shape[1]
    n = vectors.shape[0]

    for m in m_values:
        for ef_construction in ef_construction_values:
            # Build index
            index = hnswlib.Index(space='cosine', dim=dim)
i