Optimize vector index performance for latency, recall, and memory. Use when tuning HNSW parameters, selecting quantization strategies, or scaling vector search infrastructure.
Guide to optimizing vector indexes for production performance.
Data Size          Recommended Index
────────────────────────────────────────
< 10K vectors   →  Flat (exact search)
10K - 1M        →  HNSW
1M - 100M       →  HNSW + Quantization
> 100M          →  IVF + PQ or DiskANN

| Parameter | Default | Effect |
|---|---|---|
| M | 16 | Connections per node, ↑ = better recall, more memory |
| efConstruction | 100 | Build quality, ↑ = better index, slower build |
| efSearch | 50 | Search quality, ↑ = better recall, slower search |
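
The size thresholds and HNSW defaults above can be captured in a small helper. This is a minimal sketch, not a definitive implementation: `recommend_index`, `HNSWParams`, `for_top_k`, and `link_bytes_per_vector` are hypothetical names introduced for illustration. How the boundary values (exactly 10K, 1M, 100M) are assigned is an assumption, since the ranges above overlap at their edges. The link-memory figure assumes roughly 2 × M neighbor ids of 4 bytes each on the base layer, which is only a rough approximation of a typical HNSW implementation.

```python
from dataclasses import dataclass


def recommend_index(num_vectors: int) -> str:
    """Map a collection size to the index family from the table above.

    Assumption: each range includes its lower bound, and 100M itself
    still counts as "HNSW + Quantization".
    """
    if num_vectors < 0:
        raise ValueError("num_vectors must be non-negative")
    if num_vectors < 10_000:
        return "Flat (exact search)"
    if num_vectors < 1_000_000:
        return "HNSW"
    if num_vectors <= 100_000_000:
        return "HNSW + Quantization"
    return "IVF + PQ or DiskANN"


@dataclass(frozen=True)
class HNSWParams:
    """HNSW parameters, using the defaults listed in the table above."""

    m: int = 16                  # connections per node
    ef_construction: int = 100   # build-time candidate list size
    ef_search: int = 50          # query-time candidate list size

    def for_top_k(self, k: int) -> "HNSWParams":
        """Return a copy whose ef_search is at least k.

        A candidate list smaller than k cannot yield k results, so
        ef_search is raised to k when needed.
        """
        return HNSWParams(self.m, self.ef_construction, max(self.ef_search, k))

    def link_bytes_per_vector(self) -> int:
        """Rough graph overhead per vector (assumed: 2*M ids, 4 bytes each)."""
        return 2 * self.m * 4


if __name__ == "__main__":
    params = HNSWParams().for_top_k(200)
    print(recommend_index(5_000_000), params, params.link_bytes_per_vector())
```

Raising `m` in this sketch grows the per-vector link overhead linearly, which matches the "more memory" trade-off noted in the table.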
Full Precision (FP32): 4 bytes × dimensions
Half Precision (FP16): 2 bytes × dimensions
INT8 Scalar: 1 byte × dimensions
Product Quantization: ~32-64 bytes total
Binary: dimensions/8 bytes

Full template library and detailed worked examples live in references/details.md. Read that file when you need the concrete templates.
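
To make the per-vector storage sizes listed earlier concrete, the sketch below turns them into a raw-vector memory estimate. It is an illustration under stated assumptions: `bytes_per_vector` and `estimate_total_bytes` are hypothetical helper names, the product quantization code size is a parameter defaulting to 64 bytes (the upper end of the 32-64 byte range), and index overhead such as graph links or codebooks is deliberately ignored.

```python
import math

# Bytes per dimension for the fixed-width formats listed above.
BYTES_PER_DIM = {"fp32": 4, "fp16": 2, "int8": 1}


def bytes_per_vector(dim: int, fmt: str, pq_code_bytes: int = 64) -> int:
    """Storage for one vector of `dim` dimensions in the given format."""
    if dim <= 0:
        raise ValueError("dim must be positive")
    if fmt in BYTES_PER_DIM:
        return BYTES_PER_DIM[fmt] * dim
    if fmt == "pq":
        # PQ stores a fixed-size code regardless of dimensionality.
        return pq_code_bytes
    if fmt == "binary":
        # One bit per dimension, rounded up to whole bytes.
        return math.ceil(dim / 8)
    raise ValueError(f"unknown format: {fmt!r}")


def estimate_total_bytes(num_vectors: int, dim: int, fmt: str,
                         pq_code_bytes: int = 64) -> int:
    """Raw vector storage only; index structures add more on top."""
    return num_vectors * bytes_per_vector(dim, fmt, pq_code_bytes)


if __name__ == "__main__":
    # Example: 1M vectors at 768 dimensions in each format.
    for fmt in ("fp32", "fp16", "int8", "pq", "binary"):
        total = estimate_total_bytes(1_000_000, 768, fmt)
        print(f"{fmt:>6}: {total / 2**30:.2f} GiB")
```

For a 768-dimensional vector this gives 3072 bytes at FP32 versus 96 bytes in binary form, a 32× reduction before any recall trade-off is considered.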