Optimize vector index performance for latency, recall, and memory. Use when tuning HNSW parameters, selecting quantization strategies, or scaling vector search infrastructure.
Overall score: 80
Quality: 71% (does it follow best practices?)
Impact: 100% — 1.56x average score across 3 eval scenarios
Status: Passed, no known issues
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/llm-application-dev/skills/vector-index-tuning/SKILL.md`

Quality
Discovery — 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly defines a specific technical niche. It uses precise domain terminology that practitioners would naturally use, provides both what the skill does and when to use it, and occupies a distinct space unlikely to overlap with other skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: optimizing vector index performance across three dimensions (latency, recall, memory), tuning HNSW parameters, selecting quantization strategies, and scaling vector search infrastructure. | 3 / 3 |
| Completeness | Clearly answers both 'what' (optimize vector index performance for latency, recall, and memory) and 'when' (explicit 'Use when' clause covering tuning HNSW parameters, selecting quantization strategies, or scaling vector search infrastructure). | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'vector index', 'HNSW parameters', 'quantization', 'vector search', 'latency', 'recall', 'memory'. These are terms practitioners naturally use when dealing with vector search optimization. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche targeting vector index optimization specifically. Terms like 'HNSW parameters', 'quantization strategies', and 'vector search infrastructure' are very specific and unlikely to conflict with other skills. | 3 / 3 |
| Total | | 12 / 12 — Passed |
Implementation — 42%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides highly actionable, executable code templates for vector index tuning, but it suffers from extreme verbosity: most of the implementation details (quantization algorithms, monitoring boilerplate) are things Claude already knows and do not need to be spelled out. The lack of a clear end-to-end workflow and the monolithic structure make the skill hard to navigate and use efficiently.
Suggestions

- Reduce the skill to a concise overview with parameter recommendation tables and decision trees, moving full code templates to separate referenced files (e.g., templates/hnsw_benchmark.py, templates/quantization.py).
- Add an explicit tuning workflow: benchmark current performance → identify bottleneck (latency/recall/memory) → select parameters → validate improvement → deploy, with checkpoints at each stage.
- Remove implementation details Claude already knows (scalar quantization math, binary packing, KMeans-based PQ) and focus on the domain-specific knowledge: which parameters to tune, what tradeoffs to expect, and what thresholds indicate problems.
- Add concrete validation criteria, e.g., 'If recall@10 drops below 0.90 after quantization, increase oversampling before switching quantization strategy.'
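The validation criterion in the last suggestion can be sketched as a simple recall gate. This is an illustrative sketch only: the function names (`recall_at_k`, `quantization_gate`) and the 0.90 threshold are assumptions for the example, not contents of the skill under review.

```python
def recall_at_k(ground_truth: list[list[int]], retrieved: list[list[int]], k: int = 10) -> float:
    """Fraction of true top-k neighbor ids recovered by the (quantized) index."""
    hits = 0
    total = 0
    for truth, found in zip(ground_truth, retrieved):
        hits += len(set(truth[:k]) & set(found[:k]))
        total += k
    return hits / total

def quantization_gate(recall: float, threshold: float = 0.90) -> str:
    """Decide the next tuning step after enabling quantization (threshold is illustrative)."""
    if recall >= threshold:
        return "keep quantization"
    return "increase oversampling before switching quantization strategy"

# Example: exact-search neighbors vs. quantized-index neighbors on two queries
truth = [[1, 2, 3], [4, 5, 6]]
quantized = [[1, 2, 9], [4, 5, 6]]
r = recall_at_k(truth, quantized, k=3)  # 5 of 6 neighbors recovered ≈ 0.83
print(quantization_gate(r))
```

A gate like this would sit at the "validate improvement" checkpoint of the suggested workflow, run against a held-out query set with exact-search results as ground truth.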
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~400+ lines, with massive code templates that could be significantly condensed. Much of this (scalar quantization math, product quantization implementation, binary packing) is standard knowledge Claude already has. The memory estimation formulas and monitoring boilerplate add bulk without proportional value. | 1 / 3 |
| Actionability | The code templates are fully executable with real libraries (hnswlib, qdrant_client, sklearn), include concrete parameter values, and are copy-paste ready. The benchmark functions, Qdrant configuration, and monitoring classes provide specific, actionable guidance. | 3 / 3 |
| Workflow Clarity | While individual templates are clear, there's no overarching workflow connecting them (e.g., 'first benchmark → then select parameters → then deploy → then monitor'). There are no validation checkpoints or feedback loops for the tuning process — no guidance on what to do when recall drops or when to re-index. | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of code. All four large templates are inline rather than being split into separate reference files. The skill would benefit enormously from a concise overview with links to detailed template files (e.g., HNSW_TUNING.md, QUANTIZATION.md, MONITORING.md). | 1 / 3 |
| Total | | 7 / 12 — Passed |
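The memory-estimation formulas the Conciseness row refers to reduce to a short back-of-envelope function. This is a rough sketch under standard HNSW assumptions (4-byte float32 components, roughly 2*M 4-byte links per node at layer 0, upper layers ignored); the function name and constants are illustrative, not taken from the skill.

```python
def hnsw_memory_bytes(n_vectors: int, dim: int, M: int = 16,
                      bytes_per_component: int = 4) -> int:
    """Approximate HNSW footprint: raw vectors plus layer-0 graph links.

    Assumes float32 vectors and ~2*M links of 4 bytes each per node;
    upper layers add only a small constant factor and are omitted here.
    """
    vector_bytes = n_vectors * dim * bytes_per_component
    link_bytes = n_vectors * 2 * M * 4
    return vector_bytes + link_bytes

# 1M 768-dim float32 vectors with M=16: ~3.07 GB of vectors + ~0.13 GB of links
print(hnsw_memory_bytes(1_000_000, 768) / 1e9)  # prints 3.2 (GB)
```

An estimate like this is the kind of one-line domain knowledge the review argues should replace the skill's inline boilerplate: it shows immediately why quantization (which shrinks `bytes_per_component`) dominates the memory budget at scale.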
Validation — 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (524 lines); consider splitting into references/ and linking | Warning |
| Total | | 10 / 11 — Passed |