Optimize vector index performance for latency, recall, and memory. Use when tuning HNSW parameters, selecting quantization strategies, or scaling vector search infrastructure.
80
71%
Does it follow best practices?
Impact
100%
1.56x
Average score across 3 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/llm-application-dev/skills/vector-index-tuning/SKILL.md
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong, well-crafted skill description that concisely covers specific capabilities, includes an explicit 'Use when' clause with domain-appropriate trigger terms, and occupies a clearly distinct niche. It uses proper third-person voice and avoids vague language or unnecessary verbosity.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'optimize vector index performance', 'tuning HNSW parameters', 'selecting quantization strategies', 'scaling vector search infrastructure'. These are concrete, actionable capabilities. | 3 / 3 |
| Completeness | Clearly answers both 'what' (optimize vector index performance for latency, recall, and memory) and 'when' (explicit 'Use when' clause covering tuning HNSW parameters, selecting quantization strategies, or scaling vector search infrastructure). | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords a user would use: 'vector index', 'latency', 'recall', 'memory', 'HNSW parameters', 'quantization', 'vector search'. These cover the domain well and match how practitioners talk about this topic. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche: vector index optimization, with specific triggers such as HNSW, quantization strategies, and vector search infrastructure, is unlikely to conflict with other skills. This is a clearly defined technical domain. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
42%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides highly actionable, executable code for vector index tuning but suffers from extreme verbosity — it reads more like a library codebase than a concise skill guide. The lack of a clear end-to-end workflow and the monolithic structure make it hard to quickly extract the key decisions and steps needed for tuning. The core value (parameter tables, index selection guide, best practices) is buried under hundreds of lines of implementation code that Claude could generate from brief specifications.
Suggestions
Drastically reduce code templates to key patterns and parameter tables — Claude can generate full implementations from concise specifications like 'INT8 scalar quantization: scale to 0-255 range, store min/max/scale for dequantization'.
Add a clear sequential workflow: 1) Select index type based on data size → 2) Set initial HNSW params → 3) Benchmark with real queries → 4) Validate recall meets target → 5) Apply quantization if memory constrained → 6) Re-benchmark and verify recall.
Split into multiple files: keep SKILL.md as a concise overview with the decision tables and workflow, then reference separate files for code templates (e.g., templates/hnsw_benchmark.py, templates/quantization.py).
Add explicit validation checkpoints in the tuning workflow, e.g., 'If recall@10 < target after parameter change, revert and try increasing ef_search before reducing M.'
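The sequential workflow and validation checkpoint suggested above could be expressed as a small tuning loop. The following is a minimal sketch, not the skill's own code: `TuningResult`, `recall_at_k`, `tune_ef_search`, and the `search(ef)` callback are hypothetical names introduced for illustration, and the callback is assumed to run the benchmark queries against the index at a given `ef_search` and return the retrieved ids per query.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class TuningResult:
    ef_search: int      # candidate value that was selected
    recall: float       # recall@k measured at that value
    met_target: bool    # False means no candidate reached the target


def recall_at_k(retrieved: Sequence[Sequence[int]],
                ground_truth: Sequence[Sequence[int]],
                k: int = 10) -> float:
    """Fraction of the true top-k neighbours found in the retrieved top-k."""
    if not ground_truth or len(retrieved) != len(ground_truth):
        raise ValueError("need one retrieved list per ground-truth list")
    hits = sum(len(set(got[:k]) & set(want[:k]))
               for got, want in zip(retrieved, ground_truth))
    return hits / (k * len(ground_truth))


def tune_ef_search(search: Callable[[int], List[List[int]]],
                   ground_truth: Sequence[Sequence[int]],
                   target_recall: float,
                   ef_candidates: Sequence[int] = (16, 32, 64, 128, 256),
                   k: int = 10) -> TuningResult:
    """Try ef_search values in ascending order and stop at the first that
    meets the recall target (the validation checkpoint)."""
    if not ef_candidates:
        raise ValueError("ef_candidates must not be empty")
    best: Optional[TuningResult] = None
    for ef in sorted(ef_candidates):
        # `search` is a hypothetical benchmark callback, see the lead-in.
        recall = recall_at_k(search(ef), ground_truth, k)
        if recall >= target_recall:
            return TuningResult(ef, recall, True)
        if best is None or recall > best.recall:
            best = TuningResult(ef, recall, False)
    # No candidate passed: the caller should revisit M or ef_construction.
    return best
```

Raising `ef_search` first mirrors the suggested checkpoint: it is a query-time setting, so it can be changed without rebuilding the index, whereas changing `M` requires a rebuild.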
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~400+ lines. The code templates are extensive but much of this is boilerplate that Claude can generate on its own. The quantization implementations, monitoring classes, and Qdrant configuration could be dramatically condensed to key parameter tables and brief patterns. Explaining scalar quantization math and product quantization from scratch is unnecessary for Claude. | 1 / 3 |
| Actionability | The code is fully executable with concrete implementations using real libraries (hnswlib, qdrant_client, sklearn). Functions have proper type hints, realistic parameters, and are copy-paste ready. The benchmark functions, quantization strategies, and Qdrant configurations are all directly usable. | 3 / 3 |
| Workflow Clarity | The templates are presented as independent utilities rather than a coherent workflow. There's no clear sequence like 'first benchmark → then select parameters → then deploy → then monitor.' The index type selection table provides good decision guidance, but there are no validation checkpoints or feedback loops for the tuning process (e.g., 'if recall drops below target, adjust ef_search and re-benchmark'). | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of code with no references to external files. The four large code templates (~300 lines of code) are all inline when they could be split into separate reference files. The skill would benefit enormously from a concise overview with links to detailed template files. | 1 / 3 |
| Total | | 7 / 12 Passed |
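To illustrate how little code the one-line specification quoted in the suggestions implies ('INT8 scalar quantization: scale to 0-255 range, store min/max/scale for dequantization'), here is a minimal pure-Python sketch. It is not the skill's template: `QuantizedVectors`, `quantize_uint8`, and `dequantize` are hypothetical names, a single global min/max is assumed for the whole collection rather than per-dimension ranges, and the 0-255 range means the codes are effectively unsigned 8-bit values.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class QuantizedVectors:
    codes: List[List[int]]  # one code in 0..255 per vector component
    min_val: float          # smallest component seen at quantization time
    max_val: float          # largest component seen at quantization time
    scale: float            # width of one quantization step


def quantize_uint8(vectors: Sequence[Sequence[float]]) -> QuantizedVectors:
    """Map every component linearly onto the integer range 0..255."""
    flat = [x for vec in vectors for x in vec]
    if not flat:
        raise ValueError("vectors must contain at least one component")
    lo, hi = min(flat), max(flat)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # constant input: any step width works, avoid dividing by 0
    codes = [
        # Clamp guards against floating-point overshoot at the range edges.
        [min(255, max(0, round((x - lo) / scale))) for x in vec]
        for vec in vectors
    ]
    return QuantizedVectors(codes, lo, hi, scale)


def dequantize(q: QuantizedVectors) -> List[List[float]]:
    """Approximate the original floats; error is at most half a step."""
    return [[code * q.scale + q.min_val for code in row] for row in q.codes]
```

A production version would more likely use NumPy arrays with a `uint8` dtype so that the memory saving is actually realized; Python lists are used here only to keep the sketch dependency-free.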
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (524 lines); consider splitting into references/ and linking | Warning |
| Total | 10 / 11 Passed | |
b09ec7f