Optimize vector index performance for latency, recall, and memory. Use when tuning HNSW parameters, selecting quantization strategies, or scaling vector search infrastructure.
Score: 80
Quality: 71% (Does it follow best practices?)
Impact: 100%, 1.56x average score across 3 eval scenarios
Passed, no known issues
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/llm-application-dev/skills/vector-index-tuning/SKILL.md`

Quality
Discovery
100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that concisely covers specific capabilities, includes domain-appropriate trigger terms, and clearly delineates both what the skill does and when to use it. The description uses proper third-person voice and targets a well-defined niche that minimizes conflict risk with other skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: 'optimize vector index performance', 'tuning HNSW parameters', 'selecting quantization strategies', 'scaling vector search infrastructure'. These are concrete, actionable capabilities. | 3 / 3 |
| Completeness | Clearly answers both 'what' (optimize vector index performance for latency, recall, and memory) and 'when' (an explicit 'Use when' clause covering tuning HNSW parameters, selecting quantization strategies, or scaling vector search infrastructure). | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords that a user working in this domain would use: 'vector index', 'latency', 'recall', 'memory', 'HNSW parameters', 'quantization', 'vector search'. These are the exact terms practitioners would mention. | 3 / 3 |
| Distinctiveness / Conflict Risk | A highly distinctive niche: vector index optimization with specific triggers like HNSW, quantization strategies, and vector search infrastructure is unlikely to conflict with other skills. This is a very specialized domain. | 3 / 3 |
| Total | | 12 / 12 (Passed) |
Implementation
42%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides highly actionable, executable code templates for vector index tuning but is severely bloated—most of the code is boilerplate that Claude can generate from brief instructions. The high-value content (index selection table, HNSW parameter table, quantization comparison) is buried among hundreds of lines of template code. It lacks a clear tuning workflow with validation steps and feedback loops.
Suggestions
- Reduce to ~50-80 lines: keep the decision tables and parameter recommendations inline, and move all code templates to separate files (e.g., templates/hnsw_benchmark.py) with one-line descriptions and links.
- Add an explicit tuning workflow with validation steps, e.g., '1. Benchmark baseline → 2. Adjust M/ef → 3. Measure recall@k → 4. If recall < target, increase ef_search → 5. Apply quantization → 6. Re-benchmark to verify recall held'.
- Remove code that Claude can trivially generate (scalar quantization, binary quantization, basic benchmarking loops) and replace it with brief descriptions of when to use each approach.
- Add a decision flowchart or checklist for choosing between optimization strategies (recall vs. speed vs. memory) rather than presenting all configurations without guidance on selection.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~400+ lines, with massive code templates that Claude could generate on its own. The quantization class, monitoring code, and Qdrant configuration are all things Claude already knows how to write. The reference tables (index selection, HNSW parameters, quantization types) are the only high-value-density content. | 1 / 3 |
| Actionability | The code templates are fully executable, with concrete implementations using real libraries (hnswlib, qdrant_client, sklearn). Functions include proper type hints, parameters, and return values. The benchmarking and monitoring code is copy-paste ready. | 3 / 3 |
| Workflow Clarity | While individual templates are clear, there is no overarching workflow connecting them (e.g., 'first benchmark, then select parameters, then quantize, then monitor'). There are no validation checkpoints or feedback loops for the tuning process, and no guidance on what to do when recall drops or latency exceeds targets. | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of code. The four large templates should live in separate reference files, with SKILL.md containing only the decision tables, parameter recommendations, and links to the template files. The Resources section links externally but doesn't organize internal content across files. | 1 / 3 |
| Total | | 7 / 12 (Passed) |
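As the conciseness row notes, scalar quantization is the kind of code an agent can regenerate from a one-line description. For reference, the whole idea fits in a numpy-only sketch (the helper names are hypothetical, not from the skill under review):

```python
import numpy as np

def scalar_quantize(x: np.ndarray):
    """Map float32 vectors to uint8 via per-dimension min/max scaling."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    scale = np.where(hi > lo, (hi - lo) / 255.0, 1.0)
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q, lo, scale

def dequantize(q: np.ndarray, lo: np.ndarray, scale: np.ndarray):
    """Approximate reconstruction of the original float32 vectors."""
    return q.astype(np.float32) * scale + lo

x = np.random.default_rng(1).random((1_000, 64), dtype=np.float32)
q, lo, scale = scalar_quantize(x)
err = float(np.abs(dequantize(q, lo, scale) - x).max())
# float32 -> uint8 is a 4x memory reduction; error is bounded by half a step.
print(f"{x.nbytes} -> {q.nbytes} bytes, max reconstruction error {err:.4f}")
```

A brief description like this, plus a note on when scalar vs. product vs. binary quantization is appropriate, is what the suggestions above recommend in place of full templates.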
Validation
90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (524 lines); consider splitting content into references/ and linking | Warning |
| Total | | 10 / 11 Passed |