Implement efficient similarity search with vector databases. Use when building semantic search, implementing nearest neighbor queries, or optimizing retrieval performance.
Overall score: 48%

Impact: 100% (1.09x average score across 3 eval scenarios)
Status: Passed (no known issues)

Optimize this skill with Tessl:
npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/llm-application-dev/skills/similarity-search-patterns/SKILL.md

Quality — does it follow best practices?

Discovery
Score: 67%. Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description has a solid structure with an explicit 'Use when' clause and covers the core domain adequately. However, it lacks specific, concrete actions (e.g., index creation, embedding management, query optimization techniques) and misses common trigger terms users would naturally use, like 'embeddings', 'vector store', or specific tool names. The description is functional but could be more distinctive and detailed.
Suggestions
Add more specific concrete actions such as 'create and manage vector indexes, store embeddings, tune ANN parameters, benchmark query latency'
Expand trigger terms to include common user vocabulary like 'embeddings', 'vector store', 'FAISS', 'Pinecone', 'ChromaDB', 'cosine similarity', 'RAG retrieval'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (vector databases, similarity search) and some actions (building semantic search, implementing nearest neighbor queries, optimizing retrieval performance), but doesn't list multiple concrete specific actions like indexing strategies, embedding storage, query tuning, or specific database operations. | 2 / 3 |
| Completeness | Clearly answers both 'what' (implement efficient similarity search with vector databases) and 'when' (explicit 'Use when' clause covering semantic search, nearest neighbor queries, and retrieval performance optimization). | 3 / 3 |
| Trigger Term Quality | Includes some relevant keywords like 'similarity search', 'vector databases', 'semantic search', 'nearest neighbor', and 'retrieval performance', but misses common user terms like 'embeddings', 'vector store', 'FAISS', 'Pinecone', 'ChromaDB', 'ANN', 'cosine similarity', or 'RAG'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The terms 'semantic search' and 'retrieval performance' could overlap with general search/information-retrieval skills or RAG pipeline skills. The vector database focus provides some distinction, but 'optimizing retrieval performance' is broad enough to conflict with other skills. | 2 / 3 |
| Total | | 9 / 12 (Passed) |
Implementation
Score: 29%. Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides high-quality, executable code templates for four vector databases, demonstrating strong actionability. However, it is severely bloated—four near-identical class implementations are inlined rather than split into referenced files, and the content includes conceptual explanations Claude doesn't need. It also lacks any workflow guidance, decision framework for choosing between databases, or validation/verification steps.
Suggestions
Move each database template (Pinecone, Qdrant, pgvector, Weaviate) into separate referenced files and keep only a concise comparison table and quick-start example in the main SKILL.md.
Remove the 'Core Concepts' section on distance metrics and index types—Claude already knows these. Replace with a brief decision matrix (e.g., 'Use cosine for normalized embeddings, L2 for raw').
Add a clear workflow: 1) Choose database based on constraints, 2) Initialize with schema, 3) Upsert with batch verification, 4) Validate search quality with test queries, 5) Tune parameters based on recall metrics.
Add validation checkpoints such as verifying index creation, checking upsert counts match expected, and testing search recall before deploying to production.
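The validation checkpoints suggested above can be sketched in a few lines. This is a minimal, hypothetical illustration using a pure-Python in-memory stand-in for a real vector database client; the `InMemoryIndex` class and its methods are invented for the example and are not the API of Pinecone, Qdrant, pgvector, or Weaviate.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

class InMemoryIndex:
    """Hypothetical stand-in for a vector DB client, for illustration only."""
    def __init__(self):
        self.vectors = {}

    def upsert(self, items):
        for vec_id, vec in items:
            self.vectors[vec_id] = vec
        return len(items)  # count the backend reports back

    def count(self):
        return len(self.vectors)

    def search(self, query, k=3):
        ranked = sorted(self.vectors.items(),
                        key=lambda kv: cosine(query, kv[1]),
                        reverse=True)
        return [vec_id for vec_id, _ in ranked[:k]]

# Checkpoint 1: verify the upsert count matches what was sent
index = InMemoryIndex()
batch = [("a", [1.0, 0.0]), ("b", [0.9, 0.1]), ("c", [0.0, 1.0])]
reported = index.upsert(batch)
assert reported == len(batch), "upsert count mismatch"
assert index.count() == len(batch), "index count mismatch"

# Checkpoint 2: test search quality against known expected neighbors
expected = {"a", "b"}
got = set(index.search([1.0, 0.0], k=2))
recall = len(got & expected) / len(expected)
assert recall >= 0.9, f"recall too low: {recall}"
```

The same shape applies to a real client: compare the backend's reported upsert count against the batch size, then run a handful of queries with known ground-truth neighbors and fail the deployment if recall drops below a threshold.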
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~400+ lines with four full implementation templates that are largely repetitive (each showing upsert/search/hybrid_search patterns). The 'Core Concepts' section explains distance metrics and index types that Claude already knows. The content could be reduced by 60-70% without losing actionable value. | 1 / 3 |
| Actionability | The code templates are fully executable, complete with imports, type hints, and realistic implementations for Pinecone, Qdrant, pgvector, and Weaviate. They are copy-paste ready and cover common operations like upsert, search, filtered search, and hybrid search. | 3 / 3 |
| Workflow Clarity | There is no clear workflow or sequencing for how to implement similarity search end-to-end. The templates are standalone class definitions with no guidance on when to choose one over another, no validation steps (e.g., verifying index creation succeeded, checking search quality), and no error handling or feedback loops for batch upsert operations. | 1 / 3 |
| Progressive Disclosure | This is a monolithic wall of code with four full implementation templates inlined. Each template (Pinecone, Qdrant, pgvector, Weaviate) should be in a separate referenced file. The skill would benefit enormously from a concise overview with links to per-database implementation files. | 1 / 3 |
| Total | | 6 / 12 (Passed) |
Validation
Score: 90%. Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation for skill structure (10 / 11 passed)
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (561 lines); consider splitting into references/ and linking | Warning |
| Total | | 10 / 11 Passed |