Implement efficient similarity search with vector databases. Use when building semantic search, implementing nearest neighbor queries, or optimizing retrieval performance.
Score: 66

Quality: 48% — Does it follow best practices?
Impact: 100% — 1.38x average score across 3 eval scenarios
Passed — no known issues

Optimize this skill with Tessl:

`npx tessl skill review --optimize ./plugins/llm-application-dev/skills/similarity-search-patterns/SKILL.md`

## Quality

### Discovery
67% — Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description has a solid structure, with an explicit 'Use when' clause, and covers the core domain adequately. However, it could be more specific about concrete actions (e.g., configuring indexes, choosing distance metrics, managing embeddings) and include more of the natural trigger terms users reach for when working with vector databases. The description also risks some overlap with general search or RAG-related skills.
**Suggestions**

- Add more specific concrete actions, such as 'configure vector indexes, choose distance metrics, store and query embeddings, benchmark search latency'.
- Include additional natural trigger terms users would say, such as 'embeddings', 'vector store', 'FAISS', 'Pinecone', 'ChromaDB', 'ANN search', or 'cosine similarity'.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (vector databases, similarity search) and some actions (building semantic search, implementing nearest neighbor queries, optimizing retrieval performance), but doesn't list multiple concrete specific actions like indexing strategies, embedding storage, or specific database operations. | 2 / 3 |
| Completeness | Clearly answers both 'what' (implement efficient similarity search with vector databases) and 'when' (explicit 'Use when' clause covering semantic search, nearest neighbor queries, and retrieval optimization). | 3 / 3 |
| Trigger Term Quality | Includes some relevant keywords like 'similarity search', 'vector databases', 'semantic search', 'nearest neighbor', and 'retrieval performance', but misses common user terms like 'embeddings', 'vector store', 'FAISS', 'Pinecone', 'ChromaDB', 'ANN', or 'vector index'. | 2 / 3 |
| Distinctiveness / Conflict Risk | 'Semantic search' and 'retrieval performance' could overlap with general search/information retrieval skills or RAG-focused skills. The vector database focus provides some distinction, but 'optimizing retrieval performance' is broad enough to conflict with other skills. | 2 / 3 |
| **Total** | | **9 / 12 — Passed** |
### Implementation

29% — Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is essentially a code reference catalog for four vector database implementations, dumped into a single massive file. While the code itself is high-quality and executable, the skill fails on conciseness (a massive token footprint with redundant patterns), workflow clarity (no sequencing, decision guidance, or validation), and progressive disclosure (everything is inline, with no supporting file structure). It would benefit greatly from restructuring into a concise overview backed by separate template files.
**Suggestions**

- Split each implementation template into its own file (e.g., pinecone.md, qdrant.md, pgvector.md, weaviate.md) and keep SKILL.md as a concise overview with a decision matrix for choosing between them.
- Add a clear workflow: 1) choose a distance metric based on the embedding type, 2) select a vector DB based on scale and requirements, 3) implement from the template, 4) validate recall with test queries, 5) tune index parameters.
- Remove the Core Concepts section (distance metrics, index types) — Claude already knows these. Replace it with a brief decision table mapping use cases to recommended implementations.
- Add validation/verification steps, such as testing search recall against known-good queries, checking index build completion, and verifying similarity score distributions.
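The recall check suggested above can be sketched in plain Python. This is a hypothetical helper, not part of the skill under review: brute-force cosine similarity stands in for the exact ground truth, and the ANN result ids are assumed to come from whichever vector index is being validated.

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def exact_top_k(query, corpus, k):
    """Brute-force top-k by cosine similarity: the ground-truth baseline."""
    ranked = sorted(range(len(corpus)),
                    key=lambda i: cosine_sim(query, corpus[i]),
                    reverse=True)
    return set(ranked[:k])

def recall_at_k(ann_ids, exact_ids):
    """Fraction of the exact top-k that the ANN index also returned."""
    return len(set(ann_ids) & exact_ids) / len(exact_ids)

# Tiny illustration with made-up 2-d embeddings; in practice the
# ann_ids would come from the vector DB's search() call.
corpus = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
query = [1.0, 0.05]
truth = exact_top_k(query, corpus, k=3)
recall = recall_at_k([0, 2, 3], truth)
```

Averaging `recall_at_k` over a held-out set of known-good queries gives a single number to watch while tuning index parameters (step 5 of the suggested workflow).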
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~400+ lines, with four full implementation templates (Pinecone, Qdrant, pgvector, Weaviate) that are largely boilerplate wrapper classes. The distance metrics table and index types overview explain concepts Claude already knows. Much of this could be condensed or split into separate reference files. | 1 / 3 |
| Actionability | The code templates are fully executable, complete with imports, type hints, and concrete method implementations. Each template is copy-paste ready and covers upsert, search, hybrid search, and filtering patterns with real library APIs. | 3 / 3 |
| Workflow Clarity | There is no clear workflow or sequencing for when/how to use these templates. No validation steps, no guidance on choosing between implementations, no error handling patterns, and no verification checkpoints. The skill reads as a reference catalog rather than a guided process. | 1 / 3 |
| Progressive Disclosure | All content is dumped into a single monolithic file with no references to external files. Four complete implementation templates (~300 lines of code) should be split into separate files, with SKILL.md providing an overview and navigation. No bundle files exist to support this. | 1 / 3 |
| **Total** | | **6 / 12 — Passed** |
### Validation

90% — Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
10 / 11 checks passed.

**Validation for skill structure**
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (554 lines); consider splitting into references/ and linking | Warning |
| **Total** | | **10 / 11 — Passed** |
Commit: `34632bc`