Implement efficient similarity search with vector databases. Use when building semantic search, implementing nearest neighbor queries, or optimizing retrieval performance.
66
48%
Does it follow best practices?
Impact
100%
1.09x
Average score across 3 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/llm-application-dev/skills/similarity-search-patterns/SKILL.md
Quality
Discovery
67%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description has good structural completeness with an explicit 'Use when' clause and covers the core domain adequately. However, it lacks specific concrete actions (e.g., indexing strategies, embedding generation, specific database integrations) and misses several natural trigger terms users might use. The description is functional but could be more distinctive and specific.
Suggestions
Add specific concrete actions like 'create vector indexes, generate embeddings, configure HNSW/IVF parameters, tune recall-precision tradeoffs'
Expand trigger terms to include common user language like 'embeddings', 'vector store', 'FAISS', 'Pinecone', 'cosine similarity', 'RAG retrieval', 'ANN search'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (vector databases, similarity search) and some actions (implement, building, optimizing), but lacks specific concrete actions like 'create embeddings, configure HNSW indexes, tune recall/precision tradeoffs, integrate with FAISS/Pinecone/Weaviate'. | 2 / 3 |
| Completeness | Clearly answers both what ('implement efficient similarity search with vector databases') and when ('Use when building semantic search, implementing nearest neighbor queries, or optimizing retrieval performance') with an explicit 'Use when' clause. | 3 / 3 |
| Trigger Term Quality | Includes relevant terms like 'similarity search', 'vector databases', 'semantic search', 'nearest neighbor queries', and 'retrieval performance', but misses common user terms like 'embeddings', 'vector store', 'ANN', 'FAISS', 'Pinecone', 'cosine similarity', or 'RAG'. | 2 / 3 |
| Distinctiveness Conflict Risk | The 'semantic search' and 'retrieval performance' triggers could overlap with general search optimization or RAG-related skills. The vector database focus provides some distinctiveness, but 'optimizing retrieval performance' is broad enough to conflict with other retrieval-related skills. | 2 / 3 |
| Total | | 9 / 12 Passed |
Implementation
29%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides high-quality, executable code templates for four vector database implementations, demonstrating strong actionability. However, it is severely bloated — dumping ~400 lines of repetitive class implementations into a single file without any workflow guidance, validation steps, or progressive disclosure. It reads more like a reference library than a skill that teaches Claude how to approach similarity search tasks.
Suggestions
Extract each database implementation into its own file (e.g., pinecone.md, qdrant.md, pgvector.md, weaviate.md) and replace inline code with a decision matrix and links to the appropriate template.
Add a clear workflow section: 1) Choose database based on constraints, 2) Set up index with appropriate parameters, 3) Ingest data in batches with validation, 4) Test recall/latency, 5) Tune parameters — with explicit validation checkpoints.
Remove the 'Core Concepts' section on distance metrics and index types, or reduce it to a one-line decision rule (e.g., 'Use cosine for normalized embeddings, HNSW index for most cases').
Add a concrete end-to-end usage example showing how to go from raw documents to working search results, rather than only presenting class definitions.
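The end-to-end example and the recall-testing step suggested above could look roughly like the following. This is a minimal sketch in plain Python, not code from the reviewed skill: `build_vocab`, `embed`, `build_index`, `search`, and `recall_at_k` are hypothetical helpers, and the bag-of-words embedding is a toy stand-in for a real embedding model. Because the vectors are L2-normalized, the dot product used for ranking equals cosine similarity.

```python
import math

def build_vocab(docs: dict[str, str]) -> list[str]:
    """Collect the sorted set of lowercase words across all documents."""
    return sorted({w for text in docs.values() for w in text.lower().split()})

def embed(text: str, vocab: list[str]) -> list[float]:
    """Toy embedding (assumption): word counts over vocab, L2-normalized."""
    words = text.lower().split()
    vec = [float(words.count(term)) for term in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # avoid dividing by zero
    return [x / norm for x in vec]

def build_index(docs: dict[str, str], vocab: list[str]) -> dict[str, list[float]]:
    """Map each document id to its normalized vector."""
    return {doc_id: embed(text, vocab) for doc_id, text in docs.items()}

def search(index, vocab, query: str, k: int = 3) -> list[tuple[str, float]]:
    """Exact (brute-force) top-k search by dot product, i.e. cosine similarity."""
    q = embed(query, vocab)
    scored = [
        (doc_id, sum(a * b for a, b in zip(q, vec)))
        for doc_id, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

def recall_at_k(approx_ids: list[str], exact_ids: list[str]) -> float:
    """Fraction of the exact top-k ids that an approximate search also returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Raw documents -> index -> query results
docs = {
    "d1": "vector database index",
    "d2": "semantic search with embeddings",
    "d3": "cooking pasta at home",
}
vocab = build_vocab(docs)
index = build_index(docs, vocab)
results = search(index, vocab, "semantic search", k=2)
print(results[0][0])  # best match for the query
```

In a real setup the brute-force `search` would serve as ground truth, and `recall_at_k` would compare it against the ids returned by an approximate index (for example HNSW) while parameters are tuned.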
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~400+ lines with four full implementation templates (Pinecone, Qdrant, pgvector, Weaviate) that are largely repetitive in structure. The 'Core Concepts' section explains distance metrics and index types that Claude already knows. Much of this content should be in separate reference files or omitted entirely. | 1 / 3 |
| Actionability | The code templates are fully executable, complete with imports, type hints, and concrete method implementations. Each template is copy-paste ready with real library APIs (Pinecone, Qdrant, asyncpg, Weaviate), including batch operations, filtering, and hybrid search patterns. | 3 / 3 |
| Workflow Clarity | There is no workflow or sequencing guidance. The skill presents four independent class templates but never explains how to use them in a process: no steps for setting up, indexing data, validating results, or iterating on search quality. No validation checkpoints exist despite batch upsert operations being present. | 1 / 3 |
| Progressive Disclosure | All four full implementation templates are inlined in a single monolithic file with no separation into referenced files. This is a textbook case of content that should be split (e.g., one file per database implementation), with the SKILL.md providing an overview and links. The external 'Resources' links point to vendor docs but don't organize the skill's own content. | 1 / 3 |
| Total | | 6 / 12 Passed |
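The missing validation checkpoints around batch upserts, noted under Workflow Clarity, might be sketched as follows. This is an illustrative assumption rather than the skill's code: `InMemoryStore` is a hypothetical stand-in for a vector database client, and `chunks` and `upsert_with_checks` are hypothetical helper names. The idea is to validate vector dimensions before any write and to confirm afterwards that every id actually landed.

```python
from typing import Iterator

Record = tuple[str, list[float]]  # (id, vector)

def chunks(items: list[Record], size: int) -> Iterator[list[Record]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

class InMemoryStore:
    """Hypothetical stand-in for a vector database client."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self.rows: dict[str, list[float]] = {}

    def upsert(self, batch: list[Record]) -> None:
        for vec_id, vec in batch:
            self.rows[vec_id] = vec

    def count(self) -> int:
        return len(self.rows)

def upsert_with_checks(store: InMemoryStore, records: list[Record],
                       batch_size: int = 100) -> int:
    """Upsert in batches with a pre-write and a post-write checkpoint."""
    # Checkpoint 1: reject malformed vectors before writing anything.
    for vec_id, vec in records:
        if len(vec) != store.dim:
            raise ValueError(
                f"{vec_id}: expected dim {store.dim}, got {len(vec)}"
            )

    for batch in chunks(records, batch_size):
        store.upsert(batch)

    # Checkpoint 2: confirm every id is present after ingestion.
    missing = {vec_id for vec_id, _ in records} - set(store.rows)
    if missing:
        raise RuntimeError(f"{len(missing)} records missing after upsert")
    return store.count()

store = InMemoryStore(dim=2)
records = [(f"id{i}", [float(i), 1.0]) for i in range(5)]
total = upsert_with_checks(store, records, batch_size=2)
print(total)  # number of stored vectors
```

With a real client, the second checkpoint would typically query the database's own count or fetch-by-id call instead of reading `store.rows` directly.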
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (561 lines); consider splitting into references/ and linking | Warning |
| Total | 10 / 11 Passed | |
bbc5ade