Implement efficient similarity search with vector databases. Use when building semantic search, implementing nearest neighbor queries, or optimizing retrieval performance.
Overall score: 48%

Impact: 100% (1.09x average score across 3 eval scenarios)
Status: Passed (no known issues)

Optimize this skill with Tessl:
npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/llm-application-dev/skills/similarity-search-patterns/SKILL.md

Quality — does it follow best practices?

Discovery
Score: 67%. Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description has a solid structure with an explicit 'Use when' clause and covers the core domain adequately. However, it lacks specific, concrete actions (e.g., index creation, embedding management, query optimization techniques) and misses common trigger terms users would naturally use, like 'embeddings', 'vector store', or specific tool names. The description is functional but could be more distinctive and detailed.
Suggestions
Add more specific concrete actions such as 'create and manage vector indexes, store embeddings, tune ANN parameters, benchmark query latency'
Expand trigger terms to include common user vocabulary like 'embeddings', 'vector store', 'FAISS', 'Pinecone', 'ChromaDB', 'cosine similarity', 'RAG retrieval'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (vector databases, similarity search) and some actions (building semantic search, implementing nearest neighbor queries, optimizing retrieval performance), but doesn't list multiple concrete specific actions like indexing strategies, embedding storage, query tuning, or specific database operations. | 2 / 3 |
| Completeness | Clearly answers both 'what' (implement efficient similarity search with vector databases) and 'when' (explicit 'Use when' clause covering semantic search, nearest neighbor queries, and retrieval performance optimization). | 3 / 3 |
| Trigger Term Quality | Includes some relevant keywords like 'similarity search', 'vector databases', 'semantic search', 'nearest neighbor', and 'retrieval performance', but misses common user terms like 'embeddings', 'vector store', 'FAISS', 'Pinecone', 'ChromaDB', 'ANN', 'cosine similarity', or 'RAG'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The terms 'semantic search' and 'retrieval performance' could overlap with general search/information-retrieval skills or RAG pipeline skills. The vector database focus provides some distinction, but 'optimizing retrieval performance' is broad enough to conflict with other skills. | 2 / 3 |
| Total | | 9 / 12 (Passed) |
Implementation
Score: 29%. Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides high-quality, executable code templates for four vector databases, demonstrating strong actionability. However, it is severely bloated—four near-identical class implementations are inlined rather than split into referenced files, and the content includes conceptual explanations Claude doesn't need. It also lacks any workflow guidance, decision framework for choosing between databases, or validation/verification steps.
Suggestions
Move each database template (Pinecone, Qdrant, pgvector, Weaviate) into separate referenced files and keep only a concise comparison table and quick-start example in the main SKILL.md.
Remove the 'Core Concepts' section on distance metrics and index types—Claude already knows these. Replace with a brief decision matrix (e.g., 'Use cosine for normalized embeddings, L2 for raw').
Add a clear workflow: 1) Choose database based on constraints, 2) Initialize with schema, 3) Upsert with batch verification, 4) Validate search quality with test queries, 5) Tune parameters based on recall metrics.
Add validation checkpoints such as verifying index creation, checking upsert counts match expected, and testing search recall before deploying to production.
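The validation checkpoints suggested above can be sketched in a few lines. This is a minimal, hypothetical illustration using a pure-Python in-memory stand-in for a real vector database client; the `InMemoryIndex` class and its methods are invented for the example and are not the API of Pinecone, Qdrant, pgvector, or Weaviate.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

class InMemoryIndex:
    """Hypothetical stand-in for a vector DB client, for illustration only."""
    def __init__(self):
        self.vectors = {}

    def upsert(self, items):
        for vec_id, vec in items:
            self.vectors[vec_id] = vec
        return len(items)  # count the backend reports back

    def count(self):
        return len(self.vectors)

    def search(self, query, k=3):
        ranked = sorted(self.vectors.items(),
                        key=lambda kv: cosine(query, kv[1]),
                        reverse=True)
        return [vec_id for vec_id, _ in ranked[:k]]

# Checkpoint 1: verify the upsert count matches what was sent
index = InMemoryIndex()
batch = [("a", [1.0, 0.0]), ("b", [0.9, 0.1]), ("c", [0.0, 1.0])]
reported = index.upsert(batch)
assert reported == len(batch), "upsert count mismatch"
assert index.count() == len(batch), "index count mismatch"

# Checkpoint 2: test search quality against known expected neighbors
expected = {"a", "b"}
got = set(index.search([1.0, 0.0], k=2))
recall = len(got & expected) / len(expected)
assert recall >= 0.9, f"recall too low: {recall}"
```

The same shape applies to a real client: compare the backend's reported upsert count against the batch size, then run a handful of queries with known ground-truth neighbors and fail the deployment if recall drops below a threshold.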
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~400+ lines with four full implementation templates that are largely repetitive (each showing upsert/search/hybrid_search patterns). The 'Core Concepts' section explains distance metrics and index types that Claude already knows. The content could be reduced by 60-70% without losing actionable value. | 1 / 3 |
| Actionability | The code templates are fully executable, complete with imports, type hints, and realistic implementations for Pinecone, Qdrant, pgvector, and Weaviate. They are copy-paste ready and cover common operations like upsert, search, filtered search, and hybrid search. | 3 / 3 |
| Workflow Clarity | There is no clear workflow or sequencing for how to implement similarity search end-to-end. The templates are standalone class definitions with no guidance on when to choose one over another, no validation steps (e.g., verifying index creation succeeded, checking search quality), and no error handling or feedback loops for batch upsert operations. | 1 / 3 |
| Progressive Disclosure | This is a monolithic wall of code with four full implementation templates inlined. Each template (Pinecone, Qdrant, pgvector, Weaviate) should be in a separate referenced file. The skill would benefit enormously from a concise overview with links to per-database implementation files. | 1 / 3 |
| Total | | 6 / 12 (Passed) |
Validation
Score: 90%. Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation for skill structure (10 / 11 passed)
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (561 lines); consider splitting into references/ and linking | Warning |
| Total | | 10 / 11 Passed |