Build Retrieval-Augmented Generation (RAG) systems for LLM applications with vector databases and semantic search. Use when implementing knowledge-grounded AI, building document Q&A systems, or integrating LLMs with external knowledge bases.
Overall score: 69
Quality: 62% (does it follow best practices?)
Impact: 70% (2.12x average score across 3 eval scenarios)
Passed
No known issues
Quality
Discovery
82%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid description: it clearly identifies its domain (RAG systems) and provides an explicit 'Use when' clause with relevant trigger scenarios. Its main weakness is the lack of concrete actions beyond 'build'; listing discrete operations like chunking, embedding, indexing, and retrieval would strengthen specificity. The trigger terms are strong and natural for the target audience.
Suggestions
Add more specific concrete actions such as 'chunk documents, generate embeddings, index in vector stores, implement retrieval pipelines, rerank results' to improve specificity.
Narrow the scope slightly to reduce overlap risk; clarify what distinguishes this skill from general LLM development or prompt engineering skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (RAG systems) and mentions some components (vector databases, semantic search), but doesn't list multiple concrete actions beyond 'build'. Lacks specifics like 'chunk documents, generate embeddings, query vector stores, rerank results'. | 2 / 3 |
| Completeness | Clearly answers both 'what' (build RAG systems with vector databases and semantic search) and 'when' (explicit 'Use when' clause covering knowledge-grounded AI, document Q&A systems, and integrating LLMs with external knowledge bases). | 3 / 3 |
| Trigger Term Quality | Good coverage of natural terms users would say: 'RAG', 'Retrieval-Augmented Generation', 'vector databases', 'semantic search', 'document Q&A', 'knowledge bases', 'LLM applications'. These are terms developers naturally use when seeking this capability. | 3 / 3 |
| Distinctiveness / Conflict Risk | While RAG is a specific niche, terms like 'LLM applications' and 'knowledge bases' are broad enough to potentially overlap with general LLM skills, prompt engineering skills, or broader AI development skills. The RAG-specific terms help, but the scope is somewhat wide. | 2 / 3 |
| Total | | 10 / 12 (Passed) |
Implementation
42%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is a comprehensive RAG reference guide with excellent, executable code examples covering many patterns and configurations. However, it is far too verbose for a SKILL.md; it reads like a tutorial or documentation page rather than a concise skill file. It would benefit enormously from being split into a lean overview that references detailed sub-files, and from adding validation and verification steps to its workflows.
Suggestions
Reduce the SKILL.md to a concise overview (~100 lines) with the Quick Start example and brief pattern descriptions, moving vector store configs, chunking strategies, advanced patterns, and evaluation code into separate referenced files (e.g., VECTOR_STORES.md, CHUNKING.md, ADVANCED_PATTERNS.md, EVALUATION.md).
Remove catalog-style listings that Claude already knows (e.g., the embedding models table, vector DB options list, retrieval strategy descriptions) or move them to a reference file.
Add explicit validation checkpoints: verify retrieval returns results before generation, check embedding dimensions match index configuration, validate chunk sizes produce expected output counts.
Add error handling in code examples for common failure modes (empty retrieval results, API connection failures, dimension mismatches) to improve workflow robustness.
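The validation-checkpoint and error-handling suggestions above could be sketched roughly as follows. This is a framework-agnostic illustration, not code from the reviewed skill; the names `retrieve_with_checks`, `check_embedding_dimension`, and `RetrievalError` are hypothetical, and the retriever is any callable returning a list of documents.

```python
from typing import Callable, List, Sequence


class RetrievalError(RuntimeError):
    """Raised when the retrieval step cannot produce usable context."""


def check_embedding_dimension(vector: Sequence[float], index_dim: int) -> None:
    """Fail fast if the embedding model and the index configuration disagree."""
    if len(vector) != index_dim:
        raise RetrievalError(
            f"embedding dimension {len(vector)} does not match index dimension {index_dim}"
        )


def retrieve_with_checks(
    retrieve: Callable[[str], List[str]],
    query: str,
    min_results: int = 1,
    max_attempts: int = 2,
) -> List[str]:
    """Run retrieval and verify it returned context before generation proceeds."""
    last_error = None
    for _ in range(max_attempts):
        try:
            docs = retrieve(query)
        except ConnectionError as exc:  # e.g. the vector-store API is unreachable
            last_error = exc
            continue
        if len(docs) >= min_results:
            return docs  # checkpoint passed: safe to hand context to the LLM
        # empty or too-sparse retrieval: record it and retry instead of generating
        last_error = RetrievalError(f"only {len(docs)} result(s) for {query!r}")
    raise RetrievalError(f"retrieval failed after {max_attempts} attempt(s)") from last_error
```

Wrapping the pipeline's retrieval call this way surfaces empty results and connection failures before generation, rather than letting the LLM answer from no context.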
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at roughly 400+ lines. It explains concepts Claude already knows (what vector databases are, what embeddings do, what RAG is), lists every possible option for each component (6 vector DBs, 6 embedding models, 4 retrieval strategies, 4 reranking methods), and includes extensive configuration examples for multiple vector stores that could live in separate reference files. Much of this is catalog-style information that doesn't earn its token cost. | 1 / 3 |
| Actionability | The code examples are concrete, executable, and copy-paste ready. Every pattern includes complete, runnable Python code with proper imports, initialization, and usage. The Quick Start section provides a full working LangGraph RAG pipeline. | 3 / 3 |
| Workflow Clarity | The Quick Start provides a clear retrieve-then-generate pipeline, and the advanced patterns show clear sequences. However, there are no validation checkpoints: no steps to verify embeddings were created correctly, no checks that retrieval returns relevant results before generation, and no error handling or feedback loops for common failure modes such as empty retrieval results. | 2 / 3 |
| Progressive Disclosure | A monolithic wall of content with no references to separate files. The vector store configurations, chunking strategies, advanced patterns, and evaluation metrics could each be their own reference file. Everything is inlined into a single massive document with no progressive-disclosure structure, just a flat list of sections. | 1 / 3 |
| Total | | 7 / 12 (Passed) |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (571 lines); consider splitting content into references/ and linking | Warning |
| Total | | 10 / 11 (Passed) |