Build Retrieval-Augmented Generation (RAG) systems for LLM applications with vector databases and semantic search. Use when implementing knowledge-grounded AI, building document Q&A systems, or integrating LLMs with external knowledge bases.
69
62%
Does it follow best practices?
Impact
70%
2.12x
Average score across 3 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./tests/ext_conformance/artifacts/agents-wshobson/llm-application-dev/skills/rag-implementation/SKILL.md
Quality
Discovery
82%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid description that clearly communicates the domain (RAG systems) and includes an explicit 'Use when' clause with relevant trigger scenarios. Its main weakness is that the 'what' portion is somewhat high-level—it says 'build RAG systems' without enumerating specific concrete actions like chunking, embedding, indexing, or retrieval pipeline configuration. The trigger terms are strong and natural.
Suggestions
Add more specific concrete actions to the 'what' portion, e.g., 'chunk documents, generate embeddings, configure retrieval pipelines, implement reranking, and manage vector store indexing'.
Narrow the scope slightly to reduce overlap risk—consider specifying frameworks or file types supported, or distinguishing from general LLM application building skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (RAG systems) and mentions some components (vector databases, semantic search), but doesn't list multiple concrete actions beyond 'build'. Missing specifics like chunking strategies, embedding generation, retrieval pipeline configuration, or reranking. | 2 / 3 |
| Completeness | Clearly answers both what ('Build RAG systems with vector databases and semantic search') and when ('Use when implementing knowledge-grounded AI, building document Q&A systems, or integrating LLMs with external knowledge bases'). Has an explicit 'Use when' clause with multiple trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Good coverage of natural terms users would say: 'RAG', 'Retrieval-Augmented Generation', 'vector databases', 'semantic search', 'document Q&A', 'knowledge bases', 'LLM applications', 'knowledge-grounded AI'. These are terms users naturally use when seeking this capability. | 3 / 3 |
| Distinctiveness / Conflict Risk | While RAG is a specific niche, terms like 'LLM applications' and 'knowledge bases' are broad enough to potentially overlap with general LLM development skills, prompt engineering skills, or broader AI application building skills. The RAG-specific terms help but the scope is somewhat broad. | 2 / 3 |
| Total | 10 / 12 Passed | |
Implementation
42%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is a comprehensive RAG reference document with excellent, executable code examples, but it severely violates conciseness and progressive disclosure principles. It reads like a full tutorial rather than a lean skill file, explaining concepts Claude already knows and inlining hundreds of lines of configuration options that should be in separate reference files. The workflow lacks validation checkpoints for a multi-step process that can easily produce poor results.
Suggestions
Reduce the main SKILL.md to ~100 lines covering the core RAG pipeline (Quick Start + Best Practices), and move advanced patterns, vector store configs, chunking strategies, and evaluation code into separate bundle files with clear references.
Remove explanatory content Claude already knows: the 'When to Use This Skill' list, 'Purpose' descriptions for each component, and the options-listing sections. Instead, provide a single recommended stack with alternatives noted briefly.
Add explicit validation checkpoints to the workflow: e.g., 'Test retrieval quality before building the full pipeline' with a concrete snippet to verify retrieved docs are relevant.
Remove or condense the embedding model comparison table and vector DB options list—Claude knows these tools. Focus on the specific integration patterns and gotchas that are non-obvious.
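The validation checkpoint suggested above could look roughly like the following. This is only a sketch under stated assumptions: the corpus, the labeled queries, the word-overlap `toy_retrieve` function, and the 0.8 threshold are all hypothetical stand-ins, not anything taken from the reviewed skill. In a real pipeline, `retrieve` would wrap whatever vector store query the skill recommends.

```python
from typing import Callable, Dict, List

# Toy corpus standing in for an indexed document store (hypothetical data).
CORPUS: Dict[str, str] = {
    "doc-chunking": "split documents into overlapping chunks before embedding",
    "doc-rerank": "rerank retrieved passages with a cross encoder",
    "doc-eval": "measure faithfulness and relevance of generated answers",
}


def toy_retrieve(query: str, k: int) -> List[str]:
    """Rank documents by word overlap with the query (placeholder for a vector search)."""
    query_words = set(query.lower().split())
    ranked = sorted(
        CORPUS,
        key=lambda doc_id: len(query_words & set(CORPUS[doc_id].split())),
        reverse=True,
    )
    return ranked[:k]


def hit_rate_at_k(
    retrieve: Callable[[str, int], List[str]],
    labeled_queries: Dict[str, str],
    k: int,
) -> float:
    """Fraction of queries whose expected document id appears in the top-k results."""
    hits = sum(
        1
        for query, expected_id in labeled_queries.items()
        if expected_id in retrieve(query, k)
    )
    return hits / len(labeled_queries)


# A handful of hand-labeled (query -> expected document id) pairs, also hypothetical.
LABELED: Dict[str, str] = {
    "how should I split documents into chunks": "doc-chunking",
    "rerank passages with a cross encoder": "doc-rerank",
    "measure relevance of answers": "doc-eval",
}

score = hit_rate_at_k(toy_retrieve, LABELED, k=1)
# Checkpoint: stop here and revisit chunking or embeddings if retrieval is weak.
assert score >= 0.8, f"retrieval hit rate too low: {score:.2f}"
```

Running a check like this before wiring up generation keeps retrieval problems from being mistaken for prompt or model problems later on.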
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~400+ lines. Explains concepts Claude already knows (what vector databases are, what embeddings do, what RAG is). Lists every possible option (6 vector DBs, 6 embedding models, 4 retrieval strategies, 4 reranking methods) without clear guidance on when to use each. The 'When to Use This Skill' section is unnecessary padding. Much of this reads like a tutorial/reference doc rather than a lean skill. | 1 / 3 |
| Actionability | The code examples are concrete, executable, and copy-paste ready. Every pattern includes complete, runnable Python code with proper imports, initialization, and usage. The examples cover the full pipeline from indexing to retrieval to generation. | 3 / 3 |
| Workflow Clarity | The Quick Start shows a clear retrieve→generate pipeline via LangGraph, but there are no validation checkpoints. No guidance on verifying retrieval quality before proceeding, no error handling in the workflows, and no feedback loops for when retrieval fails or returns poor results. The evaluation section exists but is disconnected from the build workflow. | 2 / 3 |
| Progressive Disclosure | Monolithic wall of content with no bundle files to offload detail. The vector store configurations, all 5 advanced patterns, all 4 chunking strategies, and evaluation code are all inline. This should be split into separate reference files (e.g., CHUNKING.md, VECTOR_STORES.md, PATTERNS.md) with the main skill providing a concise overview and pointers. | 1 / 3 |
| Total | 7 / 12 Passed | |
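The Workflow Clarity row notes missing error handling and feedback loops for failed or poor retrieval. One possible shape for such a loop is sketched below. All names here (`retrieve_with_checks`, `toy_search`, `toy_rewrite`, the `min_score` cutoff, and the sample documents) are hypothetical and chosen for illustration. The skill's own LangGraph workflow is not reproduced, and a real `search` would call the vector store while a real `rewrite` might call an LLM.

```python
from typing import Callable, List, Tuple

Scored = Tuple[str, float]  # (document text, relevance score in [0, 1])


def retrieve_with_checks(
    search: Callable[[str], List[Scored]],
    rewrite: Callable[[str], str],
    query: str,
    min_score: float = 0.5,
    max_attempts: int = 2,
) -> List[str]:
    """Return documents scoring at least min_score, retrying with a rewritten query."""
    current = query
    for _ in range(max_attempts):
        try:
            results = search(current)
        except Exception:
            results = []  # treat a failed search like an empty result
        relevant = [doc for doc, score in results if score >= min_score]
        if relevant:
            return relevant
        current = rewrite(current)  # feedback step: reformulate and try again
    return []  # caller should abstain rather than generate from weak context


# Toy stand-ins so the sketch runs without any external service.
DOCS = ["vector database stores embeddings", "chunk size affects recall"]
SYNONYMS = {"db": "database"}


def toy_search(query: str) -> List[Scored]:
    """Score each document by the share of query words it contains."""
    words = set(query.lower().split())
    return [(doc, len(words & set(doc.split())) / max(len(words), 1)) for doc in DOCS]


def toy_rewrite(query: str) -> str:
    """Expand known abbreviations (placeholder for an LLM-based query rewrite)."""
    return " ".join(SYNONYMS.get(word, word) for word in query.lower().split())


context = retrieve_with_checks(toy_search, toy_rewrite, "vector db", min_score=0.6)
answer = "generate from context" if context else "I could not find relevant documents."
```

Returning an empty list instead of low-scoring documents lets the generation step decline to answer, which is usually easier to debug than an answer grounded in irrelevant context.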
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (571 lines); consider splitting into references/ and linking | Warning |
| Total | 10 / 11 Passed | |
bbc5ade