Select and optimize embedding models for semantic search and RAG applications. Use when choosing embedding models, implementing chunking strategies, or optimizing embedding quality for specific domains.
- Overall score: 75
- Best practices: 66% (Does it follow best practices?)
- Impact: 91% (1.65x average score across 3 eval scenarios)
- Status: Passed, no known issues
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./plugins/llm-application-dev/skills/embedding-strategies/SKILL.md`

Quality
Discovery: 89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid description that clearly communicates both what the skill does and when to use it, with good trigger terms specific to the embedding/RAG domain. The main weakness is that the capability description could be more specific about the concrete actions performed (e.g., comparing models, configuring chunk sizes, evaluating retrieval quality). Overall it performs well across dimensions.
Suggestions
- Add more specific concrete actions such as 'compare embedding model benchmarks, configure chunk sizes and overlap, evaluate retrieval quality metrics' to improve specificity.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (embedding models, semantic search, RAG) and some actions (select, optimize, implement chunking strategies), but doesn't list multiple concrete specific actions like benchmarking, fine-tuning, dimension reduction, or specific model comparisons. | 2 / 3 |
| Completeness | Clearly answers both what ('Select and optimize embedding models for semantic search and RAG applications') and when ('Use when choosing embedding models, implementing chunking strategies, or optimizing embedding quality for specific domains') with explicit trigger guidance. | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'embedding models', 'semantic search', 'RAG', 'chunking strategies', 'embedding quality', 'domains'. These cover the main terms a user working in this space would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | The focus on embedding models, chunking strategies, and RAG applications creates a clear niche that is unlikely to conflict with general ML skills, search skills, or other NLP-related skills. The triggers are specific to the embedding/retrieval domain. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation: 42%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides highly actionable, executable code templates covering a broad range of embedding use cases, which is its primary strength. However, it is excessively verbose for a SKILL.md file—most of the code is standard library usage that Claude can generate on demand. The lack of progressive disclosure (everything in one file) and missing validation checkpoints in workflows significantly reduce its effectiveness as a skill document.
Suggestions
- Split the six code templates into separate reference files (e.g., VOYAGE.md, OPENAI.md, CHUNKING.md, EVALUATION.md) and keep only the model comparison table, pipeline diagram, and brief quick-start example in SKILL.md.
- Reduce code templates to minimal differentiating snippets, e.g., just show the Voyage AI query prefix pattern and BGE prefix requirement rather than full class implementations Claude can write itself.
- Add explicit validation steps to the workflow: e.g., 'After generating embeddings, verify dimensionality matches expected model output; spot-check a few query-document pairs for reasonable similarity scores before bulk indexing.'
- Remove boilerplate code like the full recursive_character_splitter implementation (available in LangChain) and the complete evaluation metrics suite, replacing with brief references to existing libraries.
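To make the "minimal differentiating snippets" suggestion concrete: a sketch of what such a pared-down template might look like, showing only the per-model query conventions rather than a full client class. The exact prefix string and the `voyage-3` model name mentioned in the comment are assumptions based on common usage of these models, not taken from the skill under review.

```python
# Hypothetical minimal snippet: the only model-specific detail a skill needs to
# convey is how each model distinguishes queries from documents.

# BGE retrieval models expect queries (not documents) to carry an instruction prefix:
BGE_QUERY_PREFIX = "Represent this sentence for searching relevant passages: "

def bge_query(text: str) -> str:
    """Apply the BGE query-side prefix; documents are embedded without it."""
    return BGE_QUERY_PREFIX + text

# Voyage AI instead signals the side via an API argument rather than a prefix,
# e.g. client.embed(texts, model="voyage-3", input_type="query") vs "document".
```

Two short snippets like this carry the differentiating knowledge; the surrounding embedding-client boilerplate is something the agent can generate on demand.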
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~400+ lines, with extensive code templates that cover multiple embedding providers, chunking strategies, evaluation metrics, and code-specific pipelines. Much of this is library boilerplate Claude already knows how to write. The model comparison table and do's/don'ts lists add some value but the overall token cost is very high relative to the unique knowledge conveyed. | 1 / 3 |
| Actionability | The code templates are fully executable with concrete implementations for Voyage AI, OpenAI, sentence-transformers, multiple chunking strategies, and evaluation metrics. They include proper imports, type hints, and are copy-paste ready. | 3 / 3 |
| Workflow Clarity | The embedding pipeline diagram shows a high-level flow, and Template 5 provides a multi-step document processing pipeline. However, there are no explicit validation checkpoints: no steps to verify embedding quality after generation, no error handling guidance for API failures, and no feedback loops for when chunking produces poor results. | 2 / 3 |
| Progressive Disclosure | All content is inlined in a single monolithic file with no references to external files. The six templates, evaluation code, and best practices could easily be split into separate reference files (e.g., CHUNKING.md, EVALUATION.md, MODELS.md) with a concise overview in the main skill. The current structure is a wall of code. | 1 / 3 |
| Total | | 7 / 12 Passed |
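The validation checkpoint missing under Workflow Clarity could be as small as a pre-indexing guard. A minimal sketch, assuming NumPy arrays of embeddings; the 1024 dimension and function names are illustrative, not taken from the skill:

```python
import numpy as np

EXPECTED_DIM = 1024  # assumed output dimension of the chosen model

def validate_embeddings(vectors: np.ndarray, expected_dim: int = EXPECTED_DIM) -> None:
    """Checkpoint before bulk indexing: correct shape, finite, non-degenerate."""
    assert vectors.ndim == 2 and vectors.shape[1] == expected_dim, (
        f"expected (*, {expected_dim}), got {vectors.shape}"
    )
    assert np.isfinite(vectors).all(), "NaN/Inf in embeddings"
    assert (np.linalg.norm(vectors, axis=1) > 0).all(), "zero vector produced"

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity for spot-checking query-document pairs before indexing."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Spot-check on synthetic data: shapes pass and self-similarity is ~1.0.
rng = np.random.default_rng(0)
docs = rng.normal(size=(3, EXPECTED_DIM))
validate_embeddings(docs)
```

Inserting a step like this between embedding generation and indexing gives the workflow the feedback loop the review calls for, at the cost of a few lines.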
Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (601 lines); consider splitting into references/ and linking | Warning |
| Total | 10 / 11 Passed | |