Select and optimize embedding models for semantic search and RAG applications. Use when choosing embedding models, implementing chunking strategies, or optimizing embedding quality for specific domains.
- Overall score: 75
- Best practices: 66% (Does it follow best practices?)
- Impact: 91% (1.65x average score across 3 eval scenarios)
- Status: Passed, no known issues
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./plugins/llm-application-dev/skills/embedding-strategies/SKILL.md`

Quality
Discovery: 89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid description that clearly communicates both what the skill does and when to use it, with good trigger terms specific to the embedding/RAG domain. The main weakness is that the capability description could be more specific about the concrete actions performed (e.g., comparing models, configuring chunk sizes, evaluating retrieval quality). Overall it performs well across dimensions.
Suggestions
- Add more specific concrete actions such as 'compare embedding model benchmarks, configure chunk sizes and overlap, evaluate retrieval quality metrics' to improve specificity.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (embedding models, semantic search, RAG) and some actions (select, optimize, implement chunking strategies), but doesn't list multiple concrete specific actions like benchmarking, fine-tuning, dimension reduction, or specific model comparisons. | 2 / 3 |
| Completeness | Clearly answers both what ('Select and optimize embedding models for semantic search and RAG applications') and when ('Use when choosing embedding models, implementing chunking strategies, or optimizing embedding quality for specific domains') with explicit trigger guidance. | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'embedding models', 'semantic search', 'RAG', 'chunking strategies', 'embedding quality', 'domains'. These cover the main terms a user working in this space would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | The focus on embedding models, chunking strategies, and RAG applications creates a clear niche that is unlikely to conflict with general ML skills, search skills, or other NLP-related skills. The triggers are specific to the embedding/retrieval domain. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation: 42%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides highly actionable, executable code templates covering a broad range of embedding use cases, which is its primary strength. However, it is excessively verbose for a SKILL.md file—most of the code is standard library usage that Claude can generate on demand. The lack of progressive disclosure (everything in one file) and missing validation checkpoints in workflows significantly reduce its effectiveness as a skill document.
Suggestions
- Split the six code templates into separate reference files (e.g., VOYAGE.md, OPENAI.md, CHUNKING.md, EVALUATION.md) and keep only the model comparison table, pipeline diagram, and brief quick-start example in SKILL.md.
- Reduce code templates to minimal differentiating snippets, e.g., just show the Voyage AI query prefix pattern and BGE prefix requirement rather than full class implementations Claude can write itself.
- Add explicit validation steps to the workflow: e.g., 'After generating embeddings, verify dimensionality matches expected model output; spot-check a few query-document pairs for reasonable similarity scores before bulk indexing.'
- Remove boilerplate code like the full recursive_character_splitter implementation (available in LangChain) and the complete evaluation metrics suite, replacing with brief references to existing libraries.
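To make the "minimal differentiating snippets" suggestion concrete: a sketch of what such a pared-down template might look like, showing only the per-model query conventions rather than a full client class. The exact prefix string and the `voyage-3` model name mentioned in the comment are assumptions based on common usage of these models, not taken from the skill under review.

```python
# Hypothetical minimal snippet: the only model-specific detail a skill needs to
# convey is how each model distinguishes queries from documents.

# BGE retrieval models expect queries (not documents) to carry an instruction prefix:
BGE_QUERY_PREFIX = "Represent this sentence for searching relevant passages: "

def bge_query(text: str) -> str:
    """Apply the BGE query-side prefix; documents are embedded without it."""
    return BGE_QUERY_PREFIX + text

# Voyage AI instead signals the side via an API argument rather than a prefix,
# e.g. client.embed(texts, model="voyage-3", input_type="query") vs "document".
```

Two short snippets like this carry the differentiating knowledge; the surrounding embedding-client boilerplate is something the agent can generate on demand.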
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~400+ lines, with extensive code templates that cover multiple embedding providers, chunking strategies, evaluation metrics, and code-specific pipelines. Much of this is library boilerplate Claude already knows how to write. The model comparison table and do's/don'ts lists add some value but the overall token cost is very high relative to the unique knowledge conveyed. | 1 / 3 |
| Actionability | The code templates are fully executable with concrete implementations for Voyage AI, OpenAI, sentence-transformers, multiple chunking strategies, and evaluation metrics. They include proper imports, type hints, and are copy-paste ready. | 3 / 3 |
| Workflow Clarity | The embedding pipeline diagram shows a high-level flow, and Template 5 provides a multi-step document processing pipeline. However, there are no explicit validation checkpoints: no steps to verify embedding quality after generation, no error handling guidance for API failures, and no feedback loops for when chunking produces poor results. | 2 / 3 |
| Progressive Disclosure | All content is inlined in a single monolithic file with no references to external files. The six templates, evaluation code, and best practices could easily be split into separate reference files (e.g., CHUNKING.md, EVALUATION.md, MODELS.md) with a concise overview in the main skill. The current structure is a wall of code. | 1 / 3 |
| Total | | 7 / 12 Passed |
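The validation checkpoint missing under Workflow Clarity could be as small as a pre-indexing guard. A minimal sketch, assuming NumPy arrays of embeddings; the 1024 dimension and function names are illustrative, not taken from the skill:

```python
import numpy as np

EXPECTED_DIM = 1024  # assumed output dimension of the chosen model

def validate_embeddings(vectors: np.ndarray, expected_dim: int = EXPECTED_DIM) -> None:
    """Checkpoint before bulk indexing: correct shape, finite, non-degenerate."""
    assert vectors.ndim == 2 and vectors.shape[1] == expected_dim, (
        f"expected (*, {expected_dim}), got {vectors.shape}"
    )
    assert np.isfinite(vectors).all(), "NaN/Inf in embeddings"
    assert (np.linalg.norm(vectors, axis=1) > 0).all(), "zero vector produced"

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity for spot-checking query-document pairs before indexing."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Spot-check on synthetic data: shapes pass and self-similarity is ~1.0.
rng = np.random.default_rng(0)
docs = rng.normal(size=(3, EXPECTED_DIM))
validate_embeddings(docs)
```

Inserting a step like this between embedding generation and indexing gives the workflow the feedback loop the review calls for, at the cost of a few lines.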
Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (601 lines); consider splitting into references/ and linking | Warning |
| Total | 10 / 11 Passed | |