model-selection

Automatically applies when choosing LLM models and providers. Ensures proper model comparison, provider selection, cost optimization, fallback patterns, and multi-model strategies.

Quality

54%

Does it follow best practices?

Impact

93% (1.25x)

Average score across 6 eval scenarios

Security

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./ai-llm/model-selection/SKILL.md

Quality

Discovery

67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description adequately communicates its domain and includes an explicit 'when' clause, which is a strength. However, the listed capabilities read more like high-level categories than concrete actions, and the trigger terms lack the natural language variations users would actually use when seeking help with model selection. It would benefit from more specific actions and richer keyword coverage.

Suggestions

Add more specific concrete actions, e.g., 'compare token pricing across providers, select optimal models for latency vs cost tradeoffs, configure retry and fallback chains'.

Include natural user trigger terms such as 'which model to use', 'OpenAI vs Anthropic', 'GPT', 'API costs', 'token pricing', 'rate limits', 'model benchmarks'.
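
The keyword gap described in the suggestions above can be checked mechanically. The following is a minimal sketch, not part of Tessl or the skill: `missing_terms` is a hypothetical helper that does a plain case-insensitive substring match of the suggested trigger terms against the skill's current description.

```python
# Hypothetical helper: report which suggested trigger terms a skill
# description does not yet contain. Plain substring matching is an
# assumption; a real discovery scorer may work differently.

DESCRIPTION = (
    "Automatically applies when choosing LLM models and providers. "
    "Ensures proper model comparison, provider selection, cost optimization, "
    "fallback patterns, and multi-model strategies."
)

# Terms taken from the review's second suggestion.
SUGGESTED_TERMS = [
    "which model to use",
    "OpenAI vs Anthropic",
    "GPT",
    "API costs",
    "token pricing",
    "rate limits",
    "model benchmarks",
]


def missing_terms(description: str, terms: list[str]) -> list[str]:
    """Return the terms that do not appear in the description (case-insensitive)."""
    lowered = description.lower()
    return [term for term in terms if term.lower() not in lowered]


if __name__ == "__main__":
    for term in missing_terms(DESCRIPTION, SUGGESTED_TERMS):
        print(f"missing trigger term: {term}")
```

Run against the current description, every suggested term is reported as missing, which is consistent with the Trigger Term Quality score below.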

Dimension	Reasoning	Score
Specificity	Names the domain (LLM models and providers) and lists some actions (model comparison, provider selection, cost optimization, fallback patterns, multi-model strategies), but these are more like category labels than concrete specific actions. For example, 'cost optimization' is vague compared to something like 'compare token pricing across providers' or 'calculate cost per request'.	2 / 3
Completeness	Clearly answers both 'what' (model comparison, provider selection, cost optimization, fallback patterns, multi-model strategies) and 'when' ('Automatically applies when choosing LLM models and providers'). The trigger condition is explicitly stated upfront.	3 / 3
Trigger Term Quality	Includes some relevant keywords like 'LLM models', 'providers', 'cost optimization', 'fallback patterns', and 'multi-model strategies'. However, it misses many natural user terms like 'OpenAI', 'Anthropic', 'GPT', 'API', 'token pricing', 'rate limits', 'which model should I use', 'cheapest model', etc.	2 / 3
Distinctiveness Conflict Risk	The domain of LLM model/provider selection is reasonably specific, but terms like 'cost optimization' and 'multi-model strategies' could overlap with general architecture or infrastructure skills. The niche is identifiable but not sharply delineated with unique trigger terms.	2 / 3
	Total	9 / 12 Passed

Implementation

42%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill provides highly actionable, executable code covering model selection, routing, fallback, cost optimization, and ensembles. However, it is extremely verbose: most of the content is boilerplate class implementations that Claude can generate from concise patterns. The lack of progressive disclosure (no bundle files, everything inline) and the absence of validation checkpoints in the workflow significantly weaken the skill's effectiveness as a context-window-efficient guide.

Suggestions

Extract the full class implementations (ModelRegistry, ModelRouter, FallbackChain, CostOptimizer, ModelEnsemble) into separate bundle files and replace them in SKILL.md with concise pattern descriptions and interface summaries.

Remove or drastically shorten docstrings, type hint explanations, and inline comments that Claude already understands—focus on the non-obvious design decisions and constraints.

Add explicit validation checkpoints to the workflow, e.g., 'Test fallback chain with simulated failures before deploying' and 'Verify routing rules cover all expected prompt categories'.

Move time-sensitive pricing data into a separate configuration file or note that prices should be verified at runtime, rather than hardcoding specific dollar amounts that will become stale.
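
The validation checkpoint suggested above ("Test fallback chain with simulated failures before deploying") could look roughly like the sketch below. It does not reproduce the skill's own `FallbackChain` class, whose interface is not shown in this review; `call_with_fallback`, `AllProvidersFailed`, and the stub providers are hypothetical names used only for illustration.

```python
from __future__ import annotations

from typing import Callable, Sequence


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def call_with_fallback(
    providers: Sequence[tuple[str, Callable[[str], str]]],
    prompt: str,
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real chain would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stub providers that stand in for real API clients (assumptions).
def failing(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


def healthy(prompt: str) -> str:
    return f"echo: {prompt}"


# Checkpoint 1: a failing primary falls through to the backup.
name, reply = call_with_fallback(
    [("primary", failing), ("backup", healthy)], "ping"
)
assert (name, reply) == ("backup", "echo: ping")

# Checkpoint 2: when every provider fails, the failure is loud, not silent.
try:
    call_with_fallback([("primary", failing)], "ping")
except AllProvidersFailed as exc:
    assert "simulated timeout" in str(exc)
else:
    raise AssertionError("expected AllProvidersFailed")
```

Because the providers are plain callables, the same checks can run in CI without network access or API keys.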

Dimension	Reasoning	Score
Conciseness	Extremely verbose at 500+ lines (the structure validation counts 713). The ModelRegistry, ModelRouter, FallbackChain, CostOptimizer, and ModelEnsemble classes are fully spelled out with extensive docstrings, type hints, and inline comments that Claude already knows how to write. The pricing data is time-sensitive and will become stale. Much of this could be condensed to patterns and key interfaces rather than complete class implementations.	1 / 3
Actionability	All code is fully executable Python with complete class definitions, Pydantic models, type annotations, and usage examples. The code is copy-paste ready with concrete model IDs, pricing, and working routing/fallback/cost logic.	3 / 3
Workflow Clarity	The Auto-Apply section provides a 7-step sequence, but there are no validation checkpoints or feedback loops. For operations like model routing and fallback chains (which can fail silently or cascade), there's no guidance on verifying that routing rules work correctly or that fallback chains are tested before deployment.	2 / 3
Progressive Disclosure	The entire skill is a monolithic wall of code with no bundle files to offload detail into. The complete class implementations for ModelRegistry, ModelRouter, FallbackChain, CostOptimizer, and ModelEnsemble should be in separate reference files, with SKILL.md providing only the patterns, key interfaces, and navigation links.	1 / 3
	Total	7 / 12 Passed
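
The stale-pricing concern raised in the Conciseness row could be addressed by keeping prices in a dated configuration file instead of hardcoding them. The sketch below is one possible shape under stated assumptions: the JSON layout, the `as_of` field, the 30-day threshold, and the model name and prices are all placeholders, not real provider data or the skill's actual format.

```python
import json
from datetime import date

# Placeholder config; in practice this would live in its own file.
# The model ID and prices are invented for illustration only.
PRICING_JSON = """
{
  "as_of": "2024-01-01",
  "models": {
    "example-small": {"input_per_1m": 1.0, "output_per_1m": 2.0}
  }
}
"""


def load_pricing(raw: str, today: date, max_age_days: int = 30) -> tuple[dict, bool]:
    """Parse the pricing config and flag it as stale if it is too old."""
    data = json.loads(raw)
    as_of = date.fromisoformat(data["as_of"])
    stale = (today - as_of).days > max_age_days
    return data["models"], stale


def estimate_cost(models: dict, model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in dollars from per-million-token prices."""
    price = models[model_id]
    total = (
        input_tokens * price["input_per_1m"]
        + output_tokens * price["output_per_1m"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    models, stale = load_pricing(PRICING_JSON, today=date.today())
    if stale:
        print("warning: pricing data is older than 30 days; verify before use")
    print(estimate_cost(models, "example-small", 1_000_000, 500_000))
```

Passing `today` explicitly keeps the staleness check deterministic in tests.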

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 9 / 11 Passed

Validation for skill structure

Criteria	Description	Result
skill_md_line_count	SKILL.md is long (713 lines); consider splitting into references/ and linking	Warning
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning
	Total	9 / 11 Passed

Repository: majiayu000/claude-skill-registry-data
Commit: 2dfa65f

Reviewed: 3 days ago
