Skill that recommends an LLM model level based on task type and complexity. It suggests which level to use and, if the environment supports it, can suggest the corresponding manual action. It does not switch the model programmatically. Use when you need to balance cost, latency, and reasoning depth.
Quality: 51% (Does it follow best practices?)
Impact: Pending (No eval scenarios have been run)
Validation: Passed (No known issues)
Optimize this skill with Tessl:
`npx tessl skill review --optimize ./skills/16-llm-selector/SKILL.md`

Quality
Discovery: 52%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description adequately covers both what the skill does and when to use it, earning good marks on completeness. However, it lacks the natural trigger terms users would actually say when they need help with model selection, and its concrete actions are limited mostly to 'suggests'. The use of Portuguese and technical jargon further limits discoverability.
Suggestions
- Add natural trigger terms users would say, such as 'model selection', 'which model to use', 'cheaper model', 'faster model', 'switch model', 'model routing', 'pick the right model'.
- List more specific concrete actions, e.g., 'Recommends optimal LLM tier (e.g., haiku, sonnet, opus) based on task complexity, estimates cost savings, and provides manual switching instructions.'
- Broaden the 'Use when' clause with more user-facing scenarios, e.g., 'Use when the user asks which model to pick, wants to reduce API costs, needs faster responses, or is deciding between model tiers.'
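Taken together, the suggestions above could be combined into a single revised description along these lines (the wording is an illustrative sketch, not the skill's actual text; tier names like haiku, sonnet, and opus are assumptions):

```text
Recommends the optimal LLM tier (e.g., haiku, sonnet, opus) based on task
type, complexity, and risk. Estimates cost savings and provides manual
switching instructions; does not switch models programmatically. Use when
the user asks which model to pick, wants to reduce API costs, needs faster
responses, or is deciding between model tiers.
```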
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain (LLM model level recommendation by task type and complexity) and some actions (suggests which level to use, can suggest the corresponding manual action), but doesn't list multiple concrete specific actions beyond 'suggest'. | 2 / 3 |
| Completeness | The description answers both 'what' (recommends LLM model level by task type and complexity, suggests manual actions) and 'when' explicitly ('Use quando precisar balancear custo, latencia e profundidade de raciocinio'). | 3 / 3 |
| Trigger Term Quality | The description lacks natural keywords users would actually say. Terms like 'nivel de modelo LLM', 'complexidade', and 'custo, latencia e profundidade de raciocinio' are technical jargon. Users would more likely say 'which model should I use', 'cheaper model', 'faster model', 'model selection', etc. | 1 / 3 |
| Distinctiveness / Conflict Risk | The skill has a somewhat specific niche (LLM model recommendation), but the 'when' trigger about balancing cost, latency, and reasoning depth is broad enough that it could overlap with general optimization or architecture skills. | 2 / 3 |
| Total | | 8 / 12 (Passed) |
Implementation: 50%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides a reasonable framework for LLM level recommendation with useful decision tables and output format, but lacks concrete worked examples showing the recommendation process end-to-end. The workflow is implicit rather than explicitly sequenced, and there is some redundancy between sections (Saidas Esperadas vs Evidencia de Conclusao). Adding 1-2 concrete examples and an explicit step-by-step decision flow would significantly improve actionability and clarity.
Suggestions
- Add 1-2 concrete worked examples showing input (task type, complexity, risk) mapped to output (recommendation in the specified format) to improve actionability.
- Convert the implicit decision process into an explicit numbered workflow: identify task → assess complexity → check upgrade/downgrade triggers → emit recommendation.
- Consolidate 'Saidas Esperadas' and 'Evidencia de Conclusao' sections to reduce redundancy, as they largely overlap.
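The explicit workflow suggested above could be sketched as follows. This is a hypothetical illustration, not the skill's actual logic: the tier names (haiku, sonnet, opus), the 1-5 complexity scale, and the thresholds are all assumptions made for the example.

```python
# Sketch of the suggested four-step workflow:
# identify task -> assess complexity -> check triggers -> emit recommendation.
# Tier names and thresholds are illustrative, not taken from the skill.

def recommend_tier(task_type: str, complexity: int, high_risk: bool = False) -> dict:
    # Steps 1-2: baseline tier from task type and complexity (1-5 scale)
    if task_type in ("classification", "extraction") and complexity <= 2:
        tier = "haiku"
    elif complexity <= 3:
        tier = "sonnet"
    else:
        tier = "opus"
    # Step 3: apply upgrade/downgrade triggers
    if high_risk and tier == "haiku":
        tier = "sonnet"  # upgrade: risk outweighs the cost savings
    # Step 4: emit the recommendation in a fixed output format
    return {"tier": tier, "task_type": task_type, "complexity": complexity}

print(recommend_tier("classification", 2))           # cheap task, low risk
print(recommend_tier("classification", 2, True))     # same task, risk-upgraded
```

A worked example in this shape (input triple mapped to a recommendation dict) is exactly what the 'Actionability' dimension below flags as missing.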
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably structured but includes some redundancy: 'Evidencia de Conclusao' largely repeats 'Saidas Esperadas', and sections like 'Quando Nao Usar' add marginal value. The governance/handoff references are brief but add tokens without actionable content for this skill. | 2 / 3 |
| Actionability | The skill provides a clear output format template and a decision table for level selection, which is concrete. However, it lacks worked examples showing input → output (e.g., 'Given task X with complexity Y, recommend Z'), making it harder to follow precisely. The guidance is more descriptive than fully executable. | 2 / 3 |
| Workflow Clarity | The skill describes what to consider (upgrade/downgrade rules, skill mappings) but doesn't present a clear sequential workflow, e.g., Step 1: identify task type, Step 2: assess complexity, Step 3: check upgrade/downgrade triggers, Step 4: emit recommendation. The process is implicit rather than explicitly sequenced. | 2 / 3 |
| Progressive Disclosure | References to external policies (GLOBAL.md, policies/execution.md, etc.) provide some progressive disclosure, but the skill itself is a single monolithic file with no clear pointers to deeper reference material specific to this skill's domain. The references are governance-related rather than content-related. | 2 / 3 |
| Total | | 8 / 12 (Passed) |
Validation: 100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Skill structure validation: 11 / 11 checks passed. No warnings or errors.
Commit: 4dee3f0