Cost optimization patterns for LLM API usage — model routing by task complexity, budget tracking, retry logic, and prompt caching.
- 77
- 66% — Does it follow best practices?
- Impact: 100%
- 1.31x — average score across 3 eval scenarios
- Passed — no known issues

Optimize this skill with Tessl:

```
npx tessl skill review --optimize ./skills/cost-aware-llm-pipeline/SKILL.md
```

Quality
Discovery
67% — Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is strong in specificity and distinctiveness, clearly listing concrete cost optimization patterns for LLM APIs. Its main weakness is the absence of an explicit 'Use when...' clause, which would help Claude know exactly when to select this skill. Trigger terms are decent but could include more natural user phrasings around cost reduction.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when the user asks about reducing LLM API costs, optimizing token spend, or managing API budgets.'
Include more natural user-facing trigger terms like 'reduce API costs', 'save money on tokens', 'token usage optimization', or 'API pricing'.
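Applying both suggestions, the revised frontmatter might look like the sketch below; everything past the original description line is illustrative wording, not the skill's actual frontmatter:

```yaml
---
name: cost-aware-llm-pipeline
description: >
  Cost optimization patterns for LLM API usage — model routing by task
  complexity, budget tracking, retry logic, and prompt caching. Use when
  the user asks about reducing LLM API costs, optimizing token spend,
  saving money on tokens, or managing API budgets.
---
```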
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: model routing by task complexity, budget tracking, retry logic, and prompt caching. These are distinct, well-defined capabilities. | 3 / 3 |
| Completeness | Clearly answers 'what does this do' with specific capabilities, but lacks an explicit 'Use when...' clause or equivalent trigger guidance, which caps this dimension at 2 per the rubric. | 2 / 3 |
| Trigger Term Quality | Includes relevant terms like 'cost optimization', 'LLM API', 'model routing', 'budget tracking', 'retry logic', and 'prompt caching', but misses common user phrasings like 'reduce API costs', 'save money on tokens', 'cheaper API calls', or 'token usage'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The combination of 'cost optimization' + 'LLM API usage' with specific patterns like model routing and prompt caching creates a clear, distinct niche that is unlikely to conflict with other skills. | 3 / 3 |
| Total | | 10 / 12 — Passed |
Implementation
64% — Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid, actionable skill with executable code examples covering four composable cost-optimization patterns. Its main weaknesses are some redundancy between sections (When to Activate/When to Use, Best Practices/Anti-Patterns), missing validation feedback loops in the workflow, and the pricing table containing time-sensitive information that may become stale. The code quality is high and the composition pattern is well-demonstrated.
Suggestions
Remove the redundant 'When to Use' section since it overlaps heavily with 'When to Activate', and consolidate anti-patterns into the best practices section as brief 'avoid' notes to reduce token usage.
Add a validation/feedback step to the composition workflow — e.g., checking output quality and logging whether the model selection was appropriate, enabling threshold tuning over time.
Move the pricing table to a separate reference file (e.g., PRICING.md) since it contains time-sensitive data that will need updates, and link to it from the main skill.
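The validation/feedback suggestion above could be sketched roughly as follows. This is a minimal illustration, not part of the skill: the `score_quality` heuristic, the `log_routing_outcome` helper, and the JSONL log format are all assumptions.

```python
import json
import time

def score_quality(response_text: str) -> float:
    # Illustrative heuristic only; real scoring might use a rubric or a judge model.
    if not response_text.strip():
        return 0.0
    return min(1.0, len(response_text.split()) / 50)

def log_routing_outcome(task: str, model: str, response_text: str,
                        path: str = "routing_log.jsonl") -> float:
    """Record whether the routed model produced acceptable output,
    so complexity thresholds can be tuned over time."""
    quality = score_quality(response_text)
    record = {
        "ts": time.time(),
        "task": task,
        "model": model,
        "quality": quality,
        "acceptable": quality >= 0.5,
    }
    # Append one JSON object per line so the log is easy to analyze later.
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return quality
```

Aggregating the `acceptable` field per model over time would show whether the routing thresholds need adjustment.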
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is mostly efficient with good code examples, but it includes some redundant sections ('When to Activate' and 'When to Use' overlap significantly), the anti-patterns section largely mirrors the best practices in negated form, and some explanatory comments are unnecessary for Claude (e.g., explaining what immutable means, or why to use frozen dataclasses). | 2 / 3 |
| Actionability | All four core patterns include fully executable Python code with concrete implementations. The composition section shows how to wire them together, and the pricing table provides the specific numbers needed for cost calculations. Code is copy-paste ready. | 3 / 3 |
| Workflow Clarity | The composition section shows a clear four-step sequence with budget checking before API calls, which is good. However, there is no validation/verification step after processing (e.g., checking response quality to confirm model routing was appropriate) and no feedback loop for adjusting thresholds based on results. For a batch processing pipeline involving budget-sensitive operations, this is a gap. | 2 / 3 |
| Progressive Disclosure | The content is well structured with clear headers and logical sections, but it is a monolithic file with no references to external files for deeper content. The pricing table and anti-patterns could be split out. For a skill of this length (~150 lines of content), some progressive disclosure into separate files would improve organization. | 2 / 3 |
| Total | | 9 / 12 — Passed |
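The four-step composition described under Workflow Clarity (route by complexity, check budget before the call, call with retries, cache) might look like the sketch below. The model names, prices, and complexity threshold are placeholders, and `call_api` stands in for a real client; this is not the skill's actual code.

```python
import time

# Placeholder per-1K-token prices; real values belong in a reference file.
PRICES = {"small-model": 0.0005, "large-model": 0.01}

class BudgetTracker:
    def __init__(self, limit_usd: float):
        self.limit = limit_usd
        self.spent = 0.0

    def can_afford(self, est_cost: float) -> bool:
        return self.spent + est_cost <= self.limit

    def record(self, cost: float) -> None:
        self.spent += cost

def route_model(task: str) -> str:
    # Crude complexity heuristic: longer tasks go to the larger model.
    return "large-model" if len(task.split()) > 100 else "small-model"

def call_with_retry(fn, attempts: int = 3, backoff: float = 1.0):
    # Exponential backoff; re-raise after the final attempt fails.
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(backoff * 2 ** i)

_cache: dict = {}

def run_task(task: str, tracker: BudgetTracker, call_api) -> str:
    if task in _cache:                      # 1. prompt cache
        return _cache[task]
    model = route_model(task)               # 2. route by complexity
    est_cost = len(task.split()) / 1000 * PRICES[model]
    if not tracker.can_afford(est_cost):    # 3. budget check before the call
        raise RuntimeError("budget exceeded")
    result = call_with_retry(lambda: call_api(model, task))  # 4. retry logic
    tracker.record(est_cost)
    _cache[task] = result
    return result
```

Adding a quality check on `result` before caching it would close the feedback gap the rubric identifies.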
Validation
90% — Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 Passed |
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.