
tdg-personal/context-budget

Audits Claude Code context window consumption across agents, skills, MCP servers, and rules. Identifies bloat, redundant components, and produces prioritized token-savings recommendations.

Quality: 71%
Does it follow best practices?

Impact: Pending
No eval scenarios have been run.

Security by Snyk: Passed
No known issues.


Quality

Discovery: 67%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is specific and distinctive, clearly articulating what the skill does and covering a well-defined niche. Its main weakness is the absence of an explicit 'Use when...' clause, which would help Claude know exactly when to select this skill. The trigger terms are somewhat technical and could benefit from including more natural user phrasings.

Suggestions

Add an explicit 'Use when...' clause, e.g., 'Use when the user asks about context window usage, token consumption, context optimization, or wants to reduce token bloat in their Claude Code setup.'

Include more natural trigger term variations such as 'token usage', 'running out of context', 'context too large', 'optimize tokens', or 'reduce context size' to improve matching with how users naturally phrase these requests.
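Combining the two suggestions above, the skill's frontmatter description might look like the following sketch (the exact wording is illustrative, not the skill's actual frontmatter):

```yaml
---
name: context-budget
description: >
  Audits Claude Code context window consumption across agents, skills,
  MCP servers, and rules, and produces prioritized token-savings
  recommendations. Use when the user asks about context window usage,
  token usage, token consumption, running out of context, or wants to
  reduce token bloat or context size in their Claude Code setup.
---
```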

Specificity: 3 / 3
Lists multiple specific, concrete actions: 'audits context window consumption', 'identifies bloat, redundant components', and 'produces prioritized token-savings recommendations'. Also specifies the domains it operates across: agents, skills, MCP servers, and rules.

Completeness: 2 / 3
Clearly answers 'what does this do' (audits context window consumption, identifies bloat, produces recommendations), but lacks an explicit 'Use when...' clause or equivalent trigger guidance, which per the rubric caps completeness at 2.

Trigger Term Quality: 2 / 3
Includes some relevant terms like 'context window', 'token-savings', 'bloat', and 'MCP servers', but misses common natural-language variations users might say, such as 'token usage', 'context length', 'too many tokens', 'running out of context', or 'optimize context'. The terms lean somewhat technical.

Distinctiveness / Conflict Risk: 3 / 3
Highly distinctive niche: auditing Claude Code context window consumption is a very specific task unlikely to overlap with other skills. The combination of 'context window', 'token-savings', and the specific scope (agents, skills, MCP servers, rules) makes it clearly distinguishable.

Total: 10 / 12 (Passed)

Implementation: 62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides a well-structured conceptual framework for auditing context window consumption with clear phases, specific thresholds, and a useful output format. Its main weakness is the lack of executable code or concrete commands—it describes what to do at a high level but doesn't provide the actual implementation to scan directories, count tokens, or generate reports. The workflow clarity is strong but actionability suffers from being descriptive rather than prescriptive.

Suggestions

Add executable code snippets for the core operations: scanning directories, counting tokens (e.g., a shell one-liner or Python snippet for `words × 1.3`), and parsing .mcp.json for tool counts.

Remove the 'When to Use' section or reduce it to 1-2 lines—Claude can infer when to audit context budgets from the skill description alone.

Move the verbose mode report format and per-file breakdown details to a separate reference file to keep the main SKILL.md focused on the core workflow.
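As a sketch of the first suggestion, the directory scan, the `words × 1.3` token estimate, and the `.mcp.json` parse could each be a few lines of Python. The `mcpServers` key and the per-server `tools` list are assumptions about the config shape, not something the review confirms:

```python
import json
from pathlib import Path

def estimate_tokens(text: str) -> int:
    # The skill's rough heuristic: tokens ≈ words × 1.3
    return round(len(text.split()) * 1.3)

def scan_markdown(root: str) -> dict[str, int]:
    # Token estimate for every .md file under root (e.g. a skills directory)
    return {
        str(p): estimate_tokens(p.read_text(encoding="utf-8"))
        for p in Path(root).rglob("*.md")
    }

def mcp_tool_counts(config_path: str) -> dict[str, int]:
    # Hypothetical .mcp.json shape: {"mcpServers": {"name": {"tools": [...]}}}
    cfg = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return {
        name: len(server.get("tools", []))
        for name, server in cfg.get("mcpServers", {}).items()
    }
```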

Conciseness: 2 / 3
The skill is reasonably well structured but includes some unnecessary verbosity. The 'When to Use' section has 5 bullet points that could be trimmed, and some explanations (like what MCP tool schemas cost) are repeated in both the main body and Best Practices. The report template is useful but adds length. Overall mostly efficient, with some tightening possible.

Actionability: 2 / 3
The skill describes a clear process and provides specific thresholds (>200 lines, >30 words, ~500 tokens per tool, words × 1.3), but lacks executable code. There are no actual scripts or commands to run the inventory; it is a conceptual workflow rather than a copy-paste-ready implementation. The examples show the expected output format but not how to actually compute the values.

Workflow Clarity: 3 / 3
The four-phase workflow (Inventory → Classify → Detect Issues → Report) is clearly sequenced with explicit criteria at each step. The classification table provides clear decision criteria, the detection phase lists specific problem patterns with thresholds, and the report phase shows the exact output format. For an analytical/audit skill, this is well structured with clear checkpoints.

Progressive Disclosure: 2 / 3
The content is well organized with clear sections and a logical flow, but it is all inline in a single file with no references to external files for detailed content. The verbose-mode output format and the full report template could be split out. For a skill of this length (~120 lines), it is borderline acceptable, but the report template and detailed examples add bulk that could be referenced.

Total: 9 / 12 (Passed)
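The four-phase sequence assessed above lends itself to a small pipeline. The sketch below mirrors Inventory → Classify → Detect Issues → Report using only the >200-line threshold the review cites; the data shapes are assumptions, not the skill's actual implementation:

```python
THRESHOLD_LINES = 200  # flag components over this size (threshold cited in the review)

def inventory(files: dict[str, str]) -> dict[str, int]:
    # Phase 1 (Inventory): map each component name to its line count
    return {name: text.count("\n") + 1 for name, text in files.items()}

def classify(inv: dict[str, int]) -> dict[str, str]:
    # Phase 2 (Classify): mark oversized components for trimming
    return {name: "trim" if lines > THRESHOLD_LINES else "keep"
            for name, lines in inv.items()}

def detect_issues(inv: dict[str, int], labels: dict[str, str]) -> list[tuple[str, str]]:
    # Phase 3 (Detect Issues): collect flagged components with a reason
    return [(name, f"{inv[name]} lines > {THRESHOLD_LINES}")
            for name, tag in labels.items() if tag == "trim"]

def report(issues: list[tuple[str, str]]) -> list[tuple[str, str]]:
    # Phase 4 (Report): prioritized, largest offender first
    return sorted(issues, key=lambda item: -int(item[1].split()[0]))
```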

Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation for skill structure: 10 / 11 passed

frontmatter_unknown_keys: Warning
Unknown frontmatter key(s) found; consider removing them or moving them to metadata.
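One way to clear this warning, following the check's own suggestion, is to nest any unrecognized key under a metadata block instead of leaving it at the top level (the key names below are hypothetical):

```yaml
---
name: context-budget
description: Audits Claude Code context window consumption.
metadata:
  author: tdg-personal   # hypothetical key moved out of the top level
---
```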

Total: 10 / 11 (Passed)
