Use when evaluating designs, reviewing code, or refactoring - measures success by total code in the final codebase, not effort to get there. Bias toward deletion.
47
48%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./reducing-entropy/SKILL.md
Quality
Discovery
32%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description reads more like a coding philosophy or principle ('bias toward deletion', 'measures success by total code') than a skill description with concrete capabilities. While it has a 'Use when...' clause, the triggers are too broad and would conflict with many other skills. The description lacks specific, actionable capabilities that would help Claude distinguish this skill from general code review or refactoring skills.
Suggestions
Add specific concrete actions the skill performs, e.g., 'Identifies and removes dead code, consolidates duplicate logic, simplifies over-engineered abstractions, and reduces overall codebase size.'
Narrow the trigger terms to be more distinctive, e.g., 'Use when the user asks to reduce code size, eliminate dead code, simplify implementations, or minimize codebase footprint.'
Clarify what makes this skill different from general code review — emphasize the minimalism/deletion focus with terms like 'code reduction', 'minimize LOC', 'remove unused code', 'simplify architecture'.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description mentions 'evaluating designs, reviewing code, or refactoring' but these are broad categories, not concrete actions. It describes a philosophy ('bias toward deletion', 'measures success by total code') rather than specific capabilities like 'removes dead code, simplifies functions, consolidates duplicates'. | 1 / 3 |
| Completeness | It has a 'Use when...' clause covering when to apply the skill (evaluating designs, reviewing code, refactoring), but the 'what does this do' part is weak: it describes a mindset/philosophy rather than concrete actions or outputs. The 'what' is only implied through the guiding principle of deletion bias. | 2 / 3 |
| Trigger Term Quality | Contains some relevant keywords like 'reviewing code', 'refactoring', and 'designs' that users might naturally say. However, it's missing common variations like 'clean up code', 'simplify', 'reduce complexity', 'dead code', 'code smell', or 'technical debt'. | 2 / 3 |
| Distinctiveness Conflict Risk | 'Reviewing code' and 'refactoring' are extremely broad terms that would overlap with many other skills like code review, linting, testing, or general coding assistance. The deletion bias philosophy is distinctive but not enough to prevent conflicts given the generic trigger terms. | 1 / 3 |
| Total | | 6 / 12 Passed |
Implementation
64%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-written, concise mindset/evaluation skill that effectively communicates a clear philosophy around code minimization. Its main strength is token efficiency and memorable heuristics (the 50-lines-deleting-200 example is excellent). Its weaknesses are the lack of concrete executable steps for actually measuring code reduction and the absence of explicit validation/feedback loops in the evaluation workflow.
Suggestions
Add a concrete measurement step, e.g., 'Run `wc -l` or `find . -name "*.py" | xargs wc -l` before and after to quantify the change' to improve actionability.
Add an explicit feedback loop: 'If the proposed change results in more total code → identify what else can be deleted to compensate, or reject the change' to strengthen workflow clarity.
Include at least one concrete before/after example showing a real refactoring (e.g., 14 functions → 2 functions) with line counts to make the guidance more actionable.
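The measurement step and the feedback loop suggested above could be sketched roughly as follows. This is a minimal illustration, not part of the reviewed skill: the function names `count_lines` and `review_change`, the `*.py` default pattern, and the `src` directory are all assumptions made for the example. The count is approximately what `find . -name "*.py" | xargs wc -l` reports, except that a final line without a trailing newline is also counted.

```python
from pathlib import Path


def count_lines(root: str, pattern: str = "*.py") -> int:
    """Return the total number of lines in files under `root` matching `pattern`.

    Hypothetical helper: a stand-in for running `wc -l` over the codebase.
    """
    total = 0
    for path in Path(root).rglob(pattern):
        if path.is_file():
            with path.open("r", encoding="utf-8", errors="replace") as handle:
                total += sum(1 for _ in handle)
    return total


def review_change(lines_before: int, lines_after: int) -> str:
    """Apply the suggested feedback loop to a before/after line count.

    Hypothetical helper: accept net reductions, push back on net growth.
    """
    delta = lines_after - lines_before
    if delta < 0:
        return f"accept: net reduction of {-delta} lines"
    if delta == 0:
        return "neutral: no net change in total lines"
    return (
        f"revisit: net growth of {delta} lines; "
        "find deletions to compensate or reject the change"
    )


if __name__ == "__main__":
    # Measure once before the change and once after it ("src" is assumed).
    print("current total:", count_lines("src"))
    # Writing 50 lines that delete 200 from a 1000-line codebase leaves 850.
    print(review_change(1000, 850))
```

Counting raw lines is a coarse proxy, since it also counts blank lines and comments; a stricter variant might filter those out before summing.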
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Every section earns its place. The content is lean, uses punchy examples (50 lines deleting 200 = net win), avoids explaining concepts Claude already knows, and uses formatting efficiently. The red flags section is particularly well-compressed. | 3 / 3 |
| Actionability | The three questions framework and red flags provide concrete decision-making guidance, but the skill is more philosophical than executable. There are no concrete code examples, commands, or specific metrics for counting lines. The 'Before You Begin' section gives a clear procedure, but the core evaluation process lacks specific steps like 'run wc -l before and after'. | 2 / 3 |
| Workflow Clarity | The 'Before You Begin' section provides a clear prerequisite sequence, and the three questions form a logical evaluation flow. However, there's no explicit validation checkpoint or feedback loop (e.g., no step saying 'if the proposed change increases lines, go back and try approach X'). The workflow is more of a checklist than a sequenced process with error recovery. | 2 / 3 |
| Progressive Disclosure | The skill references `references/` directory and `adding-reference-mindsets.md` for deeper content, which is good structure. However, no bundle files were provided, so we can't verify these references exist. The references are one-level deep and clearly signaled, but the lack of specific filenames in the references directory (just 'list the files') makes navigation less concrete. | 2 / 3 |
| Total | | 9 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
40067f1