Systematically review and improve every shell feature and builtin command. Iterates through each feature/command, runs code-review, fixes issues, and re-reviews until clean.
66
52%
Does it follow best practices?
Impact
91%
1.56x
Average score across 3 eval scenarios
Advisory
Suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./.claude/skills/improve-loop/SKILL.md
Quality
Discovery
50%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description communicates a reasonable sense of what the skill does—iteratively reviewing and improving shell builtins—but lacks explicit trigger guidance ('Use when...'), concrete action details, and natural keyword variations. It would benefit from specifying the types of issues it fixes and when Claude should choose this skill over a general code review skill.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when the user asks to audit, review, or improve shell builtin implementations or shell feature code.'
Include natural trigger term variations such as 'bash', 'zsh', 'shell script', 'builtin', 'lint', 'audit shell code'.
Specify concrete actions more precisely, e.g., 'checks for POSIX compliance, error handling, edge cases in argument parsing' instead of the vague 'fixes issues'.
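The three suggestions above can be checked mechanically before re-running the review. The following is a minimal sketch, not Tessl's scoring logic: `check_description`, the `REVISED` wording, and the `TERMS` list are all hypothetical, assembled from the examples given in the suggestions.

```python
def check_description(description, trigger_terms):
    """Report whether a skill description has a 'Use when' clause
    and which trigger terms it does not mention (case-insensitive)."""
    text = description.lower()
    return {
        "has_use_when": "use when" in text,
        "missing_terms": [t for t in trigger_terms if t.lower() not in text],
    }


# The description as currently published.
ORIGINAL = (
    "Systematically review and improve every shell feature and builtin command. "
    "Iterates through each feature/command, runs code-review, fixes issues, "
    "and re-reviews until clean."
)

# Hypothetical rewrite that applies the suggestions; the wording is an
# illustration only, not the maintainer's text.
REVISED = (
    "Reviews and improves shell feature and builtin command implementations "
    "one at a time: runs code-review, fixes issues such as POSIX compliance "
    "gaps, missing error handling, and argument-parsing edge cases, then "
    "re-reviews until clean. Use when the user asks to audit, lint, review, "
    "or improve bash, zsh, or shell script builtin code."
)

# Trigger terms taken from the suggestion list (assumed, not exhaustive).
TERMS = ["bash", "zsh", "shell script", "builtin", "lint", "audit"]

if __name__ == "__main__":
    print(check_description(ORIGINAL, TERMS))
    print(check_description(REVISED, TERMS))
```

Substring matching is deliberately crude here; it only confirms that the terms appear somewhere in the description, not that an agent would actually select the skill.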
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (shell features and builtin commands) and describes a process (iterate, run code-review, fix issues, re-review), but the actions are somewhat generic: 'fixes issues' and 'runs code-review' lack concrete detail about what kinds of issues or what the review entails. | 2 / 3 |
| Completeness | Describes what it does (review and improve shell features/builtins via iterative code review), but lacks an explicit 'Use when...' clause or equivalent trigger guidance for when Claude should select this skill. | 2 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'shell', 'builtin command', 'code-review', but misses common user-facing variations (e.g., 'bash', 'zsh', 'shell script', 'lint', 'refactor'). The phrase 'systematically review' is somewhat natural but not a typical user trigger. | 2 / 3 |
| Distinctiveness Conflict Risk | The focus on shell builtins specifically is somewhat distinctive, but 'code-review' and 'fix issues' are generic enough to overlap with general code review or linting skills. The iterative review process adds some distinction but not enough to be clearly unique. | 2 / 3 |
| Total | 8 / 12 Passed | |
Implementation
55%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is thorough and actionable, with a strong workflow structure: gate checks, parallel execution patterns, validation loops, and clear decision points. However, it is far too long and monolithic. The 500+ line body encodes an entire code review methodology, security checklist, and project management protocol in a single file, which consumes a large share of the context window. Splitting the review dimensions, output formats, and agent instructions into separate referenced files would address this.
Suggestions
Extract the review dimensions (A through K) and pentest checks into a separate REVIEW_DIMENSIONS.md file, referencing it from the main skill with a one-line summary of each dimension.
Extract the agent launch instructions and output format into a separate AGENT_INSTRUCTIONS.md template file, reducing the main skill by ~40%.
Remove redundant reminders (e.g., 'STOP — READ THIS BEFORE DOING ANYTHING ELSE', repeated 'Completion check' blocks, 'Never skip steps' section) — Claude can follow sequential instructions without being told multiple times not to skip them.
Condense the bash command examples by removing comments that explain obvious operations (e.g., '# If argument provided, use it; otherwise detect from current branch') and combining related commands.
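The first two suggestions amount to moving a section out of SKILL.md and leaving a short pointer behind. A minimal sketch of that operation follows, assuming the content to extract sits under a single Markdown heading; `extract_section` and the heading title passed to it are hypothetical, since the actual headings in SKILL.md are not shown in this report.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def heading_lines(lines):
    """Yield (index, level, title) for headings outside fenced code blocks,
    so that '# comment' lines inside bash examples are not mistaken for headings."""
    in_fence = False
    for i, line in enumerate(lines):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        m = HEADING.match(line)
        if m:
            yield i, len(m.group(1)), m.group(2)


def extract_section(markdown, title, ref_file):
    """Split out the section headed `title` (including its subsections).

    Returns (trimmed_markdown, extracted_section). The trimmed text keeps
    the heading and replaces the body with a link to `ref_file`.
    """
    lines = markdown.splitlines()
    heads = list(heading_lines(lines))
    for pos, (start, level, text) in enumerate(heads):
        if text == title:
            break
    else:
        raise ValueError(f"heading not found: {title!r}")

    # The section ends at the next heading of the same or a higher level.
    end = len(lines)
    for idx, lvl, _ in heads[pos + 1:]:
        if lvl <= level:
            end = idx
            break

    section = "\n".join(lines[start:end]).rstrip() + "\n"
    pointer = [lines[start], "", f"See [{ref_file}]({ref_file}).", ""]
    trimmed = "\n".join(lines[:start] + pointer + lines[end:]) + "\n"
    return trimmed, section
```

In use, the second return value would be written to the reference file (for example REVIEW_DIMENSIONS.md, as named in the suggestion) and the first would replace SKILL.md. The one-line summaries per dimension that the suggestion asks for would still need to be written by hand.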
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~500+ lines. Massive amounts of detail that Claude could infer or that repeat themselves (e.g., the review dimensions are exhaustively spelled out with sub-bullets that an experienced code reviewer would know). The step-by-step task creation protocol, gate checks, and repeated 'completion check' reminders add significant token overhead. Many sections could be condensed by 50-70%. | 1 / 3 |
| Actionability | Highly actionable with specific bash commands, exact tool invocations (gh pr comment, go test, gofmt), concrete output formats, specific file paths, and copy-paste ready code blocks throughout. The review dimensions provide precise criteria with specific function names and patterns to check for. | 3 / 3 |
| Workflow Clarity | Exceptionally clear multi-step workflow with explicit sequencing (Step 1 → Step 2 loop → Step 3 → Step 4), gate checks between steps, validation checkpoints (run tests after every fix, re-run on failure with max 3 attempts), feedback loops (fix → test → retry), and clear decision tables for loop continuation. The ASCII flow diagram and decision table are excellent. | 3 / 3 |
| Progressive Disclosure | Monolithic wall of text with no references to external files for detailed content. The entire review rubric (dimensions A through K), all workflow steps, all bash commands, and all output formats are inlined in a single massive document. The review instructions for agents alone could be a separate reference file. No bundle files are provided to offload content to. | 1 / 3 |
| Total | 8 / 12 Passed | |
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (552 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
00bdc03