Improve test coverage for shell features and commands using reference test suites from yash, GNU coreutils, and uutils/coreutils
47%: Does it follow best practices?
Impact: not available (no eval scenarios have been run)
Advisory: suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./.claude/skills/improve-test-coverage/SKILL.md
Quality
Discovery
40%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description identifies a clear and distinctive niche (shell test coverage using specific reference test suites), which reduces conflict risk. However, it lacks explicit trigger guidance ('Use when...'), lists only one vague action ('improve test coverage'), and misses common natural language terms users might employ when requesting this kind of help.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when the user wants to add, port, or improve tests for shell builtins, coreutils commands, or POSIX compliance.'
List more specific concrete actions, e.g., 'Ports reference tests from yash/GNU coreutils/uutils test suites, identifies coverage gaps, writes new test cases for shell builtins and commands.'
Include additional natural trigger terms like 'unit tests', 'bash', 'POSIX', 'shell scripts', 'coreutils testing', 'test porting' to improve keyword coverage.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (test coverage for shell features/commands) and references specific test suites (yash, GNU coreutils, uutils/coreutils), but doesn't list concrete actions beyond 'improve test coverage' — e.g., it doesn't specify writing tests, porting tests, analyzing coverage gaps, etc. | 2 / 3 |
| Completeness | Describes what it does (improve test coverage using reference test suites) but has no explicit 'Use when...' clause or equivalent trigger guidance. Per the rubric, a missing 'Use when...' clause caps completeness at 2, and the 'what' itself is also somewhat vague, placing this at 1. | 1 / 3 |
| Trigger Term Quality | Includes relevant keywords like 'test coverage', 'shell', 'yash', 'GNU coreutils', 'uutils/coreutils', but misses common user-facing terms like 'unit tests', 'shell scripts', 'bash', 'POSIX', or file extensions. A user asking about shell testing might not use these exact reference suite names. | 2 / 3 |
| Distinctiveness Conflict Risk | The combination of shell test coverage with specific reference suites (yash, GNU coreutils, uutils/coreutils) creates a very clear niche that is unlikely to conflict with other skills. This is highly distinctive. | 3 / 3 |
| Total | 8 / 12 Passed | |
Implementation
55%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is extraordinarily thorough and actionable — every step has concrete commands, decision tables, and clear sequencing with validation checkpoints and crash recovery. However, it is severely over-long and monolithic, cramming ~600+ lines of detailed reference material (gap category tables, layer-selection rubrics, YAML format specs, commit templates) into a single file with no progressive disclosure. The verbosity significantly undermines token efficiency, repeating commands and explaining concepts Claude already understands.
Suggestions
Extract the gap category tables (Step 5), layer-selection rubric (Step 6), and YAML format reference (Step 6) into separate bundle files (e.g., GAP_CATEGORIES.md, LAYER_SELECTION.md, SCENARIO_FORMAT.md) and reference them from the main skill.
Remove redundant explanations of basic operations Claude already knows (git add/commit/push workflows, how to run bash commands, what YAML is) — replace with terse command blocks only.
Consolidate repeated bash command patterns (e.g., the 'find tests/scenarios/...' commands appear in Steps 4, 7, 8, and 9) into a single reference section or bundle file.
Trim the security preamble to 2-3 sentences — Claude understands prompt injection; the detailed examples of injection payloads and the <external-data> framing are unnecessary verbosity.
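The consolidation suggested above (one shared definition instead of the same 'find tests/scenarios/...' command repeated across steps) could be sketched roughly as follows. This is only an illustration: the function name `list_scenarios` and the `*.yaml` file pattern are assumptions, not taken from the skill itself.

```shell
#!/bin/sh
# Hypothetical helper: a single place that defines how scenario files are
# located, so each step can call it instead of repeating the find command.
# The *.yaml pattern is an assumption made for illustration.
list_scenarios() {
    root="${1:-tests/scenarios}"
    # -type f skips directories; sort keeps the output stable across runs.
    find "$root" -type f -name '*.yaml' | sort
}
```

A step that needs a file count could then run `list_scenarios | wc -l` rather than restating the search.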
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~600+ lines. Massive amounts of repetition (e.g., the same bash/git commands repeated in multiple steps, the gap categories tables, the layer-selection table). Many instructions are procedural scaffolding that Claude could infer (e.g., 'Before starting step N, call TaskList and verify step N-1 is completed'). The security preamble, while important, is also lengthy. The skill explains concepts Claude already knows (how to use git, how to run bash commands, what YAML is). | 1 / 3 |
| Actionability | Highly actionable with concrete, executable bash commands, specific file paths, exact YAML formats with examples, precise git commit message templates, and detailed decision tables for every classification step. Every step has copy-paste-ready commands and clear criteria for decision-making. | 3 / 3 |
| Workflow Clarity | Exceptionally clear three-phase workflow (Setup → Per-target loop → Finalization) with explicit step numbering, dependency ordering, validation checkpoints (Step 10 runs tests, Step 12 runs CI fixes), feedback loops (fix and re-validate), and a durable progress tracker (COVERAGE_PROGRESS.md) for crash recovery. The resume protocol is well-defined. | 3 / 3 |
| Progressive Disclosure | Monolithic wall of text with no bundle files to offload detail into. The gap category tables, layer-selection rubric, YAML format reference, and per-step bash commands could all be split into separate reference files. Everything is inlined into a single massive document with no external references for detailed content like the YAML format spec or the gap analysis rubric. | 1 / 3 |
| Total | 8 / 12 Passed | |
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (789 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
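A line-count warning like skill_md_line_count can be approximated locally before running a review. The sketch below is a rough stand-in, not the validator's actual logic: the function name `check_skill_length` and the 500-line default are assumptions, since the review does not state the real threshold.

```shell
#!/bin/sh
# Hypothetical pre-review check: warn when a skill file exceeds a line budget.
# The 500-line default is an assumed threshold, not one stated by the validator.
check_skill_length() {
    file="$1"
    limit="${2:-500}"
    # Reading via stdin makes wc print only the number; tr strips any padding.
    lines=$(wc -l < "$file" | tr -d ' ')
    if [ "$lines" -gt "$limit" ]; then
        echo "warning: $file has $lines lines (limit $limit)"
        return 1
    fi
    echo "ok: $file has $lines lines"
}
```

Calling `check_skill_length ./.claude/skills/improve-test-coverage/SKILL.md` would, under that assumed limit, flag a 789-line file and return a non-zero status.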
00bdc03