Content
62%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, thoughtfully written testing skill that excels at workflow clarity and strategic framing. Its main weakness is the lack of concrete, executable examples — the body reads as a philosophy-of-testing guide rather than an actionable reference, with all specifics deferred to reference files that aren't available in the bundle. The writing is clean but could be tighter in places where stance-setting and failure modes repeat earlier points.
Suggestions
Add at least one concrete, executable test example (e.g., a sample AC → derived test case with actual code) in the Plan or Write section so the SKILL.md is actionable on its own without requiring reference files.
Include a minimal example of the plan output format (e.g., a table or checklist showing AC → test case → layer → status markers) to make the planning step copy-paste ready.
Trim the 'Your Stance' and 'Failure Modes' sections — several points (e.g., 'don't guess ACs', 'strategy before writing') are stated in both places; consolidate to reduce redundancy.
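To illustrate the first two suggestions, here is a minimal sketch of what such an example might look like. Everything in it is hypothetical and not taken from the skill under review: the acceptance criterion "AC-1", the `apply_discount` function, and the `PlanRow` fields (AC, test case, layer, status) are assumptions chosen only to show the shape of an AC → test case derivation and a copy-paste-ready plan table.

```python
from dataclasses import dataclass

# Hypothetical AC-1 (an assumption for illustration):
# "Orders of 100.00 or more get a 10% discount; smaller orders get none."


def apply_discount(total: float) -> float:
    """Hypothetical code under test, standing in for a real implementation."""
    if total >= 100.0:
        return round(total * 0.9, 2)
    return total


@dataclass
class PlanRow:
    """One line of the test plan: which AC a test covers, at which layer."""
    ac: str
    test_case: str
    layer: str
    status: str  # assumed markers: "todo", "failing", "passing"


# Each AC maps to at least one test case, including the boundary.
PLAN = [
    PlanRow("AC-1", "test_discount_applied_at_threshold", "unit", "todo"),
    PlanRow("AC-1", "test_no_discount_below_threshold", "unit", "todo"),
]


def render_plan(rows: list[PlanRow]) -> str:
    """Render the plan as a markdown table."""
    lines = ["| AC | Test case | Layer | Status |", "|---|---|---|---|"]
    lines += [f"| {r.ac} | {r.test_case} | {r.layer} | {r.status} |" for r in rows]
    return "\n".join(lines)


# Tests derived directly from AC-1; names match the plan rows above.
def test_discount_applied_at_threshold():
    assert apply_discount(100.0) == 90.0


def test_no_discount_below_threshold():
    assert apply_discount(99.99) == 99.99
```

Calling `print(render_plan(PLAN))` emits the plan as a table, and the two test functions can be collected by a runner such as pytest. An example of roughly this size in the Plan or Write section would let the SKILL.md stand alone without the reference files.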
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is mostly efficient and well-written, but includes some philosophical framing ('Your sharpest move...', 'A test that passes forever regardless of code changes is noise — not signal') and stance-setting that, while valuable for tone, adds tokens beyond what's strictly necessary for actionable guidance. The failure modes section restates ideas already covered earlier. | 2 / 3 |
| Actionability | The skill provides a clear conceptual framework (four moves, plan from ACs, layer selection) but lacks concrete executable examples: no code snippets, no specific commands, no example test output. It describes what to do at a strategic level but delegates all concrete guidance to reference files that aren't provided in the bundle. | 2 / 3 |
| Workflow Clarity | The four-move workflow (Plan → Write → Run → Debug) is clearly sequenced with explicit validation checkpoints: verify the test fails for the right reason before implementing, run broader suite to check regressions, interpret failures honestly with a decision tree (code wrong vs test wrong vs intermittent). The feedback loops (gap → refinement, flake → debug mode) are well-defined. | 3 / 3 |
| Progressive Disclosure | The skill references four sub-files (references/plan.md, references/write.md, references/run.md, references/debug.md) plus two artifact files, which is good structure. However, no bundle files were provided, so we can't verify these references exist or contain useful content. The main file delegates nearly all concrete/actionable detail to these references, making the SKILL.md itself more of a table of contents than a self-contained overview with actionable quick-start content. | 2 / 3 |
| Total | | 9 / 12 Passed |