Post-execution test review and fix - chain from workflow-lite-execute or standalone. Reviews implementation against plan, runs tests, auto-fixes failures.
56
63%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Risky
Do not use without reviewing
Optimize this skill with Tessl
npx tessl skill review --optimize ./.claude/skills/workflow-lite-test-review/SKILL.mdQuality
Discovery
50%Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description conveys a reasonable sense of what the skill does—reviewing implementations, running tests, and fixing failures—but lacks an explicit 'Use when...' clause and natural trigger terms users would employ. The reference to 'workflow-lite-execute' is internal jargon that doesn't help with user-facing skill selection, and the actions could be more concrete.
Suggestions
Add an explicit 'Use when...' clause with natural trigger phrases such as 'Use when the user asks to run tests, review test results, fix failing tests, or debug test failures after code implementation.'
Include more natural user-facing trigger terms like 'failing tests', 'test errors', 'debug tests', 'fix broken tests', 'test suite', 'test results'.
Make the actions more concrete—specify what kinds of tests (unit, integration), what 'reviewing against plan' means in practice, and what types of auto-fixes are applied.
| Dimension | Reasoning | Score |
|---|---|---|
Specificity | Names some actions ('reviews implementation against plan', 'runs tests', 'auto-fixes failures') but they are somewhat generic and not deeply specific about what kinds of tests, what kinds of fixes, or what 'reviewing against plan' entails concretely. | 2 / 3 |
Completeness | The 'what' is partially addressed (reviews, runs tests, auto-fixes), but there is no explicit 'Use when...' clause. The phrase 'chain from workflow-lite-execute or standalone' hints at when but doesn't clearly articulate trigger conditions for Claude to select this skill. | 2 / 3 |
Trigger Term Quality | Includes some relevant terms like 'test', 'fix', 'failures', and 'post-execution', but misses common natural user phrases like 'run tests', 'debug', 'test failures', 'failing tests', 'fix broken tests'. The term 'workflow-lite-execute' is internal jargon unlikely to be used by users. | 2 / 3 |
Distinctiveness Conflict Risk | The combination of 'post-execution test review' and 'chain from workflow-lite-execute' provides some distinctiveness, but terms like 'runs tests' and 'auto-fixes failures' could overlap with general testing or debugging skills. | 2 / 3 |
Total | 8 / 12 Passed |
Implementation
77%Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, highly actionable workflow skill with clear phase sequencing, explicit validation checkpoints, and concrete executable code throughout. Its main weakness is that it's quite long for a single file — the data structures and detailed phase implementations could benefit from being split into supporting reference files. The content is mostly efficient but has some areas where tightening would improve token economy.
Suggestions
Extract the Data Structures section (testReviewContext, testChecklist schemas) into a separate REFERENCE.md file and link to it, reducing the main skill's token footprint.
Tighten the Input Modes section — the prose explanation and the code block are somewhat redundant; the code block alone with brief inline comments would suffice.
| Dimension | Reasoning | Score |
|---|---|---|
Conciseness | The skill is fairly long (~250 lines) but most content is structural (phase definitions, data schemas, code blocks). Some sections like the Mode 1/Mode 2 explanation could be tighter, and the data structures section is verbose but arguably necessary for a complex workflow. It doesn't over-explain basic concepts but could be more compact. | 2 / 3 |
Actionability | Provides fully executable JavaScript code for session resolution, agent delegation, CLI invocation, and test framework detection. Commands are concrete (e.g., `npm test`, `python -m pytest -v --tb=short`), data structures have complete schemas, and the auto-fix agent prompt is copy-paste ready. | 3 / 3 |
Workflow Clarity | Five phases are clearly sequenced with explicit TodoWrite checkpoints between each phase. Includes validation/feedback loops (Phase 4 iterative fix with max 3 rounds, re-run and break on pass), skip conditions are clearly stated, and Phase 5 has a mandatory checkpoint annotation. Error handling table covers failure modes with resolutions. | 3 / 3 |
Progressive Disclosure | The skill is monolithic — all content is inline in a single file with no references to supporting documents. The data structures and session folder structure sections could be split into separate reference files. However, the phase summary table at the top provides a good overview, and sections are well-organized with clear headers. No bundle files are provided to offload content to. | 2 / 3 |
Total | 10 / 12 Passed |
Validation
90%Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
Total | 10 / 11 Passed | |
5ff5e86
Table of Contents
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.