refine-tests

Critically review and refine tests generated during a coding session. Use this skill when the user asks to review tests, clean up tests, refine tests, prune tests, improve test quality, or remove useless/low-value tests. Also use when the user says 'review my tests', 'clean up tests', or mentions test quality after a coding session.

Quality

83%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Quality

Discovery

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that excels in completeness and trigger term coverage. It clearly defines both what the skill does and when to use it, with abundant natural language triggers. The main area for improvement is in specificity—the concrete actions could be more detailed to describe exactly what 'refine' and 'review' entail technically.

Dimension	Reasoning	Score
Specificity	The description names the domain (test review/refinement) and lists several actions ('review', 'clean up', 'refine', 'prune', 'remove useless/low-value tests'), but these overlap and do not say what each involves technically (e.g., checking assertions, removing duplicates, improving coverage).	2 / 3
Completeness	Clearly answers both 'what' (critically review and refine tests generated during a coding session) and 'when' (explicit 'Use this skill when...' clause with multiple trigger scenarios). The 'Use when' guidance is thorough and explicit.	3 / 3
Trigger Term Quality	Excellent coverage of natural trigger terms users would say: 'review tests', 'clean up tests', 'refine tests', 'prune tests', 'improve test quality', 'remove useless/low-value tests', 'review my tests', 'test quality after a coding session'. These are highly natural phrases a user would actually use.	3 / 3
Distinctiveness / Conflict Risk	The skill has a clear niche: post-coding-session test review and refinement. The specific focus on reviewing/pruning existing tests (rather than writing new tests or running tests) makes it distinct and unlikely to conflict with general testing or code review skills.	3 / 3
	Total	11 / 12 Passed

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-crafted skill with a clear 5-step workflow, strong actionability through specific criteria and examples, and proper validation steps. Its main weakness is moderate verbosity — some explanatory sentences within pruning categories and refinement guidelines could be tightened without losing clarity. The progressive disclosure is adequate for the content length but could benefit from separating detailed criteria into reference files.

Suggestions

Tighten the pruning category descriptions by removing the explanatory second sentences (e.g., 'If the test would pass even with a broken implementation, it's not testing anything useful' — Claude can infer this from the category definition).

Consider extracting the detailed pruning criteria and refinement checklist into a referenced file (e.g., TEST_REVIEW_CRITERIA.md) to keep SKILL.md as a lean overview.
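The criterion quoted in the first suggestion (a test that still passes against a broken implementation is not testing anything useful) can be illustrated with a minimal Python sketch. Everything here is hypothetical and not taken from the skill itself: `apply_discount`, its deliberately broken variant, and the two checks are invented for illustration.

```python
def apply_discount(price, percent):
    """Hypothetical function under test: reduce price by a percentage."""
    return price * (100 - percent) / 100


def broken_apply_discount(price, percent):
    """Deliberately broken variant that ignores the discount."""
    return price


def weak_test(impl):
    # Only checks that a number comes back, not that it is the right number.
    assert isinstance(impl(200, 25), (int, float))


def strong_test(impl):
    # Pins the expected value, so a wrong calculation fails.
    assert impl(200, 25) == 150


def passes(test, impl):
    """Return True if the given test passes against the given implementation."""
    try:
        test(impl)
        return True
    except AssertionError:
        return False
```

Under these assumptions, `passes(weak_test, broken_apply_discount)` is `True` while `passes(strong_test, broken_apply_discount)` is `False`, which marks the weak test as a pruning candidate.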

Dimension	Reasoning	Score
Conciseness	The content is mostly efficient and well-structured, but some explanations are slightly verbose — e.g., the extended descriptions of each pruning category include explanatory sentences that Claude could infer. The examples within categories (like the 10,000-character input example) add clarity but also add length.	2 / 3
Actionability	The skill provides concrete, specific criteria for pruning (with clear examples of what constitutes trivial/weak/brittle tests), specific assertion guidance (e.g., `assertEquals` over `assertTrue`, `assertDatabaseHas`), naming conventions, and an executable test command. The guidance is directly actionable.	3 / 3
Workflow Clarity	The 5-step workflow is clearly sequenced: identify → prune → refine → fill gaps → run tests. Step 5 includes an explicit validation checkpoint with a feedback loop ('If any test fails, fix it'), and the output summary step provides a verification checklist.	3 / 3
Progressive Disclosure	The content is well-organized with clear headers and sub-sections, but it's a moderately long monolithic file. The pruning categories and refinement sub-sections could potentially be split into referenced files for a cleaner overview, though the current length is borderline acceptable.	2 / 3
	Total	10 / 12 Passed
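The Actionability row cites the skill's preference for `assertEquals` over `assertTrue`. Those names look like PHPUnit/Laravel-style assertions. The sketch below uses Python's `unittest` equivalents (`assertEqual`, `assertTrue`) as a stand-in, not the skill's own code, to show why the more specific assertion is usually preferred: its failure message reports the values involved. The `actual` value is made up for the example.

```python
import unittest


def failure_message(check):
    """Run a check that is expected to fail and return its assertion message."""
    try:
        check()
    except AssertionError as exc:
        return str(exc)
    return None


# A bare TestCase instance gives access to the assertion helpers.
case = unittest.TestCase()
actual = "inactive"  # hypothetical value produced by the code under test

# Vague: the comparison collapses to a boolean before the assertion sees it.
vague = failure_message(lambda: case.assertTrue(actual == "active"))

# Specific: the assertion receives both values and can report them.
specific = failure_message(lambda: case.assertEqual(actual, "active"))
```

Here `vague` only says that `False` is not true, whereas `specific` names both the actual and the expected string, so a failing run already points at what went wrong.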

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: figlabhq/agent-skills
Commit: 533a35f

Reviewed: 24 days ago
