test-quality-analysis

Detect test smells, overmocking, flaky tests, and coverage issues. Analyze test effectiveness, maintainability, and reliability. Use when reviewing tests or improving test quality.

Quality

82%

Does it follow best practices?

Security

Passed

No known issues

Quality

Content

72%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

A well-organized, highly actionable reference with concrete code examples. It loses points on conciseness (redundant sections and obvious-concept glosses) and workflow clarity (no explicit validation-gated sequence).

Suggestions

Remove the 'Core Dimensions' gloss (concepts Claude already knows) or compress it to a one-line reference, and merge the duplicate 'Common Anti-Patterns' overmocking/implementation-detail material into 'Test Smells' to cut redundancy.

Add an explicit review workflow with checkpoints, e.g. '1. Run coverage 2. Identify smells 3. Flag flaky tests 4. Re-run to confirm green' to give the analysis a validation-gated sequence.
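One way to picture the validation-gated sequence suggested above is a small runner that executes the review steps in order and stops at the first step that fails its check. This is only a sketch: the names `Step`, `StepResult`, and `runGatedReview` are hypothetical and do not come from the skill, and the step bodies are stubs standing in for real commands such as a coverage run.

```typescript
// Hypothetical sketch of a validation-gated review sequence.
// None of these names come from the reviewed skill.

type StepResult = { ok: boolean; detail: string };

type Step = {
  name: string;
  // In real use this would invoke a tool (coverage run, smell scan, etc.).
  run: () => StepResult;
};

type ReviewReport = {
  completed: string[];     // steps that passed their gate, in order
  failedAt: string | null; // first step that failed, if any
  detail: string | null;   // detail reported by the failing step
};

// Run steps in order; a failing step acts as a gate and halts the sequence.
function runGatedReview(steps: Step[]): ReviewReport {
  const completed: string[] = [];
  for (const step of steps) {
    const result = step.run();
    if (!result.ok) {
      return { completed, failedAt: step.name, detail: result.detail };
    }
    completed.push(step.name);
  }
  return { completed, failedAt: null, detail: null };
}

// Stubbed steps mirroring the four checkpoints named in the suggestion.
const exampleSteps: Step[] = [
  { name: "run coverage", run: () => ({ ok: true, detail: "stub: threshold met" }) },
  { name: "identify smells", run: () => ({ ok: true, detail: "stub: smells listed" }) },
  { name: "flag flaky tests", run: () => ({ ok: false, detail: "stub: flaky test found" }) },
  { name: "re-run to confirm green", run: () => ({ ok: true, detail: "stub: suite green" }) },
];

const report = runGatedReview(exampleSteps);
console.log(report.failedAt); // the stubbed third step fails, so the final re-run never executes
```

Stopping at the first failure is a design choice, not something the review prescribes. A reviewer who wants the validate-then-retry loop mentioned under Workflow Clarity could fix the reported issue and call the runner again until `failedAt` is `null`.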

Dimension	Reasoning	Score
Conciseness	The body is mostly lean BAD/GOOD code pairs, but 'Core Dimensions' glosses concepts Claude already knows ('Tests verify the right behavior') and 'Common Anti-Patterns' duplicates the Overmocking/implementation-detail material from 'Test Smells'; not 3 due to this redundancy, not 1 because it avoids library tutorials and padding.	2 / 3
Actionability	Provides fully executable TypeScript BAD/GOOD examples and concrete commands ('bun test --coverage --coverage.thresholds.lines=80', 'uv run pytest --cov --cov-report=html') plus copy-paste checklists; not 2 because the code is real and executable rather than pseudocode.	3 / 3
Workflow Clarity	Content is reference-style with an implicit Problem/Detection/Fix pattern and checklists, but no explicit numbered workflow with validation checkpoints; not 1 because structure implies a sequence, not 3 because there is no explicit validate-then-retry feedback loop.	2 / 3
Progressive Disclosure	A single self-contained file with well-organized sections (Test Smells, Analysis Tools, Best Practices, Code Review Checklist, See Also) and a flat See-Also pointer to sibling skills; no nested references or monolithic wall, so navigation is easy.	3 / 3
	Total	10 / 12 Passed

Description

92%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A strong description that is specific, action-oriented, and includes an explicit 'Use when' trigger clause. Its main weakness is overlap risk with sibling testing skills referenced in the body.

Suggestions

Sharpen the trigger to distinguish from sibling testing skills, e.g. 'Use when reviewing existing tests for quality issues (smells, flakiness, overmocking), not when writing new tests.'

Add a common user phrasing like 'test coverage' or 'unit tests' alongside 'coverage issues' to broaden natural trigger terms.

Dimension	Reasoning	Score
Specificity	Lists multiple concrete actions ('Detect test smells, overmocking, flaky tests, and coverage issues', 'Analyze test effectiveness, maintainability, and reliability') matching the comprehensive-action anchor; above 2 because it enumerates specific actions rather than naming only a domain.	3 / 3
Completeness	Explicitly answers both what ('Detect test smells...') and when ('Use when reviewing tests or improving test quality'), satisfying the explicit-trigger anchor; not 2 because the 'Use when' clause is present.	3 / 3
Trigger Term Quality	Includes natural user terms like 'flaky tests', 'overmocking', 'coverage issues', 'reviewing tests', and 'improving test quality'; solidly above 2's 'missing common variations' bar though slightly less exhaustive than the ideal example.	3 / 3
Distinctiveness / Conflict Risk	The test-quality framing ('test smells, overmocking, flaky tests') is a distinct niche, but the body's 'See Also' names sibling testing skills (mutation-testing, vitest-testing) with overlapping triggers, risking conflicts; not 3 due to that overlap, not 1 because it is not generic.	2 / 3
	Total	11 / 12 Passed

Validation

93%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 15 / 16 Passed

Validation for skill structure

Criteria	Description	Result
allowed_tools_field	'allowed-tools' contains unusual tool name(s)	Warning
	Total	15 / 16 Passed

Repository: secondsky/claude-skills
Commit: c4889f6

Reviewed: 1 day ago
