Generate comprehensive, maintainable unit tests across languages with strong coverage and edge case focus.
Quality: 44%. Does it follow best practices?

Impact: Pending. No eval scenarios have been run.

Status: Passed. No known issues.
Discovery: 32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description communicates the core purpose of generating unit tests but lacks explicit trigger guidance ('Use when...'), specific concrete actions, and natural keyword variations that users would employ. It reads more like a tagline than a functional skill description that would help Claude reliably select it from a large skill library.
Suggestions
Add an explicit 'Use when...' clause with trigger scenarios, e.g., 'Use when the user asks to write tests, create unit tests, add test coverage, or generate test files for existing code.'
Include natural keyword variations users would say: 'test file', 'write tests', 'test suite', 'pytest', 'jest', 'JUnit', 'spec', 'TDD', 'mock', '.test.js', '_test.py'.
List more specific concrete actions, e.g., 'Generates test functions with assertions, creates mocks and stubs for dependencies, covers edge cases and error paths, and structures test files following framework conventions.'
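Putting the three suggestions together, a revised description might read as follows. This is illustrative wording assembled from the suggestions above, not the skill's actual frontmatter:

```yaml
description: >
  Generate comprehensive, maintainable unit tests across languages with
  strong coverage and edge-case focus. Use when the user asks to write
  tests, create unit tests, add test coverage, or generate a test file,
  test suite, or spec for existing code (pytest, jest, JUnit, TDD, mocks,
  .test.js, _test.py). Generates test functions with assertions, creates
  mocks and stubs for dependencies, covers edge cases and error paths,
  and structures test files following framework conventions.
```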
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (unit testing) and some qualities (comprehensive, maintainable, strong coverage, edge case focus), but doesn't list specific concrete actions like 'generate test stubs, mock dependencies, create assertion blocks, produce test reports'. | 2 / 3 |
| Completeness | Describes what it does (generate unit tests) but has no explicit 'Use when...' clause or equivalent trigger guidance. Per the rubric, missing trigger guidance alone caps completeness at 2, and since the 'when' is entirely absent, the score drops to 1. | 1 / 3 |
| Trigger Term Quality | Includes 'unit tests' and 'coverage', which are natural terms, but misses common variations users would say, like 'test file', 'write tests', 'testing', 'test suite', 'TDD', 'pytest', 'jest', 'spec', or specific framework names. | 2 / 3 |
| Distinctiveness / Conflict Risk | The focus on unit tests is somewhat specific, but 'across languages' is broad and could overlap with general code generation skills or other testing-related skills (integration tests, E2E tests). Lacks distinct triggers to differentiate it clearly. | 2 / 3 |
| Total | | 7 / 12 (Passed) |
Implementation: 27%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is overly verbose and takes the wrong approach—it builds a meta-tool for generating tests rather than teaching Claude how to write good tests directly. Claude already knows how to write pytest, Jest, and React Testing Library tests; the skill should focus on project-specific conventions, edge case strategies, and quality standards rather than reimplementing a test generator. The lack of validation steps (running generated tests, verifying they pass) and the monolithic structure further weaken its utility.
Suggestions
Replace the meta-tool approach (TestGenerator class) with concise patterns showing what good tests look like—specific examples of edge cases, assertion quality, and test organization conventions Claude should follow.
Add a validation workflow: after generating tests, run them, verify they pass, check coverage, and iterate on failures before presenting results.
Split language-specific test patterns into separate referenced files (e.g., PYTEST_PATTERNS.md, JEST_PATTERNS.md) and keep the main skill as a concise overview with navigation.
Remove explanations of concepts Claude already knows (what mocks are, how pytest fixtures work) and focus on project-specific conventions or non-obvious patterns.
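As a sketch of the 'concise patterns' approach these suggestions describe, a PYTEST_PATTERNS.md file could contain compact examples like the following. The `parse_price` function is hypothetical, invented here purely to illustrate specific assertions and error-path coverage:

```python
import pytest

def parse_price(text: str) -> float:
    """Hypothetical function under test: '$1,234.50' -> 1234.5."""
    cleaned = text.strip().lstrip("$").replace(",", "")
    if not cleaned:
        raise ValueError("empty price")
    return float(cleaned)

# Specific assertions beat placeholders like `assert result is not None`.
def test_parses_formatted_price():
    assert parse_price("$1,234.50") == 1234.5

# Edge cases and error paths get their own parametrized tests.
@pytest.mark.parametrize("raw", ["", "   ", "$"])
def test_rejects_empty_input(raw):
    with pytest.raises(ValueError):
        parse_price(raw)
```

Patterns at this level show Claude the expected assertion quality and test organization without re-teaching what pytest itself does.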
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose, with large code blocks that are essentially scaffolding/boilerplate Claude already knows how to produce. The skill explains concepts like what mocks are, includes full class implementations for test generation that Claude doesn't need, and the 'Context' and 'Use this skill when' sections add little value. Most of this content covers things Claude inherently knows how to do. | 1 / 3 |
| Actionability | The code examples are somewhat concrete, but they are meta-code (code that generates tests) rather than directly actionable test patterns. The TestGenerator class is a framework for generating tests programmatically, but a skill should teach Claude how to write tests directly, not build a test generation tool. The generated test examples are generic, with placeholder assertions like 'assert result is not None'. | 2 / 3 |
| Workflow Clarity | Steps are numbered and sequenced (analyze → generate → coverage → mocks), but there are no validation checkpoints: no feedback loop for verifying that generated tests actually pass, no step to run the tests and fix failures, and the coverage analysis section doesn't connect back to iterating on test quality. For a test generation workflow, missing the 'run tests and verify they pass' step is a significant gap. | 2 / 3 |
| Progressive Disclosure | A monolithic wall of code with no references to external files. All language-specific test generation (Python, JavaScript, TypeScript, React, coverage analysis, mocking) is crammed into one file. The content would benefit enormously from splitting language-specific patterns into separate reference files, keeping a concise overview in the main skill. | 1 / 3 |
| Total | | 6 / 12 (Passed) |
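The validation loop flagged as missing above can be sketched in a few lines. The commands here are stand-ins; a real workflow would invoke the project's actual test runner (e.g. `pytest -q`) and feed failure output back into a revision step:

```python
import subprocess
import sys

def run_tests(cmd: list[str]) -> bool:
    """Run a test command; True means the suite passed (exit code 0)."""
    return subprocess.run(cmd, capture_output=True).returncode == 0

# Stand-in commands for illustration. A real loop would run the project's
# test runner, and on failure, revise the generated tests and retry
# before presenting results.
assert run_tests([sys.executable, "-c", "assert 1 + 1 == 2"])
assert not run_tests([sys.executable, "-c", "assert 1 + 1 == 3"])
```

The point is not the helper itself but the checkpoint: generated tests are only done once they have actually been run and pass.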
Validation: 100% (11 / 11 checks passed)

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation for skill structure
No warnings or errors.