Automatically generate regression tests for Java codebases by analyzing changes between old and new code versions. Use when users need to: (1) Generate tests after refactoring or code changes, (2) Ensure previously tested behavior still works in new versions, (3) Cover modified or newly added code paths, (4) Migrate existing tests to work with updated APIs or signatures, (5) Maintain test coverage during code evolution. Supports JUnit and TestNG frameworks with unit tests, parameterized tests, and exception testing patterns.
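As a concrete illustration of the patterns the description names (unit tests, parameterized tests, exception testing), a generated regression test might look like the sketch below. `PriceCalculator` and its behavior are hypothetical stand-ins for code that changed between versions, and JUnit 5 is assumed:

```java
import static org.junit.jupiter.api.Assertions.*;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;

// Hypothetical class under test; stands in for code that changed between versions.
class PriceCalculator {
    int applyDiscount(int cents, int percent) {
        if (percent < 0 || percent > 100) throw new IllegalArgumentException("percent out of range");
        return cents - cents * percent / 100;
    }
}

class PriceCalculatorRegressionTest {
    private final PriceCalculator calc = new PriceCalculator();

    @Test
    void preservesOldBehaviorForTypicalInput() {
        // Expected value pinned from the old version's observed output
        assertEquals(900, calc.applyDiscount(1000, 10));
    }

    @ParameterizedTest
    @CsvSource({"1000, 0, 1000", "1000, 100, 0", "333, 50, 167"})
    void coversBoundaryDiscounts(int cents, int percent, int expected) {
        assertEquals(expected, calc.applyDiscount(cents, percent));
    }

    @Test
    void rejectsInvalidPercent() {
        assertThrows(IllegalArgumentException.class, () -> calc.applyDiscount(1000, 101));
    }
}
```

The same three patterns map directly onto TestNG (`@Test`, `@DataProvider`, `expectedExceptions`), so a skill supporting both frameworks can reuse one analysis pass per changed method.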
Score: 85

Quality: 86%. Does it follow best practices?
Impact: 76%. 1.00x average score across 3 eval scenarios. Passed; no known issues.

Quality

Discovery: 100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that hits all the marks. It provides specific concrete actions, comprehensive trigger scenarios with an explicit 'Use when' clause, natural keywords developers would use, and a clearly defined niche (Java regression testing with version comparison). The description uses proper third-person voice throughout and balances detail with clarity.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: 'generate regression tests', 'analyzing changes between old and new code versions', 'migrate existing tests', 'cover modified or newly added code paths'. Also specifies frameworks (JUnit, TestNG) and test patterns (unit tests, parameterized tests, exception testing). | 3 / 3 |
| Completeness | Clearly answers both what ('generate regression tests for Java codebases by analyzing changes') and when, with an explicit 'Use when users need to:' clause followed by five specific trigger scenarios. The numbered list provides comprehensive guidance on when to select this skill. | 3 / 3 |
| Trigger Term Quality | Includes natural keywords users would say: 'regression tests', 'Java', 'refactoring', 'code changes', 'test coverage', 'JUnit', 'TestNG', 'APIs', 'signatures'. Good coverage of terms developers naturally use when discussing testing after code modifications. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clear niche focusing specifically on regression testing for Java with version comparison. The combination of 'regression tests', 'old and new code versions', and Java-specific frameworks (JUnit/TestNG) creates a distinct trigger profile unlikely to conflict with general testing or other language skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation: 72%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured skill with excellent actionability through comprehensive, executable Java test examples. The main weaknesses are moderate verbosity (some redundant explanations and generic tips) and missing explicit validation steps in the workflow for verifying generated tests compile and run correctly. The progressive disclosure is handled well with clear references to detailed pattern files.
Suggestions

- Add an explicit validation step in the workflow (Step 6) that instructs Claude to verify generated tests compile and run against the new code before presenting them.
- Remove or condense the 'Tips' section; items like 'Analyze changes carefully' and 'Maintain readability' are generic advice Claude already knows.
- Trim the Overview section, which restates the skill's purpose multiple times; the 'How to Use' section already covers this.
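The first suggestion, verifying that generated tests at least compile before presenting them, can be sketched with the JDK's built-in `javax.tools` compiler API. `TestCompileCheck` and the sample source are illustrative, not part of the skill:

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TestCompileCheck {
    /** Returns true if the given test source file compiles cleanly. */
    static boolean compiles(Path source) {
        // Null when running on a JRE without compiler support (needs a JDK)
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        // run() returns 0 on success; diagnostics go to stderr by default
        return compiler != null && compiler.run(null, null, null, source.toString()) == 0;
    }

    /** Writes a trivial generated test to a temp dir and checks it compiles. */
    static boolean sampleCompiles() {
        try {
            Path dir = Files.createTempDirectory("gen-tests");
            Path src = dir.resolve("GeneratedTest.java");
            Files.writeString(src, "public class GeneratedTest { void t() { int x = 1 + 1; } }");
            return compiles(src);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(sampleCompiles() ? "OK" : "COMPILE FAILED");
    }
}
```

A real validation step would compile against the new version's classpath and then actually run the tests (for example via the build tool), but a compile check alone already catches migrated tests that reference removed or renamed APIs.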
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is moderately efficient but includes some unnecessary explanation. The overview section restates what the skill does multiple times, and some sections like 'Tips' contain generic advice Claude already knows (e.g., 'Analyze changes carefully', 'Maintain readability'). | 2 / 3 |
| Actionability | Excellent actionability with fully executable Java code examples throughout. Each example shows complete, copy-paste ready test code with proper imports, annotations, and assertions. The before/after code comparisons with generated tests are concrete and immediately usable. | 3 / 3 |
| Workflow Clarity | The 5-step workflow is clearly sequenced, but lacks explicit validation checkpoints. There is no verification step to ensure generated tests compile or pass, and no feedback loop for handling test generation failures. Step 7 in Tips mentions 'Verify compilation', but this should be an explicit workflow step. | 2 / 3 |
| Progressive Disclosure | Good structure with clear overview, detailed examples inline, and appropriate references to external files (change_patterns.md, test_patterns.md) for deeper content. The 'Load these references when' section clearly signals when to access additional materials. | 3 / 3 |
| Total | | 10 / 12 Passed |
Validation: 90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
10 / 11 checks passed.
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (554 lines); consider splitting into references/ and linking | Warning |
| Total | 10 / 11 Passed | |