Run unit tests, integration tests, or slow integration tests matching CI. Use to validate changes before submitting a PR.
88
82%
Does it follow best practices?
Impact
100%
1.13x (average score across 3 eval scenarios)
Passed
No known issues
Quality
Discovery
77%. Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid description that clearly communicates what the skill does and when to use it. It names specific test types and provides a clear use-case trigger. Minor improvements could include additional natural trigger terms and slightly more distinctive language to reduce potential overlap with other pre-PR validation skills.
Suggestions
- Add more natural trigger term variations such as 'test suite', 'run tests', 'pytest', or 'make test' to improve discoverability.
- Strengthen distinctiveness by clarifying what differentiates this from other validation/CI skills, e.g., 'Use when the user asks to run tests, execute test suites, or check test results before submitting a PR.'
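As a rough sketch of how these two suggestions might be combined, the skill's frontmatter description could be revised along the following lines. The `name` value is a hypothetical placeholder (the skill's actual name is not shown here), and the wording is an illustrative assumption, not the maintainer's text. Tool-specific terms such as 'pytest' or 'make test' are left out because they would only help if the project actually uses those tools.

```yaml
# Hypothetical frontmatter revision; `name` is a placeholder, not the real skill name.
name: run-tests
description: >-
  Run unit tests, integration tests, or slow integration tests matching CI.
  Use when the user asks to run tests, execute the test suite, or check test
  results before submitting a PR.
```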
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'Run unit tests, integration tests, or slow integration tests matching CI.' These are distinct, well-defined testing categories. | 3 / 3 |
| Completeness | Clearly answers both what ('Run unit tests, integration tests, or slow integration tests matching CI') and when ('Use to validate changes before submitting a PR'), providing an explicit trigger scenario. | 3 / 3 |
| Trigger Term Quality | Includes relevant terms like 'unit tests', 'integration tests', 'CI', and 'PR', which users would naturally say. However, it misses common variations like 'test suite', 'run tests', 'check tests', 'pytest', or 'test runner' that users might use. | 2 / 3 |
| Distinctiveness / Conflict Risk | The testing focus is fairly specific, but 'validate changes before submitting a PR' could overlap with linting, code review, or other pre-PR validation skills. The mention of specific test types (unit, integration, slow integration) helps but doesn't fully eliminate overlap risk. | 2 / 3 |
| Total | 10 / 12 Passed | |
Implementation
87%. Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, well-structured skill that provides exact, CI-matching test commands with useful context about JDK versions, OS matrices, and test conventions. The main weakness is the lack of a recommended pre-PR workflow sequence tying the commands together with validation checkpoints, which would better serve the stated purpose of validating changes before submitting a PR.
Suggestions
- Add a brief recommended workflow at the top (e.g., '1. Run unit tests → 2. Run integration tests if touching integration code → 3. Verify all pass before PR') to connect the individual commands into a coherent pre-PR validation sequence.
- Include guidance on interpreting test failures or common failure patterns, especially given the CI retry behavior that can mask flaky tests.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Every section earns its place: exact commands, CI matrix details Claude wouldn't know, and concise tables for conventions. No unnecessary explanations of what Maven or JUnit are. | 3 / 3 |
| Actionability | All commands are fully executable and copy-paste ready with exact flags matching CI. The module-specific test pattern with placeholder is clear and actionable. | 3 / 3 |
| Workflow Clarity | Commands are clearly organized by test type with good CI context, but there's no validation/verification guidance, e.g., what to check after running tests, how to interpret failures, or a recommended sequence before submitting a PR. For a skill described as 'validate changes before submitting a PR,' a workflow connecting these steps would strengthen it. | 2 / 3 |
| Progressive Disclosure | Content is well-organized with clear sections and a reference to the CI workflow file. For a skill of this size (~70 lines) with no bundle files, the structure is appropriate: tables and headers make navigation easy without needing external files. | 3 / 3 |
| Total | 11 / 12 Passed | |
Validation
90%. Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
bf0fe4b