Test strategy, execution, and coverage analysis. Use when designing tests, running test suites, or analyzing test results beyond baseline checks.
Quality
72%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./core/agent-ops-testing/SKILL.md
Quality
Discovery
67%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description has good structure with explicit 'Use when' guidance and covers the core testing domain. However, it lacks specific concrete actions and comprehensive trigger terms that would help Claude confidently select this skill over others. The phrase 'beyond baseline checks' is vague and doesn't clearly define the skill's unique scope.
Suggestions
Add more specific concrete actions like 'generate unit/integration tests, measure code coverage percentages, identify untested code paths, create test fixtures'
Include natural trigger term variations users would say: 'unit tests', 'integration tests', 'TDD', 'test coverage', 'pytest', 'jest', 'mocking'
Clarify what 'beyond baseline checks' means: specify the threshold or the types of advanced testing this skill handles, as opposed to basic testing
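The trigger-term suggestion above can be checked mechanically. The following is a minimal sketch, not part of Tessl's tooling: `missing_trigger_terms` is a hypothetical helper, and plain case-insensitive substring matching is an assumption about what counts as "present". The description and the term list are copied from this review.

```python
def missing_trigger_terms(description: str, terms: list[str]) -> list[str]:
    """Return the terms that do not occur in the description.

    Matching is a case-insensitive substring test (an assumption made for
    this sketch), so a short term can also match inside a longer word.
    """
    lowered = description.lower()
    return [term for term in terms if term.lower() not in lowered]


# Skill description as quoted at the top of this review.
DESCRIPTION = (
    "Test strategy, execution, and coverage analysis. Use when designing tests, "
    "running test suites, or analyzing test results beyond baseline checks."
)

# Trigger terms proposed in the Suggestions list.
SUGGESTED_TERMS = [
    "unit tests", "integration tests", "TDD", "test coverage",
    "pytest", "jest", "mocking",
]

if __name__ == "__main__":
    for term in missing_trigger_terms(DESCRIPTION, SUGGESTED_TERMS):
        print(f"missing: {term}")
```

Re-running the check after editing the description shows which suggested terms are still absent. Because the match is a substring test, a term such as 'jest' would also be found inside an unrelated longer word, so a stricter word-boundary match may be preferable in practice.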
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (testing) and some actions ('designing tests, running test suites, analyzing test results'), but lacks concrete specifics like test types, frameworks, or detailed capabilities such as 'generate unit tests, measure code coverage, identify untested paths'. | 2 / 3 |
| Completeness | Clearly answers both what ('Test strategy, execution, and coverage analysis') and when ('Use when designing tests, running test suites, or analyzing test results beyond baseline checks') with explicit trigger guidance. | 3 / 3 |
| Trigger Term Quality | Includes relevant terms like 'tests', 'test suites', 'test results', and 'coverage analysis', but misses common variations users might say such as 'unit tests', 'integration tests', 'TDD', 'test coverage', 'pytest', 'jest', or 'testing framework'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The phrase 'beyond baseline checks' attempts to differentiate from basic testing, but 'test strategy' and 'execution' are broad enough to potentially overlap with general coding skills or CI/CD skills. The niche could be clearer. | 2 / 3 |
| Total | | 9 / 12, Passed |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a comprehensive testing skill with strong actionability and workflow clarity. The concrete code examples, validation checkpoints, and decision tables make it highly usable. However, it's somewhat verbose for a skill file and could be more concise by removing explanations of testing concepts Claude already knows and splitting detailed reference content into separate files.
Suggestions
Remove or significantly condense the 'Good Tests' and 'Anti-Patterns to Avoid' sections, since Claude already knows these testing fundamentals
Split detailed content like test isolation patterns and coverage analysis into separate reference files (e.g., TEST_ISOLATION.md, COVERAGE.md) and link from the main skill
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably efficient but includes some redundant content like explaining test anti-patterns Claude already knows, and the issue tracking sections have overlapping information between file-based and CLI approaches that could be consolidated. | 2 / 3 |
| Actionability | Provides concrete, executable code examples throughout including pytest commands, forbidden/allowed patterns with real Python code, and specific bash commands. The coverage thresholds table and test isolation examples are copy-paste ready. | 3 / 3 |
| Workflow Clarity | Multi-step processes are clearly sequenced with explicit validation checkpoints. The incremental testing workflow has clear stop-diagnose-fix steps, failure investigation has categorization tables with actions, and coverage analysis includes enforcement blocks with pass/fail criteria. | 3 / 3 |
| Progressive Disclosure | Content is well-organized with clear sections, but it's a long monolithic document that could benefit from splitting detailed sections (like test isolation patterns, coverage analysis) into separate reference files. References to other skills like 'agent-ops-debugging' and 'agent-ops-tasks' are appropriately signaled. | 2 / 3 |
| Total | | 10 / 12, Passed |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
2bbaa03