Design test strategy using Beck's Test Desiderata — which properties matter, which tradeoffs to make. Use when the user asks "how should I test this", "what tests do I need", "review my test strategy", "is this well-tested", or when planning tests for a new feature or refactor.
90
87%: Does it follow best practices?
Impact: Pending. No eval scenarios have been run.
Passed. No known issues.
Quality
Discovery
89%. Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description that excels at trigger term quality and completeness, with a clear 'Use when' clause containing multiple natural user phrases. The explicit reference to Beck's Test Desiderata makes it distinctive. The main weakness is that the 'what' portion stays abstract ('which properties matter, which tradeoffs to make') and does not name the concrete actions or outputs the skill produces.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (test strategy) and references a specific framework (Beck's Test Desiderata) with some actions ('which properties matter, which tradeoffs to make'), but doesn't list multiple concrete actions like 'evaluate coverage gaps, prioritize test types, recommend test patterns'. | 2 / 3 |
| Completeness | Clearly answers both what ('Design test strategy using Beck's Test Desiderata — which properties matter, which tradeoffs to make') and when ('Use when the user asks...' with explicit trigger phrases and situational triggers). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural phrases users would say: 'how should I test this', 'what tests do I need', 'review my test strategy', 'is this well-tested', plus contextual triggers like 'planning tests for a new feature or refactor'. | 3 / 3 |
| Distinctiveness / Conflict Risk | The specific reference to Beck's Test Desiderata and focus on test strategy/tradeoffs creates a clear niche distinct from general testing skills, code review skills, or test writing/execution skills. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
Implementation
85%. Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong skill that provides a well-structured framework for test strategy design. Its main strengths are the actionable templates, clear workflow, and good use of progressive disclosure with references. The main weakness is moderate verbosity in areas where Claude already has strong knowledge (testing pyramid, common test smells, basic boundary categories), though the Desiderata framework itself adds genuine novel value.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Generally efficient but includes some content Claude already knows (e.g., the testing trophy diagram, basic boundary testing categories like empty/zero/null). The tables and templates are well-structured but some sections like 'Test Smells' cover well-known concepts that could be trimmed. | 2 / 3 |
| Actionability | Provides concrete strategy templates with checklists for pure functions, API endpoints, and UI components. The output format template gives a specific, copy-paste-ready structure for test strategy deliverables. The workflow steps are specific and directive rather than vague. | 3 / 3 |
| Workflow Clarity | The 5-step workflow is clearly sequenced (articulate contract → identify boundaries → choose approach → apply trophy → evaluate existing). Each step has explicit criteria and the 'Confidence Question' serves as a final validation checkpoint. For a strategy/thinking skill (not a destructive operation), this level of workflow clarity is appropriate. | 3 / 3 |
| Progressive Disclosure | Well-organized with clear sections, a reference to `references/desiderata.md` for deeper guidance, and a 'See Also' section linking to related skills. Content is appropriately split between overview tables, workflow steps, templates, and references without being monolithic or deeply nested. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
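To make the first two workflow steps named in the review concrete (articulate the contract, then identify boundaries), here is a minimal Python sketch for a pure function. The `clamp` function, its contract, and the case list are hypothetical illustrations and are not taken from the skill itself.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Hypothetical pure function used only for illustration.

    Contract: return `value` limited to the closed interval [low, high];
    reject an inverted interval.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


# Step 2 of the workflow: one row per boundary implied by the contract.
# Each tuple is (value, low, high, expected).
BOUNDARY_CASES = [
    (5, 0, 10, 5),    # strictly inside the interval
    (0, 0, 10, 0),    # exactly on the lower bound
    (10, 0, 10, 10),  # exactly on the upper bound
    (-1, 0, 10, 0),   # just below the lower bound
    (11, 0, 10, 10),  # just above the upper bound
    (3, 3, 3, 3),     # degenerate interval where low == high
]


def run_boundary_checks() -> int:
    """Assert every boundary case and return how many were checked."""
    for value, low, high, expected in BOUNDARY_CASES:
        assert clamp(value, low, high) == expected
    return len(BOUNDARY_CASES)
```

Calling `run_boundary_checks()` exercises every listed boundary. Keeping the cases in one table means a boundary that was never considered shows up as a missing row, which is easier to spot in review than a missing test function.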
Validation
100%. Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
c3b1fc2