Validate test effectiveness with mutation testing using Stryker (TypeScript/JavaScript with Vitest or bun test via @hughescr/stryker-bun-runner) and mutmut (Python). Find weak tests that pass despite code mutations. Use to improve test quality.
89
87%
Does it follow best practices?
Impact
93%
1.43x
Average score across 3 eval scenarios
Passed
No known issues
Quality
Discovery
82%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description with excellent specificity and trigger term coverage for the mutation testing domain. It clearly identifies the tools, languages, and purpose. The main weakness is the implicit 'when' clause - 'Use to improve test quality' doesn't provide explicit trigger scenarios that would help Claude know when to select this skill.
Suggestions
Add an explicit 'Use when...' clause with trigger scenarios, e.g., 'Use when the user asks about mutation testing, wants to evaluate test suite effectiveness, or mentions Stryker or mutmut.'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'Validate test effectiveness', 'mutation testing', specific tools (Stryker, mutmut), specific frameworks (Vitest, bun test), and the outcome 'Find weak tests that pass despite code mutations'. | 3 / 3 |
| Completeness | Clearly answers 'what' (mutation testing with specific tools to find weak tests) but the 'when' is only implied with 'Use to improve test quality' - lacks explicit trigger guidance like 'Use when the user asks about...' or specific scenarios. | 2 / 3 |
| Trigger Term Quality | Includes natural keywords users would say: 'mutation testing', 'Stryker', 'mutmut', 'TypeScript', 'JavaScript', 'Python', 'Vitest', 'bun test', 'test quality', 'weak tests'. Good coverage of tool names and language-specific terms. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with specific niche: mutation testing is a specialized domain, and naming specific tools (Stryker, mutmut) and frameworks creates clear boundaries that won't conflict with general testing or code quality skills. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation
92%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a high-quality skill that provides comprehensive, actionable guidance for mutation testing across multiple languages and tools. The content is well-structured with executable examples, clear workflows, and practical patterns for improving weak tests. A minor improvement would be to split detailed tool configurations into separate reference files for better progressive disclosure.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is lean and efficient, providing only necessary information without explaining concepts Claude already knows. Every section delivers actionable content without padding or unnecessary context. | 3 / 3 |
| Actionability | Provides fully executable code examples for installation, configuration, and running commands. The before/after test examples are copy-paste ready and demonstrate concrete improvements. | 3 / 3 |
| Workflow Clarity | Clear numbered workflow at the end with explicit steps: ensure coverage → run mutation testing → check report → fix survived mutants → re-run incrementally. The sequence is logical with clear checkpoints. | 3 / 3 |
| Progressive Disclosure | Content is well-organized with clear sections, but the skill is fairly long (~200 lines) and could benefit from splitting detailed configuration examples into separate reference files. The 'See Also' section provides good cross-references. | 2 / 3 |
| Total | | 11 / 12 Passed |
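
The "fix survived mutants" step in the workflow above can be illustrated with a minimal sketch. Everything in it is hypothetical: `is_adult`, the hand-written mutant, and the two test helpers are not taken from the skill. The mutant is written by hand rather than generated by Stryker or mutmut. It only shows why a test that avoids boundary values lets a mutated comparison survive, and how adding the boundary case kills it.

```python
# Hypothetical function under test (an assumption for illustration only).
def is_adult(age: int) -> bool:
    return age >= 18


# Hand-written mutant of the kind a mutation tool might generate:
# the boundary operator `>=` is replaced with `>`.
def is_adult_mutant(age: int) -> bool:
    return age > 18


def weak_test(fn) -> bool:
    """Checks only values far from the boundary."""
    return fn(30) is True and fn(5) is False


def strong_test(fn) -> bool:
    """Adds the boundary value 18, which distinguishes `>=` from `>`."""
    return weak_test(fn) and fn(18) is True


# The weak test passes on the mutant, so the mutant "survives".
weak_passes_on_mutant = weak_test(is_adult_mutant)
# The strong test fails on the mutant, so the mutant is "killed".
strong_passes_on_mutant = strong_test(is_adult_mutant)
```

Both tests pass on the original `is_adult`; only the strong test tells the original apart from the mutant, which is the signal a mutation testing report surfaces.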
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| Total | 10 / 11 Passed | |
90d6bd7