Validate test effectiveness with mutation testing using Stryker (TypeScript/JavaScript) and mutmut (Python). Find weak tests that pass despite code mutations. Use to improve test quality.
Install with Tessl CLI
npx tessl i github:secondsky/claude-skills --skill mutation-testing
86
Does it follow best practices?
If you maintain this skill, you can automatically optimize it using the tessl CLI to improve its score:
npx tessl skill review --optimize ./path/to/skill
Discovery
67%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description has strong specificity with concrete tools and actions, and is highly distinctive in its niche. However, it falls short on completeness by lacking an explicit 'Use when...' clause with clear trigger scenarios, and could benefit from additional natural trigger terms users might employ when seeking mutation testing help.
Suggestions
Add an explicit 'Use when...' clause with trigger scenarios, e.g., 'Use when the user asks about mutation testing, wants to evaluate test suite strength, or mentions Stryker or mutmut'.
Include additional natural trigger terms like 'test coverage gaps', 'mutation score', 'test suite quality', or 'surviving mutants' that users might naturally say.
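Both suggestions can be checked mechanically. The sketch below is a hypothetical helper, not part of the Tessl CLI and not how Tessl scores descriptions: it reports whether a description contains an explicit "Use when" clause and which of the suggested trigger terms are still absent. The function name `review_description` and the term list are assumptions made for illustration.

```python
# Hypothetical helper for illustration only; not the Tessl scoring logic.
SUGGESTED_TERMS = [
    "test coverage gaps",
    "mutation score",
    "test suite quality",
    "surviving mutants",
]


def review_description(description: str) -> dict:
    """Report a missing 'Use when' clause and any absent trigger terms."""
    lowered = description.lower()
    return {
        "has_use_when": "use when" in lowered,
        "missing_terms": [t for t in SUGGESTED_TERMS if t not in lowered],
    }


# The skill's current description, copied from the top of this page.
current = (
    "Validate test effectiveness with mutation testing using Stryker "
    "(TypeScript/JavaScript) and mutmut (Python). Find weak tests that pass "
    "despite code mutations. Use to improve test quality."
)
report = review_description(current)
print(report)
```

Run against the current description, this kind of check flags the same two gaps the review describes: "Use to improve test quality" is not an explicit "Use when" clause, and none of the suggested terms appear.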
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists specific concrete actions: 'Validate test effectiveness', 'mutation testing', names specific tools (Stryker, mutmut), specifies languages (TypeScript/JavaScript, Python), and describes the outcome 'Find weak tests that pass despite code mutations'. | 3 / 3 |
| Completeness | Clearly answers 'what' (validate test effectiveness with mutation testing tools) but the 'when' clause 'Use to improve test quality' is weak and implied rather than explicit with trigger scenarios like 'Use when the user asks about mutation testing, wants to evaluate test suite quality, or mentions Stryker/mutmut'. | 2 / 3 |
| Trigger Term Quality | Includes relevant terms like 'mutation testing', 'Stryker', 'mutmut', 'test quality', 'test effectiveness', and 'weak tests', but is missing common variations users might say, like 'test coverage', 'kill mutants', or 'mutation score'. | 2 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive with specific tool names (Stryker, mutmut) and the niche domain of mutation testing. Unlikely to conflict with general testing skills or code quality tools. | 3 / 3 |
| Total | | 10 / 12 Passed |
Implementation
100%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is an excellent skill file that efficiently covers mutation testing for two ecosystems with concrete, executable examples. The structure progresses logically from concepts to implementation to workflow, with clear before/after examples demonstrating weak vs strong tests. The content respects Claude's intelligence while providing all necessary specifics for immediate use.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is lean and efficient, avoiding explanations of concepts Claude already knows. Each section provides direct, actionable information without padding or unnecessary context about what mutation testing is philosophically. | 3 / 3 |
| Actionability | Provides fully executable code examples for both TypeScript/JavaScript and Python ecosystems. Installation commands, configuration files, and running commands are all copy-paste ready with concrete examples showing weak vs strong tests. | 3 / 3 |
| Workflow Clarity | The workflow section at the end provides a clear numbered sequence with explicit checkpoints (ensure coverage first, run mutation testing, check report, fix, re-run incrementally). The iterative improvement loop is well-defined. | 3 / 3 |
| Progressive Disclosure | Content is well-organized with clear sections progressing from core concepts to tool-specific instructions to best practices. References to related skills (vitest-testing, test-quality-analysis) are clearly signaled at the end without deep nesting. | 3 / 3 |
| Total | | 12 / 12 Passed |
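The review credits the skill with weak-versus-strong test examples but does not reproduce them here. As a minimal sketch of the idea (not taken from the skill; all function names are hypothetical), the snippet below hand-writes one mutant of a small function. A weak test passes against both versions, so the mutant survives; a boundary test fails against the mutant, so it is killed. Tools such as mutmut and Stryker generate and run such mutants automatically rather than by hand.

```python
def is_adult(age: int) -> bool:
    """Original implementation."""
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    """Hand-written mutant: '>=' replaced by '>'."""
    return age > 18


def weak_test(fn) -> bool:
    # Only checks values far from the boundary.
    return fn(30) is True and fn(5) is False


def strong_test(fn) -> bool:
    # Adds the boundary values, where the original and the mutant differ.
    return weak_test(fn) and fn(18) is True and fn(17) is False


def survives(test, mutant) -> bool:
    """A mutant survives when the test still passes against it."""
    return test(mutant)


print("weak test passes on original:", weak_test(is_adult))
print("mutant survives weak test:", survives(weak_test, is_adult_mutant))
print("mutant survives strong test:", survives(strong_test, is_adult_mutant))
```

A surviving mutant points at an assertion that is missing, here the check at `age == 18`.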
Validation
68%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 16 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| description_trigger_hint | Description may be missing an explicit 'when to use' trigger hint (e.g., 'Use when...') | Warning |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| metadata_version | 'metadata' field is not a dictionary | Warning |
| license_field | 'license' field is missing | Warning |
| body_steps | No step-by-step structure detected (no ordered list); consider adding a simple workflow | Warning |
| Total | 11 / 16 Passed | |
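Several of these warnings describe structural properties that can be checked by hand before re-running the review. The sketch below is an illustration under stated assumptions, not the Tessl validator: it takes already-parsed frontmatter as a dict plus the body text, and reports three of the criteria named above (`license_field`, `metadata_version`, `body_steps`). The function name, the input shapes, and the ordered-list regex are all hypothetical, and the real checks may differ.

```python
import re


def check_skill(frontmatter: dict, body: str) -> list[str]:
    """Return the names of the criteria that would raise a warning."""
    warnings = []
    if "license" not in frontmatter:
        warnings.append("license_field")
    # Assumption: a missing 'metadata' key is treated like a non-dict value.
    if not isinstance(frontmatter.get("metadata"), dict):
        warnings.append("metadata_version")
    # Assumption: any line numbered like "1. step" counts as an ordered list.
    if not re.search(r"^\s*\d+\.\s+\S", body, flags=re.MULTILINE):
        warnings.append("body_steps")
    return warnings


# Hypothetical inputs that reproduce three of the warnings in the table.
example_frontmatter = {"name": "mutation-testing", "metadata": "1.0"}
example_body = "## Workflow\nRun the tests, then run the mutation tool.\n"
print(check_skill(example_frontmatter, example_body))
```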