mutation-testing

Validate test effectiveness with mutation testing using Stryker (TypeScript/JavaScript with Vitest or bun test via @hughescr/stryker-bun-runner) and mutmut (Python). Find weak tests that pass despite code mutations. Use to improve test quality.

Quality

73%

Does it follow best practices?

Impact

93%

1.43x

Average score across 3 eval scenarios

Security

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./plugins/mutation-testing/skills/mutation-testing/SKILL.md

Quality

Content

64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, actionable skill covering mutation testing across TypeScript/JavaScript and Python ecosystems with concrete, executable examples and configurations. Its main weaknesses are moderate verbosity (explanatory content Claude doesn't need, like mutation type definitions and score target tables) and a workflow section that lacks explicit validation/error-recovery checkpoints. The content would benefit from trimming generic knowledge and adding error handling guidance.

Suggestions

Remove or significantly trim the 'Core Concept', 'Common Mutation Types', and 'Mutation Score Targets' sections — Claude already understands these concepts and they consume tokens without adding actionable value.

Add explicit validation/error-recovery steps to the workflow, e.g., what to do when Stryker reports errors, how to handle Bun version incompatibility, or how to triage a large number of survived mutants.

Consider extracting the detailed Bun runner configuration and key behaviors into a separate reference file to improve progressive disclosure and keep the main skill leaner.
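One way to make the "triage a large number of survived mutants" suggestion concrete is a small script that ranks files by undetected mutants. The sketch below is illustrative only and is not part of the reviewed skill. It assumes a JSON mutation report shaped like the one Stryker's `json` reporter writes (a `files` map whose entries hold `mutants` with a `status` field). The names `triageSurvivors`, `FileTriage`, and the sample data are hypothetical.

```typescript
// Minimal subset of a mutation report. Field names are an assumption based on
// the JSON report format used by Stryker; verify them against a real report.
type MutantStatus =
  | "Killed" | "Survived" | "NoCoverage" | "Timeout"
  | "CompileError" | "RuntimeError" | "Ignored" | "Pending";

interface Mutant {
  id: string;
  mutatorName: string;
  status: MutantStatus;
}

interface MutationReport {
  files: Record<string, { mutants: Mutant[] }>;
}

// Hypothetical per-file summary used for triage.
interface FileTriage {
  file: string;
  survived: number;
  noCoverage: number;
  score: number | null; // percentage of valid mutants detected; null if none are valid
}

function triageSurvivors(report: MutationReport): FileTriage[] {
  const rows: FileTriage[] = [];
  for (const [file, { mutants }] of Object.entries(report.files)) {
    const count = (s: MutantStatus): number =>
      mutants.filter((m) => m.status === s).length;
    // Detected = killed or timed out; undetected = survived or not covered.
    const detected = count("Killed") + count("Timeout");
    const survived = count("Survived");
    const noCoverage = count("NoCoverage");
    const valid = detected + survived + noCoverage;
    rows.push({
      file,
      survived,
      noCoverage,
      score: valid === 0 ? null : (detected / valid) * 100,
    });
  }
  // Files with the most undetected mutants come first.
  return rows.sort(
    (a, b) => b.survived + b.noCoverage - (a.survived + a.noCoverage),
  );
}

// Usage with made-up data:
const sample: MutationReport = {
  files: {
    "src/price.ts": {
      mutants: [
        { id: "1", mutatorName: "ArithmeticOperator", status: "Survived" },
        { id: "2", mutatorName: "ConditionalExpression", status: "Killed" },
      ],
    },
  },
};
console.log(triageSurvivors(sample));
```

Ranking by file keeps the fix loop small: strengthen the tests for the worst file, re-run, and repeat, rather than reading every survived mutant in one pass.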

Dimension	Reasoning	Score
Conciseness	The skill is mostly efficient but includes some unnecessary content like the 'Core Concept' definitions (Claude knows what mutation testing is), the mutation types table (basic knowledge), and the score targets table which is generic guidance. The Bun runner section's 'Key Behaviors' explanation is somewhat verbose for Claude's level.	2 / 3
Actionability	Provides fully executable installation commands, complete configuration files, runnable CLI commands, and concrete before/after code examples showing weak vs strong tests. Everything is copy-paste ready for both Stryker (Vitest and Bun runners) and mutmut.	3 / 3
Workflow Clarity	The workflow section at the end provides a clear sequence (coverage → mutation testing → check report → fix → re-run), but lacks explicit validation checkpoints or error recovery steps. For instance, there's no guidance on what to do if Stryker fails, if the Bun version is too old, or how to interpret and act on specific report findings systematically.	2 / 3
Progressive Disclosure	The content is well-structured with clear headers and sections, and references 'vitest-testing' and 'test-quality-analysis' skills at the end. However, at ~170 lines it includes substantial inline content (common mutation types, score targets, multiple improvement patterns) that could be split into reference files. No bundle files exist to offload detail.	2 / 3
	Total	9 / 12 Passed

Description

82%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong description with excellent specificity and distinctiveness, naming concrete tools and actions in a clear niche. The main weakness is the lack of an explicit 'Use when...' clause with trigger scenarios, which would help Claude know exactly when to select this skill. The description uses appropriate third-person voice throughout.

Suggestions

Replace 'Use to improve test quality' with an explicit 'Use when...' clause, e.g., 'Use when the user asks about mutation testing, wants to evaluate test suite effectiveness, mentions Stryker or mutmut, or wants to find weak or insufficient tests.'
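The effect of this suggestion on the Completeness score can be sketched as a simple check. The snippet below is an illustration, not the reviewer's actual scoring code, which this page does not show. It encodes only the one rule stated in the review (a missing explicit "Use when..." clause caps completeness at 2), and the helper names `hasUseWhenClause` and `completenessCap` are hypothetical.

```typescript
// Hypothetical helper: true when the description contains an explicit
// "Use when" clause (case-insensitive).
function hasUseWhenClause(description: string): boolean {
  return /\buse when\b/i.test(description);
}

// Encodes only the cap quoted in the review: without the clause, the
// Completeness dimension cannot exceed 2 out of 3. The cap is an upper
// bound, not the score itself.
function completenessCap(description: string): 2 | 3 {
  return hasUseWhenClause(description) ? 3 : 2;
}

const current = "Find weak tests that pass despite code mutations. Use to improve test quality.";
const revised = "Find weak tests that pass despite code mutations. Use when the user asks about mutation testing.";
console.log(completenessCap(current), completenessCap(revised));
```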

Dimension	Reasoning	Score
Specificity	Lists multiple concrete actions: validate test effectiveness, mutation testing, find weak tests that pass despite code mutations, improve test quality. Names specific tools (Stryker, mutmut) and frameworks (Vitest, bun test, @hughescr/stryker-bun-runner).	3 / 3
Completeness	Clearly answers 'what' (validate test effectiveness with mutation testing, find weak tests), but the 'when' is only implied by 'Use to improve test quality' rather than stated in an explicit 'Use when...' clause with trigger scenarios. Per the rubric guidelines, a missing explicit 'Use when...' clause caps completeness at 2.	2 / 3
Trigger Term Quality	Includes strong natural keywords users would say: 'mutation testing', 'Stryker', 'mutmut', 'test quality', 'weak tests', 'TypeScript', 'JavaScript', 'Python', 'Vitest', 'bun test'. Good coverage of both tool names and conceptual terms.	3 / 3
Distinctiveness / Conflict Risk	Mutation testing is a very specific niche. The mention of specific tools (Stryker, mutmut) and the concept of mutation testing clearly distinguishes this from general testing skills or code quality skills. Unlikely to conflict with other skills.	3 / 3
	Total	11 / 12 Passed

Validation

90%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 10 / 11 Passed

Validation for skill structure

Criteria	Description	Result
allowed_tools_field	'allowed-tools' contains unusual tool name(s)	Warning
	Total	10 / 11 Passed

Repository: secondsky/claude-skills
Commit: 5e92b71

Reviewed: 29 days ago
