Evaluate the quality and efficacy of existing tests by reviewing test code against source code. Use when the user asks to review tests, validate test quality, audit test suites, check test efficacy, or assess whether tests are testing things properly. Prioritizes real interactions over mocking and simulation.
Quality: 79% (does it follow best practices?)
Impact: — (no eval scenarios have been run)
Status: Passed, no known issues
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./plugins/test-evaluator/skills/test-evaluator/SKILL.md`

Quality
Discovery: 82%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a solid description that clearly communicates both what the skill does and when to use it, with good trigger term coverage. The main weakness is that the specific capabilities could be more granular—listing concrete evaluation actions like checking assertion quality, identifying missing edge cases, or detecting over-mocking would strengthen specificity. The distinctiveness could also be improved by more clearly differentiating from general code review or test-writing skills.
Suggestions
Add more specific concrete actions like 'identifies missing edge cases, evaluates assertion quality, detects over-mocking, checks coverage gaps' to improve specificity.
Add differentiating language to reduce overlap with code review or test-writing skills, e.g., 'Does not write new tests; focuses on evaluating existing test suites.'
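The over-mocking suggestion above can be illustrated with a small, hypothetical pytest-style example; `parse_csv` is an invented function standing in for real source code, not something taken from the skill itself:

```python
from unittest.mock import MagicMock

def parse_csv(text):
    """Toy stand-in for the source code under test."""
    return [line.split(",") for line in text.strip().splitlines()]

# Anti-pattern: the mock replaces the unit under test, so the assertion
# only verifies the mock's canned return value, never the parsing logic.
def test_parse_csv_overmocked():
    fake_parse = MagicMock(return_value=[["a", "b"]])
    assert fake_parse("anything at all") == [["a", "b"]]  # always passes

# Preferred: exercise the real function with real input.
def test_parse_csv_real():
    assert parse_csv("a,b\nc,d") == [["a", "b"], ["c", "d"]]
```

An evaluator following this skill would flag the first test as testing the mock rather than the code.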
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain (test evaluation) and some actions ('reviewing test code against source code'), but doesn't list multiple specific concrete actions like identifying missing edge cases, checking assertion quality, evaluating coverage gaps, or detecting redundant tests. | 2 / 3 |
| Completeness | Clearly answers both 'what' (evaluate quality and efficacy of existing tests by reviewing test code against source code) and 'when' (explicit 'Use when' clause listing multiple trigger scenarios). Also includes a philosophical note about prioritizing real interactions over mocking. | 3 / 3 |
| Trigger Term Quality | Good coverage of natural terms users would say: 'review tests', 'test quality', 'audit test suites', 'test efficacy', 'testing things properly'. These are phrases users would naturally use when seeking this kind of help. | 3 / 3 |
| Distinctiveness / Conflict Risk | While focused on test evaluation specifically, it could overlap with general code review skills or test-writing skills. The distinction between 'reviewing tests' and 'writing tests' or 'reviewing code' could cause some ambiguity, though the explicit triggers help somewhat. | 2 / 3 |
| Total | | 10 / 12 (Passed) |
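As a sketch of the assertion-quality check mentioned in the table, here is a hypothetical weak-versus-strong pair; `total_price` is invented purely for illustration:

```python
def total_price(items):
    """Toy stand-in for source code: items is a list of (qty, price) pairs."""
    return sum(qty * price for qty, price in items)

# Weak assertion: passes for almost any implementation that doesn't crash.
def test_total_weak():
    assert total_price([(2, 3.0)]) is not None

# Strong assertions: pin down exact values, including the empty edge case.
def test_total_strong():
    assert total_price([(2, 3.0), (1, 4.0)]) == 10.0
    assert total_price([]) == 0
```

The weak test would survive a badly broken implementation; the strong test would not.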
Implementation: 77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, well-structured skill that provides clear, actionable guidance for evaluating test quality. Its main strengths are the concrete evaluation workflow, well-defined output format with examples, and the practical anti-patterns reference. Its primary weakness is length — the language-specific guidance and philosophy sections add bulk that could be trimmed or externalized, and some testing principles stated are things Claude would already know.
Suggestions
Trim the 'Core Philosophy' section to a brief numbered list without explanatory sentences — Claude already understands these testing concepts.
Extract the 'Language & Framework Guidance' and 'Anti-Patterns Cheat Sheet' sections into separate reference files (e.g., FRAMEWORK_GUIDANCE.md, ANTI_PATTERNS.md) and link to them from the main skill.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is generally well-written but has some verbosity. The 'Core Philosophy' section explains testing principles Claude likely already knows. The language/framework guidance section is extensive and could be more condensed. However, the anti-patterns table and evaluation criteria are dense and useful. | 2 / 3 |
| Actionability | The skill provides highly concrete, actionable guidance: a specific evaluation workflow with clear steps, a defined output format with example tables, specific verdicts with definitions, and concrete examples of good vs bad patterns per framework. The output format is copy-paste ready with markdown table templates. | 3 / 3 |
| Workflow Clarity | The 5-step evaluation workflow is clearly sequenced and logical: read tests → read source → evaluate each test → identify gaps → produce report. Each step has explicit criteria and the output format serves as a natural validation checkpoint ensuring thoroughness. For a review/analysis task (non-destructive), this level of workflow clarity is excellent. | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear sections and headers, but it's a long monolithic file (~180 lines) with no references to external files. The language-specific guidance and anti-patterns cheat sheet could reasonably be split into separate reference files to keep the main skill leaner, though the lack of bundle files means this is the only option currently. | 2 / 3 |
| Total | | 10 / 12 (Passed) |
Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation checks: 11 / 11 passed. Validation of the skill structure reported no warnings or errors.