test-quality-reviewer

Evaluate the quality and efficacy of existing tests by reviewing test code against source code. Use when the user asks to review tests, validate test quality, audit test suites, check test efficacy, or assess whether tests are testing things properly. Prioritizes real interactions over mocking and simulation.
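To illustrate what "real interactions over mocking" can mean in practice, here is a minimal sketch in Python. The function `apply_discount` and both tests are hypothetical and are not taken from the skill itself. They only contrast a test that asserts on a mock's configured return value with one that exercises the actual code.

```python
from unittest.mock import Mock


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test (not part of the skill)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_over_mocked() -> None:
    # Weak: the mock stands in for the code under test, so the assertion
    # only checks the value the mock was configured to return.
    fake_discount = Mock(return_value=90.0)
    assert fake_discount(100.0, 10) == 90.0


def test_real_interaction() -> None:
    # Stronger: calls the real function and covers a boundary and an error path.
    assert apply_discount(100.0, 10) == 90.0
    assert apply_discount(19.99, 0) == 19.99
    try:
        apply_discount(100.0, 150)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for percent > 100")
```

The first test would keep passing even if `apply_discount` were deleted, which is the kind of result a reviewer following this description would likely flag.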

Quality

79%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Fix and improve this skill with Tessl

tessl review fix ./plugins/test-evaluator/skills/test-evaluator/SKILL.md

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, actionable skill that provides clear methodology for evaluating test quality. Its main strength is the concrete evaluation criteria, well-defined output format with examples, and framework-specific guidance. Its primary weakness is length — the core philosophy section restates testing principles Claude already knows, and the framework-specific sections could be externalized for better progressive disclosure.

Suggestions

Trim the 'Core Philosophy' section to a brief bullet list of priorities rather than explaining why each principle matters — Claude already understands testing philosophy.

Extract the 'Language & Framework Guidance' and 'Anti-Patterns Cheat Sheet' sections into separate reference files (e.g., FRAMEWORK_GUIDANCE.md, ANTI_PATTERNS.md) and link to them from the main skill.

Dimension	Reasoning	Score
Conciseness	The skill is generally well-written but has some verbosity. The 'Core Philosophy' section explains testing principles Claude likely already knows. The language/framework guidance section is extensive and could be more concise. However, the anti-patterns table and evaluation criteria are dense and useful.	2 / 3
Actionability	The skill provides highly concrete, actionable guidance: a specific evaluation workflow with clear steps, a defined output format with example markdown tables, specific verdicts with definitions, and concrete examples of good vs bad tests per framework. The anti-patterns cheat sheet gives specific patterns to flag with specific alternatives.	3 / 3
Workflow Clarity	The 5-step evaluation workflow is clearly sequenced and logical: read tests → read source → evaluate each test → identify gaps → produce report. Each step has explicit criteria and the output format serves as a validation checkpoint ensuring thoroughness. For a review/analysis task (non-destructive), this level of workflow clarity is excellent.	3 / 3
Progressive Disclosure	The content is well-structured with clear sections and headers, but it's a long monolithic document (~180 lines) with no references to external files. The language-specific guidance and anti-patterns table could be split into separate reference files to keep the main skill leaner, especially since not all reviews will involve all frameworks.	2 / 3
	Total	10 / 12 Passed

Description

82%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a solid skill description that clearly communicates its purpose and includes an explicit 'Use when' clause with multiple natural trigger terms. Its main weakness is that the listed capabilities are not granular enough: naming concrete evaluation actions, such as checking assertion quality, identifying missing edge cases, or detecting over-mocking, would strengthen it. The philosophical note about prioritizing real interactions over mocking adds useful distinctiveness.

Suggestions

List the concrete actions the skill performs, e.g., 'identifies missing edge cases, flags excessive mocking, checks assertion quality, detects redundant tests'.

Strengthen distinctiveness by explicitly differentiating the skill from related skills such as test writing or general code review, e.g., 'Does not write new tests; focuses on evaluating existing test suites'.

Dimension	Reasoning	Score
Specificity	The description names the domain (test evaluation) and some actions ('reviewing test code against source code'), but doesn't list multiple specific concrete actions like identifying missing edge cases, checking assertion quality, evaluating coverage gaps, or detecting redundant tests.	2 / 3
Completeness	Clearly answers both 'what' (evaluate quality and efficacy of existing tests by reviewing test code against source code) and 'when' (explicit 'Use when' clause listing multiple trigger scenarios). Also includes a guiding principle about prioritizing real interactions over mocking.	3 / 3
Trigger Term Quality	Good coverage of natural terms users would say: 'review tests', 'test quality', 'audit test suites', 'test efficacy', 'testing things properly'. These are phrases users would naturally use when seeking this kind of help.	3 / 3
Distinctiveness Conflict Risk	While focused on test evaluation specifically, it could overlap with general code review skills or test writing skills. The distinction between 'reviewing tests' and 'writing tests' could cause some ambiguity, though the 'existing tests' qualifier helps somewhat.	2 / 3
	Total	10 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: mattermost/mattermost-ai-marketplace
Commit: 83f690a

Reviewed: about 1 month ago
