test

Strategy-first testing for the developer inner loop. Derive tests from acceptance criteria, write at the right layer, run and interpret, debug flakiness and ordering issues. Match the conventions already in the codebase — your tests should look like they belong.

Quality

52%

Does it follow best practices?

Security

Medium

Suggest reviewing before use

Fix and improve this skill with Tessl

tessl review fix ./develop/skills/test/SKILL.md

Quality

Content

62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, thoughtfully written testing skill that excels at workflow clarity and strategic framing. Its main weakness is the lack of concrete, executable examples — the body reads as a philosophy-of-testing guide rather than an actionable reference, with all specifics deferred to reference files that aren't available in the bundle. The writing is clean but could be tighter in places where stance-setting and failure modes repeat earlier points.

Suggestions

Add at least one concrete, executable test example (e.g., a sample AC → derived test case with actual code) in the Plan or Write section so the SKILL.md is actionable on its own without requiring reference files.

Include a minimal example of the plan output format (e.g., a table or checklist showing AC → test case → layer → status markers) to make the planning step copy-paste ready.

Trim the 'Your Stance' and 'Failure Modes' sections — several points (e.g., 'don't guess ACs', 'strategy before writing') are stated in both places; consolidate to reduce redundancy.
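
As an illustration of the first two suggestions, here is a minimal sketch of an acceptance criterion turned into test cases, with a small plan table alongside. The `apply_discount` function, the AC wording, the 10% threshold, and the plan's status values are all hypothetical, invented for this example; none of them come from the skill or its reference files. The tests are plain functions in the style pytest collects, written so they can also be called directly.

```python
# Hypothetical AC (not from the skill): "Orders of 100.00 or more get 10% off;
# smaller orders are charged in full; negative totals are rejected."
from decimal import Decimal


def apply_discount(total: Decimal) -> Decimal:
    """Hypothetical unit under test, included so the example is runnable."""
    if total < 0:
        raise ValueError("total must be non-negative")
    if total >= Decimal("100.00"):
        return (total * Decimal("0.90")).quantize(Decimal("0.01"))
    return total


# One test per clause of the AC, at the unit layer (pure function, no I/O).
def test_discount_applies_at_threshold():
    assert apply_discount(Decimal("100.00")) == Decimal("90.00")


def test_no_discount_below_threshold():
    assert apply_discount(Decimal("99.99")) == Decimal("99.99")


def test_negative_total_is_rejected():
    try:
        apply_discount(Decimal("-1.00"))
    except ValueError:
        return
    raise AssertionError("expected ValueError for a negative total")


# Hypothetical plan rows: AC clause, test case, layer, status.
PLAN = [
    ("discount at threshold", "test_discount_applies_at_threshold", "unit", "written"),
    ("no discount below threshold", "test_no_discount_below_threshold", "unit", "written"),
    ("negative total rejected", "test_negative_total_is_rejected", "unit", "written"),
]
```

Each AC clause maps to exactly one test, so a failing test points back at a specific clause rather than at the function as a whole.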

Dimension	Reasoning	Score
Conciseness	The content is mostly efficient and well-written, but includes some philosophical framing ('Your sharpest move...', 'A test that passes forever regardless of code changes is noise — not signal') and stance-setting that, while valuable for tone, adds tokens beyond what's strictly necessary for actionable guidance. The failure modes section restates ideas already covered earlier.	2 / 3
Actionability	The skill provides a clear conceptual framework (four moves, plan from ACs, layer selection) but lacks concrete executable examples — no code snippets, no specific commands, no example test output. It describes what to do at a strategic level but delegates all concrete guidance to reference files that aren't provided in the bundle.	2 / 3
Workflow Clarity	The four-move workflow (Plan → Write → Run → Debug) is clearly sequenced with explicit validation checkpoints: verify the test fails for the right reason before implementing, run broader suite to check regressions, interpret failures honestly with a decision tree (code wrong vs test wrong vs intermittent). The feedback loops (gap → refinement, flake → debug mode) are well-defined.	3 / 3
Progressive Disclosure	The skill references four sub-files (references/plan.md, references/write.md, references/run.md, references/debug.md) plus two artifact files, which is good structure. However, no bundle files were provided, so we can't verify these references exist or contain useful content. The main file delegates nearly all concrete/actionable detail to these references, making the SKILL.md itself more of a table of contents than a self-contained overview with actionable quick-start content.	2 / 3
	Total	9 / 12 Passed
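
The failure-interpretation step noted under Workflow Clarity (code wrong vs. test wrong vs. intermittent) can be sketched as a rerun-and-classify helper. This is an assumption about one way to separate intermittent failures from deterministic ones, not the skill's own procedure; `classify_test`, its labels, and the demo test are hypothetical names.

```python
from typing import Callable


def classify_test(test_fn: Callable[[], None], runs: int = 5) -> str:
    """Run test_fn several times and label the pattern of outcomes."""
    failures = 0
    for _ in range(runs):
        try:
            test_fn()
        except AssertionError:
            failures += 1
    if failures == 0:
        return "passing"
    if failures == runs:
        # Deterministic: next decide whether the code or the test is wrong.
        return "deterministic-failure"
    # Mixed results: treat as flaky and switch to debugging.
    return "intermittent"


# Demo: a deliberately state-dependent test that fails on every second call.
_calls = {"n": 0}


def test_depends_on_shared_state():
    _calls["n"] += 1
    if _calls["n"] % 2 == 0:
        raise AssertionError("shared counter leaked between runs")
```

Calling `classify_test(test_depends_on_shared_state, runs=4)` sees two passes and two failures and so reports the test as intermittent. A "passing" label only means no failure was observed in the sampled runs; it does not prove the test is free of flakiness.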

Description

42%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description demonstrates strong specificity with multiple concrete actions related to test writing and debugging, and it conveys a clear philosophy ('strategy-first', 'match conventions'). However, it completely lacks a 'Use when...' clause, which significantly hurts completeness and makes it harder for Claude to know when to select this skill. The trigger terms are somewhat jargon-heavy and miss common natural language variations users would employ.

Suggestions

Add an explicit 'Use when...' clause, e.g., 'Use when the user asks to write tests, add test coverage, create unit/integration tests, or debug failing/flaky tests.'

Include more natural trigger terms users would say, such as 'unit test', 'integration test', 'test coverage', 'TDD', 'test-driven', 'pytest', 'jest', 'spec', or 'test suite'.

Consider replacing jargon like 'developer inner loop' with more universally understood phrasing to improve discoverability.
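
The first two suggestions above can be turned into a rough local self-check. The sketch below is an assumption about how one might lint a description before submitting it; it is not how Tessl scores descriptions. The `check_description` helper is hypothetical, and the term list is copied from the second suggestion.

```python
TRIGGER_TERMS = ["unit test", "integration test", "test coverage", "TDD",
                 "test-driven", "pytest", "jest", "spec", "test suite"]


def check_description(description: str) -> dict:
    """Report whether a 'Use when' clause exists and which trigger terms are absent.

    Matching is a plain case-insensitive substring test, so a short term
    such as 'spec' also matches inside longer words like 'specific'.
    """
    lowered = description.lower()
    return {
        "has_use_when": "use when" in lowered,
        "missing_terms": [t for t in TRIGGER_TERMS if t.lower() not in lowered],
    }


# Excerpt (first two sentences) of the description under review.
current_excerpt = (
    "Strategy-first testing for the developer inner loop. Derive tests from "
    "acceptance criteria, write at the right layer, run and interpret, debug "
    "flakiness and ordering issues."
)
report = check_description(current_excerpt)
```

For this excerpt the report shows no 'Use when' clause and lists every suggested term as missing, which matches the reviewer's findings on completeness and trigger terms.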

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: 'Derive tests from acceptance criteria', 'write at the right layer', 'run and interpret', 'debug flakiness and ordering issues', 'Match the conventions already in the codebase'. These are concrete, actionable capabilities.	3 / 3
Completeness	Describes what the skill does reasonably well, but gives no explicit 'Use when...' clause or equivalent trigger guidance. Per the rubric, a missing 'Use when...' clause caps completeness at 2; here the 'when' is not even implied, so the score drops to 1.	1 / 3
Trigger Term Quality	Contains some relevant keywords like 'tests', 'acceptance criteria', 'flakiness', 'debug', and 'inner loop', but misses common user terms like 'unit test', 'integration test', 'test coverage', 'TDD', 'pytest', 'jest', or file extensions. 'Strategy-first testing for the developer inner loop' uses somewhat jargon-heavy phrasing.	2 / 3
Distinctiveness Conflict Risk	The focus on testing is a recognizable niche, and details like 'acceptance criteria', 'flakiness', and 'ordering issues' help distinguish it. However, without explicit trigger conditions, it could overlap with general coding or debugging skills.	2 / 3
	Total	8 / 12 Passed

Validation

81%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 9 / 11 Passed

Validation for skill structure

Criteria	Description	Result
allowed_tools_field	'allowed-tools' contains unusual tool name(s)	Warning
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning

	Total	9 / 11 Passed

Repository: audenaert/etak
Path: develop/skills/test/SKILL.md
Commit: 632c389

Reviewed: 3 days ago
