# Add integration/E2E tests to existing codebase using Design Docs
| Check | Result | Notes |
|---|---|---|
| Quality | 47% | Does it follow best practices? |
| Impact | Pending | No eval scenarios have been run |
| Validation | Passed | No known issues |
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./skills/recipe-add-integration-tests/SKILL.md`

## Quality
### Discovery (32%)

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is too terse and lacks explicit trigger guidance ('Use when...'), which is critical for skill selection. While it identifies a specific niche (integration/E2E testing from design docs), it fails to enumerate concrete actions or provide the natural language triggers that would help Claude reliably select this skill over others.
Suggestions:

- Add an explicit 'Use when...' clause with trigger terms like 'add integration tests', 'end-to-end tests', 'E2E testing', 'test coverage from design doc', 'write tests based on spec'.
- List specific concrete actions, such as 'Reads design documents to identify test scenarios, generates integration and end-to-end test files, sets up test fixtures and assertions for existing codebases'.
- Include common keyword variations users might use: 'end-to-end', 'e2e', 'acceptance tests', 'test plan', 'design document', 'spec-based testing'.
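Applied to this skill's frontmatter, the suggestions above might combine into something like the following sketch (the description wording is illustrative, not the skill's current text):

```yaml
---
name: recipe-add-integration-tests
description: >-
  Add integration and end-to-end (E2E) tests to an existing codebase using
  Design Docs. Reads design documents to identify test scenarios, generates
  integration and E2E test files, and sets up test fixtures and assertions.
  Use when asked to "add integration tests", "write end-to-end / e2e /
  acceptance tests", or "improve test coverage from a design doc or spec".
---
```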
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (integration/E2E tests) and a key action (add tests using Design Docs), but doesn't list multiple concrete actions like generating test cases, setting up test infrastructure, or writing assertions. | 2 / 3 |
| Completeness | Provides a partial 'what' (add integration/E2E tests using Design Docs) but completely lacks a 'when should Claude use it' clause. Per the rubric, a missing 'Use when...' clause caps completeness at 2, and the thin 'what' warrants a 1. | 1 / 3 |
| Trigger Term Quality | Includes some relevant keywords like 'integration', 'E2E tests', and 'Design Docs', but misses common variations users might say, such as 'end-to-end', 'test coverage', 'testing', 'test suite', or 'acceptance tests'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The combination of 'integration/E2E tests' with 'Design Docs' provides some distinctiveness, but 'existing codebase' is generic and could overlap with general testing or code-generation skills. | 2 / 3 |
| **Total** | | **7 / 12 (Passed)** |
### Implementation (62%)

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill defines a well-structured orchestration workflow with clear step sequencing, feedback loops, and gate conditions, which is its strongest aspect. However, it suffers from moderate verbosity (philosophical framing, repeated routing explanations) and incomplete actionability: agent invocations describe parameters abstractly rather than showing exact tool-call syntax. The balance between inline content and external references could also be improved.
Suggestions:

- Remove the 'Core Identity' and 'Why Delegate' sections; they explain orchestration philosophy that Claude doesn't need. Replace them with a single line like 'Delegate all implementation work to subagents via the Agent tool.'
- Show one complete, concrete Agent tool invocation example with exact syntax rather than describing parameters abstractly at every step.
- Extract the repeated routing logic ('*-backend-task-*' → specific subagent_type) into a single routing table at the top, then reference it in Steps 4, 6, and 7 instead of repeating it.
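As a sketch of the last suggestion, the routing table could live near the top of the skill and be referenced by step number. The `*-backend-task-*` pattern is the one quoted in the review; the other patterns and all subagent names are hypothetical placeholders, not taken from the skill:

```markdown
| Task file pattern   | subagent_type       |
|---------------------|---------------------|
| `*-backend-task-*`  | `backend-engineer`  |
| `*-frontend-task-*` | `frontend-engineer` |
| `*-infra-task-*`    | `infra-engineer`    |
```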
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is moderately efficient but includes some unnecessary verbosity, such as the 'Why Delegate' rationale section and repeated explanations of routing patterns. The orchestrator identity statement is philosophical rather than actionable. | 2 / 3 |
| Actionability | Provides structured steps with specific tool invocations and file naming conventions, but relies heavily on abstract agent delegation patterns rather than executable code. Key details like the actual Agent tool invocation syntax, commit message format, and coverage requirements are left vague or missing. | 2 / 3 |
| Workflow Clarity | The multi-step workflow is clearly sequenced (Steps 0-8) with explicit validation checkpoints: Step 3 is marked as a GATE, Step 5 provides review with an approve/revise feedback loop (Step 6 returns to Step 5), and Step 7 provides final quality assurance before committing. The per-task-file sequential execution through Steps 4→5→6→7 is clearly stated. | 3 / 3 |
| Progressive Disclosure | References external skills (documentation-criteria, monorepo-flow.md) and subagents appropriately, but the task file template is inlined when it could be referenced, and the document classification logic adds bulk. The skill itself is fairly long without clear pointers to separate reference materials. | 2 / 3 |
| **Total** | | **9 / 12 (Passed)** |
## Validation (90%)

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation for skill structure: 10 / 11 passed.
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing them or moving them under `metadata` | Warning |
| **Total** | | **10 / 11 Passed** |
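The one warning is typically resolved by moving the offending keys under `metadata`, as the message suggests. The key names below are hypothetical, since the report does not list which keys triggered the warning:

```yaml
---
name: recipe-add-integration-tests
description: ...
metadata:
  # formerly top-level keys flagged as unknown (illustrative names)
  author: platform-team
  version: "1.0"
---
```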
Skill commit: `2e719be`