grove-test

Create or fix tests for existing Grove code examples. Use when the user asks to "add a test", "create a test", "fix this test", "update the test", "the test doesn't match the output", "test is failing", or wants to add test coverage for an existing example or fix a broken test.

Quality

83%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, highly actionable skill with clear multi-step workflows and good validation checkpoints. Its main weakness is verbosity in the handoff validation section (Step 0) which consumes significant token budget for what is essentially an edge case. The references to external convention files and sibling skills demonstrate good progressive disclosure intent, though the absence of bundle files makes it impossible to verify those references.

Suggestions

Consider moving the Step 0 handoff file validation logic (version check, shape check, trigger schemas) into a separate reference file to reduce the main skill's token footprint — this is an edge case that doesn't need to be in the primary flow.

Tighten the Step 4 heuristics: the 'fewer than 8 `it` blocks' rule and the add-vs-create decision tree could be expressed more concisely as a brief decision table rather than prose.

Dimension	Reasoning	Score
Conciseness	The skill is reasonably efficient but includes some sections that could be tightened — the handoff file validation in Step 0 is quite verbose with extensive error message templates, and some explanations (e.g., when to add vs create test files in Step 4) could be more concise. However, it generally avoids explaining concepts Claude already knows.	2 / 3
Actionability	The skill provides highly concrete, actionable guidance: specific file paths, exact bash commands for running tests, a clear validation approach table, a concrete error fix walkthrough with actual error output, and specific thresholds (8+ `it` blocks, max 3 retry attempts). The guidance is specific enough to execute without ambiguity.	3 / 3
Workflow Clarity	The 8-step workflow is clearly sequenced with explicit validation checkpoints: Step 7 includes a well-defined retry loop (Step 7 → Step 5, max 3 attempts) with clear failure reporting, Step 3 (Fix mode) requires running the test before diagnosing, and Step 5 (Fix mode) explicitly warns against modifying example files without user confirmation.	3 / 3
Progressive Disclosure	The skill references external convention files and sibling skills (e.g., `conventions-{language}.md`, `/grove-run` Step 3, `/grove-create`), which is good progressive disclosure, but no bundle files are provided to verify that these references exist. The skill itself is quite long (200+ lines), and some content, such as the full handoff validation logic in Step 0, could be split into a reference file. The inline content is nonetheless well-structured with clear headers.	2 / 3
	Total	10 / 12 Passed
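
The bounded feedback loop credited under Workflow Clarity (run the test, apply a fix, re-run, stop after three attempts) can be sketched as follows. This is a minimal illustration under stated assumptions, not the skill's actual implementation: `run_test` and `apply_fix` are hypothetical callables standing in for whatever the skill does in Steps 7 and 5, and only the limit of 3 attempts comes from the review.

```python
from typing import Callable, Tuple

MAX_ATTEMPTS = 3  # the review cites a maximum of 3 retry attempts


def fix_until_passing(
    run_test: Callable[[], bool],      # hypothetical: returns True if the test passes
    apply_fix: Callable[[int], None],  # hypothetical: applies a fix for attempt n
    max_attempts: int = MAX_ATTEMPTS,
) -> Tuple[bool, int]:
    """Run the test; on failure, apply a fix and re-run, up to max_attempts fixes.

    Returns (passed, fixes_applied).
    """
    if run_test():
        return True, 0
    for attempt in range(1, max_attempts + 1):
        apply_fix(attempt)
        if run_test():
            return True, attempt
    # Attempts exhausted: the caller should report the failure rather than keep looping.
    return False, max_attempts
```

A `(False, 3)` result corresponds to the point at which, per the review, the skill stops retrying and reports the failure.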

Description

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a well-crafted description that excels in trigger term coverage and completeness, with a clear 'Use when...' clause containing numerous natural user phrases. The main weakness is that the capability description could be slightly more specific about what actions are performed beyond 'create or fix tests'. Overall, it would perform well in skill selection among a large set of skills.

Dimension	Reasoning	Score
Specificity	The description names the domain ('tests for existing Grove code examples') and two main actions ('create or fix tests'), but doesn't elaborate on specific concrete sub-actions like generating assertions, comparing expected output, or updating test fixtures.	2 / 3
Completeness	Clearly answers both 'what' (create or fix tests for existing Grove code examples) and 'when' (explicit 'Use when...' clause with multiple specific trigger phrases and scenarios like adding test coverage or fixing broken tests).	3 / 3
Trigger Term Quality	Excellent coverage of natural trigger phrases users would actually say: 'add a test', 'create a test', 'fix this test', 'update the test', 'the test doesn't match the output', 'test is failing', and 'add test coverage'. These are highly natural and varied.	3 / 3
Distinctiveness / Conflict Risk	Scoped specifically to 'Grove code examples' testing, which is a clear niche. The combination of 'Grove' domain specificity and test-focused triggers makes it unlikely to conflict with general testing or general Grove coding skills.	3 / 3
	Total	11 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: mongodb/docs
Commit: be9d4af

Reviewed: 1 day ago
