grove-maintain

Audit, upgrade, and maintain Grove test suites. Use when the user asks to "audit the test suite", "find untested examples", "upgrade dependencies", "check suite health", "find dead code", "clean up the test suite", "maintain Grove", "what examples are missing tests", or wants to analyze and improve the overall health of a Grove test suite.

0.97x

Quality

88%

Does it follow best practices?

Impact

86%

0.97x

Average score across 3 eval scenarios

Security

Advisory

Suggest reviewing before use

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-crafted, highly actionable skill with excellent workflow clarity across all three modes (audit, upgrade, cleanup). Its main weakness is length: at ~350 lines with detailed tables and extensive inline guidance, it would benefit from moving reference material (grep patterns, dependency commands, release notes URLs) into separate files. The edge cases section is a strong addition that reflects real-world experience with the domain.

Suggestions

Extract the language reference table, release notes locations table, and anti-pattern grep patterns into a separate reference file (e.g., `references/language-matrix.md`) to reduce the main skill's token footprint.

Trim the explanatory prose in Upgrade Step 0: the rationale for preferring a driver-level ping over a port check, and the explanation of why baselines matter, could each be shortened to a single-line comment, trusting Claude to understand the reasoning.

Dimension	Reasoning	Score
Conciseness	The skill is thorough and mostly efficient for its complexity, but some sections are verbose — e.g., the Upgrade Mode Step 0 preflight section explains at length why `nc -zv` is insufficient, and Step 2 includes a large table of release notes locations that could be a separate reference file. Some guidance (like explaining what transitive dependencies are) assumes less competence than necessary.	2 / 3
Actionability	Highly actionable throughout: concrete grep patterns, specific CLI commands per language, exact file paths, structured report templates, and clear decision trees (e.g., 1-3 failures vs 4+ failures). The language reference table with export/import patterns is immediately executable.	3 / 3
Workflow Clarity	All three modes have clearly numbered, sequenced steps with explicit validation checkpoints. Upgrade mode includes a preflight baseline, runtime regression heuristic (5× baseline), explicit approval gates before applying changes, and clear branching logic for different failure counts. Cleanup mode requires user approval before executing any action.	3 / 3
Progressive Disclosure	The skill is a long monolithic document (~350 lines) with three full modes inline. The language reference tables, release notes locations, and anti-pattern grep patterns could be split into separate reference files. However, the internal structure with clear headings and mode separation is well-organized, and it does reference external files (CLAUDE.md, convention files) appropriately.	2 / 3
	Total	10 / 12 Passed
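
The Workflow Clarity row refers to a runtime regression heuristic (5× baseline) and to branching on failure counts (1-3 vs 4+). The following is a minimal sketch of how such checks might look, not the skill's actual implementation. The `SuiteRun` type, the function names, the strict "greater than" comparison, and the choice to count only failures beyond the baseline are all assumptions made for illustration; only the 5× factor and the 1-3 / 4+ split come from the review text.

```python
from dataclasses import dataclass

# From the review text: "5x baseline" and "1-3 failures vs 4+ failures".
RUNTIME_REGRESSION_FACTOR = 5.0
FEW_FAILURES_MAX = 3


@dataclass
class SuiteRun:
    """Hypothetical summary of one test-suite run."""
    runtime_seconds: float
    failures: int


def is_runtime_regression(baseline: SuiteRun, current: SuiteRun) -> bool:
    """Flag a run that takes more than 5x the baseline runtime.

    Assumption: the comparison is strict (exactly 5x is not flagged).
    """
    limit = RUNTIME_REGRESSION_FACTOR * baseline.runtime_seconds
    return current.runtime_seconds > limit


def failure_bucket(baseline: SuiteRun, current: SuiteRun) -> str:
    """Classify new failures as 'none', 'few' (1-3), or 'many' (4+).

    Assumption: only failures beyond the baseline count as new.
    """
    new_failures = max(0, current.failures - baseline.failures)
    if new_failures == 0:
        return "none"
    return "few" if new_failures <= FEW_FAILURES_MAX else "many"
```

Recording the baseline before any upgrade is what makes both checks meaningful: without it, a slow or already-failing suite cannot be told apart from one that an upgrade broke.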

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that follows best practices. It uses third person voice, provides specific concrete actions, includes an explicit 'Use when' clause with extensive natural trigger terms, and is clearly scoped to a distinct domain (Grove test suites). The description effectively balances conciseness with comprehensive trigger coverage.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: 'Audit, upgrade, and maintain Grove test suites' covers auditing, upgrading dependencies, and maintaining. The trigger phrases further elaborate specific capabilities like finding untested examples, checking suite health, finding dead code, and cleaning up.	3 / 3
Completeness	Clearly answers both 'what' (audit, upgrade, and maintain Grove test suites) and 'when' (explicit 'Use when' clause with extensive trigger phrases covering multiple user intent variations).	3 / 3
Trigger Term Quality	Excellent coverage of natural trigger terms users would say: 'audit the test suite', 'find untested examples', 'upgrade dependencies', 'check suite health', 'find dead code', 'clean up the test suite', 'maintain Grove', 'what examples are missing tests'. These are natural phrases a user would actually type.	3 / 3
Distinctiveness / Conflict Risk	Highly distinctive due to the specific 'Grove' domain qualifier and the focused niche of test suite auditing/maintenance. The combination of 'Grove' + test suite health/audit/maintenance creates a clear, non-conflicting niche that is unlikely to overlap with general testing or general code maintenance skills.	3 / 3
	Total	12 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: mongodb/docs
Commit: be9d4af

Reviewed: 1 day ago
