grove-maintain

Audit, upgrade, and maintain Grove test suites. Use when the user asks to "audit the test suite", "find untested examples", "upgrade dependencies", "check suite health", "find dead code", "clean up the test suite", "maintain Grove", "what examples are missing tests", or wants to analyze and improve the overall health of a Grove test suite.

Quality

96%

Does it follow best practices?

Impact

86%

0.97x

Average score across 3 eval scenarios

Security

Advisory

Suggest reviewing before use

Quality

Content

92%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

A highly actionable, well-sequenced maintenance skill with concrete per-language commands and strong validation/feedback loops in its upgrade workflow. Its main weakness is progressive disclosure: everything lives inline in one large file with no reference files to offload the repeated per-language tables and templates.

Suggestions

Move the Language Reference table and the repeated per-mode command tables (dependency-check, formatter, release-notes) into a references/ file (e.g. references/languages.md) and link to it from each mode, shrinking the inline SKILL.md to an overview.

Extract the report markdown templates (Audit Report, Proposed Cleanup Actions) into a references/report-templates.md so the main body references them by link instead of inlining full blocks.

Consider a references/edge-cases.md for the Edge Cases section to keep the core three-mode workflow as the lean top-level surface.

Dimension	Reasoning	Score
Conciseness	The body is dense and assumes Claude's competence — it never explains what MongoDB, a test suite, or a dependency is, and the dominant content is compact command/grep tables and step lists; the few prose asides (e.g., why a hung run 'looks identical to a real regression') are high-signal operational judgment rather than padding, so it does not drop to 2.	3 / 3
Actionability	Provides copy-paste-ready, language-specific commands throughout (`npm outdated`, `./venv/bin/pip list --outdated`, `go list -m -u all`, ping one-liners, formatter commands) plus concrete grep patterns and report templates, matching the 'fully executable, copy-paste ready' anchor.	3 / 3
Workflow Clarity	Each of the three modes is a numbered sequence, and Upgrade mode adds explicit validation checkpoints (driver-ping DB reachability, baseline smoke test, wall-clock regression heuristic with 'stop and investigate', foreground single-file triage and re-run loops), satisfying the explicit-validation/feedback-loop anchor.	3 / 3
Progressive Disclosure	It is a single ~400-line monolithic SKILL.md with no bundle files in references/, scripts/, or assets/, and all detail is inline (per-language command tables and report templates are repeated across modes); it is well-sectioned, but content that could be split out is not, so it sits at the 'some structure, content that should be separate is inline' anchor rather than the top one.	2 / 3
	Total	11 / 12 Passed

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A strong, third-person description that concisely states concrete capabilities and pairs them with an explicit, generous 'Use when' trigger list grounded in natural user phrasing. It clearly defines both what the skill does and when to invoke it, with low conflict risk.

Dimension	Reasoning	Score
Specificity	Names three concrete, distinct actions — 'Audit, upgrade, and maintain Grove test suites' — matching the 'lists multiple specific concrete actions' anchor.	3 / 3
Completeness	Explicitly answers both what ('Audit, upgrade, and maintain Grove test suites') and when (an explicit 'Use when the user asks to...' clause), hitting the top anchor for both halves.	3 / 3
Trigger Term Quality	Provides eight quoted natural-language triggers a user would actually say ('audit the test suite', 'find untested examples', 'upgrade dependencies', 'check suite health', 'find dead code', 'clean up the test suite', 'maintain Grove', 'what examples are missing tests'), giving strong coverage of common variations.	3 / 3
Distinctiveness / Conflict Risk	Tied to a clear niche ('Grove test suites') with a distinct named trigger ('maintain Grove'), making conflict with unrelated skills unlikely; it would not score 2 because the domain and triggers are specific rather than merely 'somewhat specific'.	3 / 3
	Total	12 / 12 Passed
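
The percentages on this page appear to follow from the rubric totals. The sketch below is one reading of how the numbers fit together, not the registry's documented scoring method: it assumes each section percentage is the rounded ratio of points earned to points available, and that the overall Quality figure is the plain average of the Content and Description percentages. The `percent` helper is hypothetical.

```python
from fractions import Fraction


def percent(points: int, maximum: int) -> int:
    """Round a points ratio to the nearest whole percent (assumed rule)."""
    return round(Fraction(points, maximum) * 100)


# Content rubric: 3 + 3 + 3 + 2 points out of 12.
content = percent(11, 12)

# Description rubric: 3 + 3 + 3 + 3 points out of 12.
description = percent(12, 12)

# Assumption: overall Quality is the unweighted mean of the two sections.
quality = round((content + description) / 2)

print(content, description, quality)  # prints "92 100 96"
```

Under these assumptions the results match the reported 92% for Content, 100% for Description, and 96% for Quality.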

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 16 / 16 Passed

Validation for skill structure

No warnings or errors.

Repository: mongodb/docs
Commit: 82345bd

Reviewed: 5 days ago
