
grove-maintain

Audit, upgrade, and maintain Grove test suites. Use when the user asks to "audit the test suite", "find untested examples", "upgrade dependencies", "check suite health", "find dead code", "clean up the test suite", "maintain Grove", "what examples are missing tests", or wants to analyze and improve the overall health of a Grove test suite.


Grove: Audit and Maintain Test Suites

Begin your first response with: [grove-maintain-61ca57fa]

Analyze Grove test suites for gaps, outdated dependencies, and maintenance opportunities. This skill operates in three modes: audit, upgrade, and cleanup.

Do NOT use when:

  • The user wants to create, migrate, or fix a specific example → use /grove-create, /grove-migrate, or /grove-test
  • The user wants to run or debug a specific test → use /grove-run
  • The user needs to set up their local environment → use /grove-setup

Determine Mode

Parse the user's request to identify the mode:

  • Audit (default): Find gaps in test coverage, orphaned files, and structural issues
  • Upgrade: Check for outdated dependencies and apply updates
  • Cleanup: Find and remove dead code, fix formatting, clean up structure

If the user doesn't specify a mode, run audit mode.

Also determine the target language. If not specified, default to the language of the most recently discussed or modified files in this conversation. If no prior context exists, ask which language suite to target.

Language Reference

Use this table for all modes — it maps languages to their file extensions, base directories, export patterns, and import patterns for grepping:

Suite       Base dir              Ext    Export pattern              Import pattern
JavaScript  javascript/driver     .js    export (async )?function    from ['"].*examples/
Python      python/pymongo        .py    ^def |^async def            from examples\.
Go          go/driver             .go    ^func [A-Z]                 "driver-examples/examples/
Java        java/driver-sync      .java  public .* \w+\(             import .*examples\.
C#          csharp/driver         .cs    public .* \w+\(             using .*Examples
Mongosh     command-line/mongosh  .js    (none; raw shell commands)  outputFromExampleFiles\(\[

All base dirs are under code-example-tests/.

Audit Mode

Step 1: Inventory Examples

Search for all example files in the language's examples directory:

code-example-tests/{base-dir}/examples/**/*.{ext}

Exclude template/stub files (e.g., example-stub.js, example_stub.py).

Record each file with its path, exported function names (grep using the export pattern from the language reference table above), and whether an output file exists alongside it (look for *-output.* or *_output.* in the same directory).
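
The inventory step can be sketched in Python as below. This is a minimal illustration, not the skill's own tooling: the function names are hypothetical, the export patterns are the JavaScript and Python rows of the language reference table with a capture group added to pull out the function name, and the stub check assumes stub files always have "stub" in their name.

```python
import re
from pathlib import Path

# Export patterns from the language reference table (JavaScript and Python rows).
# Assumption: a capture group is added so the function name can be recorded.
EXPORT_PATTERNS = {
    ".js": re.compile(r"export (?:async )?function\s+(\w+)"),
    ".py": re.compile(r"^(?:async )?def\s+(\w+)", re.MULTILINE),
}


def is_stub(path: Path) -> bool:
    # Assumption: stubs are named like example-stub.js or example_stub.py.
    return "stub" in path.stem.lower()


def is_output_file(path: Path) -> bool:
    # Matches the *-output.* and *_output.* naming described above.
    return path.stem.endswith(("-output", "_output"))


def inventory_examples(examples_dir: Path, ext: str) -> list[dict]:
    """Record path, exported function names, and output-file presence per example."""
    pattern = EXPORT_PATTERNS[ext]
    records = []
    for path in sorted(examples_dir.rglob(f"*{ext}")):
        if is_stub(path) or is_output_file(path):
            continue
        source = path.read_text(encoding="utf-8")
        has_output = any(
            p.is_file() and is_output_file(p) for p in path.parent.iterdir()
        )
        records.append(
            {
                "path": path.relative_to(examples_dir).as_posix(),
                "exports": pattern.findall(source),
                "has_output": has_output,
            }
        )
    return records
```

Calling inventory_examples(Path("code-example-tests/javascript/driver/examples"), ".js") would return one record per example file, ready to feed into the cross-reference step.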

Step 2: Inventory Tests

Search for all test files:

code-example-tests/{base-dir}/tests/**/*.test.{ext}

For each test file, grep for import statements using the import pattern from the language reference table above to find which example files it references.
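
As a rough sketch of that grep for the JavaScript suite, the import pattern from the table can be extended with a capture group so the referenced path is returned. The function name example_imports is hypothetical, and the capture group is an addition for illustration.

```python
import re

# JavaScript import pattern from the language reference table,
# with a capture group added around the imported path (an assumption).
IMPORT_RE = re.compile(r"""from ['"]([^'"]*examples/[^'"]+)['"]""")


def example_imports(test_source: str) -> list[str]:
    """Return every imported path that points into an examples/ directory."""
    return IMPORT_RE.findall(test_source)
```

Imports that do not point into examples/ (shared test helpers, third-party packages) are ignored by this pattern, so they never show up as example references.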

Step 3: Cross-Reference

Build a coverage map:

  1. Tested examples: Example files that are imported by at least one test
  2. Untested examples: Example files not imported by any test
  3. Orphaned tests: Test files that import example files that don't exist
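
The coverage map reduces to a few set operations. The sketch below assumes the two inventories have already been normalized to the same path form; build_coverage_map and its argument shapes are hypothetical names chosen for illustration.

```python
def build_coverage_map(example_files: set, test_imports: dict) -> dict:
    """Cross-reference examples against the example paths each test imports.

    example_files: set of example paths found in Step 1.
    test_imports:  mapping of test file path -> set of example paths it imports.
    """
    imported = set().union(*test_imports.values())
    orphaned = {
        test: sorted(set(paths) - example_files)
        for test, paths in test_imports.items()
        if set(paths) - example_files
    }
    return {
        "tested": sorted(example_files & imported),
        "untested": sorted(example_files - imported),
        "orphaned": orphaned,
    }
```

The three keys map directly onto the Coverage Summary, Untested Examples, and Orphaned Tests sections of the report.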

Step 4: Check for Structural Issues

Look for:

  1. Empty directories: Topic directories with no files
  2. Inconsistent naming: Files that don't follow the language's naming conventions (check the CLAUDE.md in the language's driver directory for the expected file naming pattern — e.g., kebab-case.js for JS examples)
  3. Missing setup files: Tests that reference setup functions but the setup file doesn't exist
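
Two of these checks are easy to automate. The following sketch finds empty directories and JavaScript example files that are not kebab-case; the helper names are hypothetical, and the kebab-case regex is an assumption based on the kebab-case.js example above, so confirm the real convention in the suite's CLAUDE.md first.

```python
import re
from pathlib import Path

# Assumption: JS example files are lowercase words joined by hyphens.
KEBAB_CASE_JS = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*\.js$")


def find_empty_dirs(root: Path) -> list[str]:
    """Directories under root that contain no entries at all."""
    return sorted(
        d.relative_to(root).as_posix()
        for d in root.rglob("*")
        if d.is_dir() and not any(d.iterdir())
    )


def find_misnamed_js(root: Path) -> list[str]:
    """JS files under root whose names are not kebab-case."""
    return sorted(
        f.relative_to(root).as_posix()
        for f in root.rglob("*.js")
        if not KEBAB_CASE_JS.match(f.name)
    )
```

Point both helpers at the examples directory only; test files such as insert.test.js contain an extra dot and would be reported as misnamed by this regex.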

Step 5: Spot-Check Patterns

Review 5-10 example and test files — the 3 most recently modified (by git log --diff-filter=M --name-only -20 -- examples/ tests/), plus any files flagged in earlier steps. For each file, check:

  1. Consistent test lifecycle: Search for beforeAll|beforeEach|afterAll|afterEach across all test files. Flag if the suite mixes beforeAll and beforeEach for the same purpose (e.g., both used for DB setup in different files).
  2. Cross-topic imports: Search test files for imports from a different topic directory (e.g., a test in tests/crud/ importing from examples/indexes/). These create fragile dependencies.
  3. Config sanity: Check package.json (JS), go.mod (Go), or pom.xml (Java) and verify the main/entry point and test script are correct.

Add findings to the report under Additional Findings. If nothing stands out, skip this section.

Step 6: Report

Present findings in a structured format headed with Skill: grove-maintain:

## Audit Report: JavaScript/Node.js Driver

### Coverage Summary
- Total examples: X
- Tested: Y (Z%)
- Untested: A

### Untested Examples
| File | Exported Functions |
|------|-------------------|
| examples/crud/insert/bulk-insert.js | bulkInsertExample |

### Orphaned Tests
| Test File | Missing Import |
|-----------|---------------|
| tests/old/legacy.test.js | ../examples/old/legacy.js |

### Structural Issues
- Empty directory: examples/deprecated/

Upgrade Mode

Check for outdated dependencies, assess risk, and apply updates. Test failures after an upgrade usually mean examples need updating to reflect the new API — not that the upgrade should be reverted.

Step 0: Preflight

Before installing or running anything, verify the environment is in a state where a green baseline is possible. Skipping this step is the single biggest time-sink in practice — a hung test run caused by a stopped database or a missing venv looks identical to a real regression.

  1. Local workspace: Does the language's install target exist?

    • Python: test -d ./venv → if missing, create with python3 -m venv ./venv and ./venv/bin/pip install -r requirements.txt before proceeding. Tell the user you're creating it.
    • JavaScript: test -d node_modules → if missing, run npm install.
    • Java: the driver modules depend on a locally-installed comparison-library. From code-example-tests/java/, run mvn install -DskipTests once before any per-module test command — skipping this produces "cannot resolve symbol" errors that look like upgrade breakage but are really a missing local artifact.
    • Go/C#: package manager handles this, but note if the first command triggers a large download.
  2. Database reachability: Nearly all Grove test suites require a running MongoDB. Confirm with a driver-level ping rather than a port check — nc -zv only proves something is listening on the port, not that MongoDB is healthy or that the connection string and credentials resolve. A driver ping works identically for mongodb://localhost and Atlas SRV strings, so use one universal command per language. Examples (substitute the suite's actual URI env var):

    • Python: ./venv/bin/python -c "import os; from pymongo import MongoClient; MongoClient(os.environ['MONGODB_URI']).admin.command('ping')"
    • JavaScript: node -e "const {MongoClient}=require('mongodb');const c=new MongoClient(process.env.MONGODB_URI);c.db().admin().ping().then(r=>console.log(r)).finally(()=>c.close())"
    • Other languages: any one-liner that opens a client and runs db.adminCommand({ping: 1}).

    A successful ping returns in ~1 second; failure beats a 20-minute hang. If unreachable, stop and ask the user to start their DB or fix the connection string. Do not proceed.

  3. Baseline smoke test: Pick a single test file that exercises one minimal driver operation against a stable fixture — e.g., a basic find or countDocuments against an Atlas sample collection like sample_mflix.movies, or the suite's connection-test file if one exists. Avoid files with heavy per-test seeding, network mocks, or multi-collection setup; you want wall-clock time dominated by driver/IO, not fixture work. Run it on the current (pre-upgrade) pins and record the elapsed time — this is your reference for Step 4's regression heuristic. If the baseline already fails, stop: an upgrade can't fix a pre-existing break, and a failing baseline will make post-upgrade triage ambiguous.

Step 1: Check for Outdated Dependencies

Run the language's dependency check command:

Language    Command
JavaScript  cd code-example-tests/javascript/driver && npm outdated
Python      cd code-example-tests/python/pymongo && ./venv/bin/pip list --outdated
Go          cd code-example-tests/go/driver && go list -m -u all
Java        cd code-example-tests/java/driver-sync && mvn versions:display-dependency-updates
C#          cd code-example-tests/csharp/driver && dotnet list package --outdated

Capture the full list, not just the package the user mentioned. Even if the request is "upgrade pymongo," review every outdated direct dependency (ignore transitives — they move with their parent). Related tooling bumps (linters, test frameworks, dotenv libraries) are often cheap to bundle into the same PR and avoid a second round of environment churn. Present the full set to the user in Step 3 so they can decide the scope, rather than narrowing silently.

Step 2: Assess Risk

For each outdated dependency, categorize and investigate:

  • Critical (MongoDB driver, test framework): May require test updates. Check the package's release notes. Flag any entries marked "BREAKING" or "major".
  • Tooling (Bluehawk, linters, formatters): Check release notes for config format changes (e.g., new required fields). Safe to update if no config changes are noted.
  • Transitive: Usually handled by lockfile update. No investigation needed unless a security advisory is involved.
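
The categorization can be sketched as a small lookup. The package sets below are illustrative only, the "Unclassified" fallback is a hypothetical addition for direct dependencies that fit neither named group, and both function names are made up for this example.

```python
import re

# Illustrative groupings only; extend them to match the suite's real dependencies.
CRITICAL = {"mongodb", "pymongo", "go.mongodb.org/mongo-driver", "jest"}
TOOLING = {"bluehawk", "prettier", "black"}


def classify(package: str, is_direct: bool) -> str:
    """Map an outdated dependency to one of the risk categories above."""
    if not is_direct:
        return "Transitive"
    name = package.lower()
    if name in CRITICAL:
        return "Critical"
    if name in TOOLING:
        return "Tooling"
    return "Unclassified"  # hypothetical fallback: needs a manual look


def flagged_release_note_lines(notes: str) -> list[str]:
    """Return release-note lines that mention BREAKING or major."""
    return [
        line.strip()
        for line in notes.splitlines()
        if re.search(r"\bBREAKING\b|\bmajor\b", line, re.IGNORECASE)
    ]
```

The flagged lines are only a starting point for reading the release notes, not a substitute for it.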

Use these paths to locate release notes:

Package                                  Release Notes Location
MongoDB Node.js Driver (mongodb)         npm info mongodb repository.url → GitHub releases
PyMongo                                  pip show pymongo → Home-page → GitHub releases
Go Driver (go.mongodb.org/mongo-driver)  GitHub releases at mongodb/mongo-go-driver
Java Driver                              GitHub releases at mongodb/mongo-java-driver
C# Driver                                GitHub releases at mongodb/mongo-csharp-driver
Jest                                     GitHub releases at jestjs/jest
JUnit 5                                  GitHub releases at junit-team/junit5
NUnit                                    GitHub releases at nunit/nunit
Bluehawk                                 GitHub releases at mongodb-university/Bluehawk

For other JavaScript packages, run npm info <package> repository.url to find the repo URL, then check its releases page.

Step 3: Propose Update Plan

Present a table of all outdated direct dependencies (from Step 1) with their risk assessments (from Step 2), and ask the user to pick the scope. Wait for approval before applying:

| Package | Current | Latest | Risk | Notes |
|---------|---------|--------|------|-------|
| mongodb | 7.1.0 | 7.2.0 | Critical | New aggregation operators |
| jest | 30.2.0 | 30.3.0 | Tooling | Patch release |
| bluehawk | 1.6.0 | 1.7.0 | Tooling | New tag support |

If the user's original request named only one package, still surface the full table. Explicitly ask whether to (a) bundle everything into one PR, (b) do only the named package, or (c) split tooling bumps into a separate follow-up. Do not silently expand the scope — but do not silently narrow it either.

Step 4: Apply Updates and Fix Examples

For each approved update:

  1. Update the dependency version in the config file
  2. Run the install command (npm install, pip install, etc.)
  3. Run the full test suite. Compare wall-clock runtime against the Step 0 baseline. If the suite takes >5× the baseline (or >5 min when baseline was seconds), stop and investigate — this almost always means tests are hitting a connection timeout (DB down, network, wrong URI) rather than a real regression. Run a single test file in the foreground to surface the traceback instead of waiting out the full suite.
  4. If all tests pass: Regenerate snippets by running the snip command for the target language (see the CLAUDE.md in the language's driver directory for the exact command). Note: for pure dependency bumps with no example source changes, snip is expected to produce zero content diffs — that is success, not a failure.
  5. If 1-3 tests fail: Investigate each. These are usually localized API changes (renamed method, changed default, new required option). Update the example code and expected output to match the new API, then re-run to confirm. Once tests pass, regenerate snippets.
  6. If 4+ tests fail: Report the scope to the user with the failure count, affected files, and relevant release notes. Let the user decide whether to fix inline or plan a dedicated pass. Do NOT automatically roll back — the old examples are now inaccurate against the version readers will be using.
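
The decision points in this list can be written down as two small functions. This is a sketch with hypothetical names; it encodes only the thresholds stated above (5× the baseline, 1-3 failures, 4+ failures) and returns a description of the next action rather than performing it.

```python
def runtime_looks_hung(baseline_s: float, elapsed_s: float) -> bool:
    """True when the run exceeds 5x the Step 0 baseline.

    For a baseline measured in seconds (under a minute), any run over
    5 minutes already exceeds 5x the baseline, so one comparison is enough.
    """
    return elapsed_s > 5 * baseline_s


def next_action(failed_tests: int) -> str:
    """Pick the follow-up based on how many tests failed after the upgrade."""
    if failed_tests == 0:
        return "regenerate snippets"
    if failed_tests <= 3:
        return "investigate and fix each failure, then regenerate snippets"
    return "report scope to the user; do not roll back automatically"
```

If runtime_looks_hung returns True, stop and re-check the database connection before interpreting any test failures.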

Step 5: Update Version References in Documentation

After all approved upgrades are applied and tests pass, update hardcoded version numbers in the suite's documentation files:

  1. CLAUDE.md: Grep the suite's CLAUDE.md (e.g., code-example-tests/javascript/driver/CLAUDE.md) for the old version string and replace with the new version.
  2. Convention reference files: Grep the matching convention file at .claude/skills/grove-create/references/conventions-{language}.md for the old version string. These files typically do not hardcode versions, but check to be safe.

Only update version strings that refer to the package that was actually upgraded — do not blindly replace all occurrences of a version number.
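
One way to keep the replacement scoped is to touch a version string only when the package name appears on the same line. The sketch below uses that heuristic; it is an assumption about how the documentation is written, the function name is hypothetical, and the result should still be reviewed as a diff before committing.

```python
import re


def update_version_refs(text: str, package: str, old: str, new: str) -> str:
    """Replace old with new only on lines that mention the package by name."""
    # Guard against partial matches such as 17.1.0 or 7.1.0.1.
    pattern = re.compile(rf"(?<![\d.]){re.escape(old)}(?!\.?\d)")
    updated = []
    for line in text.splitlines(keepends=True):
        if package.lower() in line.lower():
            line = pattern.sub(lambda _match: new, line)
        updated.append(line)
    return "".join(updated)
```

Lines that mention the same version number for a different package are left untouched.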

Step 6: Workspace Cleanup

After the commit or PR, ask the user whether to remove any local workspace artifacts created during the upgrade (e.g., a venv you created in Step 0, a scratch log file, a freshly pulled node_modules). Do not remove them unilaterally — writers often keep local environments for subsequent work, and the ignore file typically excludes them from git anyway. A one-line prompt like "keep the venv, or remove it?" is enough.


Cleanup Mode

Find and remove dead code, fix formatting, and clean up structural issues. Do NOT execute any cleanup action without user approval.

Step 1: Find Unused Files

  1. Unused output files: Output files not referenced by any test's shouldMatch() or shouldResemble() call
  2. Unused setup files: Setup files not imported by any test
  3. Dead example files: Files in examples/ that are both untested AND not referenced by any snippet in content/code-examples/tested/

Step 2: Check Formatting

Run the language's formatter in check mode:

Language    Command
JavaScript  cd code-example-tests/javascript/driver && npx prettier --check examples/
Python      cd code-example-tests/python/pymongo && python -m black --check examples/
Go          gofmt -l examples/

Report any files with formatting issues.

Step 3: Check for Common Anti-Patterns

Grep example files for these specific anti-patterns:

Anti-pattern                  Grep pattern (language-adjusted)
Hardcoded connection strings  mongodb(\+srv)?:// not wrapped in process.env / os.environ / equivalent
Missing resource cleanup      Files with MongoClient( but no client.close() / defer client.Disconnect / using
Empty catch blocks            catch.*\{\s*\} / except.*:\s*pass
Lingering TODO markers        TODO|FIXME|HACK

Report each hit with file path and line number. Do not modify — Step 4 gathers approvals.
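
A line-by-line scan along these lines might look like the sketch below. It reuses the grep patterns from the table, covers only the JavaScript, Python, and Go cleanup hints, and reports hits without modifying anything; scan_file and the label strings are hypothetical names.

```python
import re
from pathlib import Path

# Patterns taken from the anti-pattern table; matched one line at a time, like grep.
LINE_PATTERNS = {
    "hardcoded connection string": re.compile(r"mongodb(\+srv)?://"),
    "empty catch block": re.compile(r"catch.*\{\s*\}|except.*:\s*pass"),
    "lingering TODO marker": re.compile(r"TODO|FIXME|HACK"),
}
ENV_HINTS = ("process.env", "os.environ")
CLEANUP_HINTS = ("client.close()", "defer client.Disconnect")


def scan_file(path: Path) -> list[tuple[str, int, str]]:
    """Return (file path, line number, anti-pattern label) for each hit."""
    text = path.read_text(encoding="utf-8")
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in LINE_PATTERNS.items():
            if not pattern.search(line):
                continue
            # A URI on a line that also reads from the environment is not flagged.
            if label == "hardcoded connection string" and any(
                hint in line for hint in ENV_HINTS
            ):
                continue
            hits.append((path.as_posix(), lineno, label))
    # File-level check: a client is created but never closed.
    index = text.find("MongoClient(")
    if index != -1 and not any(hint in text for hint in CLEANUP_HINTS):
        hits.append((path.as_posix(), text.count("\n", 0, index) + 1, "missing resource cleanup"))
    return hits
```

Each tuple carries the file path and line number needed for the report, so the results can be listed directly as proposed actions.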

Step 4: Propose Actions

Present cleanup actions and let the user approve each:

## Proposed Cleanup Actions

1. [ ] Delete unused output file: examples/old/legacy-output.txt
2. [ ] Delete empty directory: examples/deprecated/
3. [ ] Format 3 files with Prettier: examples/crud/insert/bulk.js, ...
4. [ ] Fix hardcoded URI in: examples/connect/basic.js

Execute only approved actions.

Step 5: Freeform Findings

After the structured checks, browse 5-10 example and test files — prioritizing files most recently modified (by git log date), then any files flagged in earlier steps — for issues the checklist didn't cover (e.g., broken config files, inconsistent patterns, duplicate logic, dead variables). Add any findings under an Additional Findings heading in the report. Skip if nothing stands out.

Step 6: Report

Summarize what was cleaned up and what remains.


Edge Cases

  • Empty suite: If the language's examples directory has no files, report that and stop — do not treat an empty directory as a coverage failure. Languages may be scaffolded before any examples exist.
  • Generated files under content/code-examples/tested/: These are build outputs of snip.js. Never flag them as untested or as cleanup candidates. They regenerate from examples/ and must not be edited directly.
  • Tests that import from tests/ rather than examples/: These are shared test helpers, not orphaned tests. Do not flag them as orphaned in audit mode.
  • Cross-topic imports that are intentional: Shared setup utilities (e.g., a common sample-data loader) legitimately live outside a single topic. Flag them to the user rather than auto-marking them as fragile.
  • Offline or private-registry failures during dependency check: If npm outdated, pip list --outdated, or equivalent fails with a network error, report the failure and ask the user to verify network/VPN access before retrying. Do not proceed as if there are no outdated packages.
  • Formatter not installed: If prettier, black, or gofmt is not available, skip that language's formatting check and note it in the report — do not block the rest of the cleanup pass.
  • Mongosh has no exported functions: Mongosh examples are raw shell commands — skip the export-pattern grep in audit Step 1 and cross-ref tests via the outputFromExampleFiles\(\[ pattern only.
Repository
mongodb/docs