Investigate OpenClaw pnpm test memory growth, Vitest OOMs, RSS spikes, and heap snapshot deltas.
58
66%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./.agents/skills/openclaw-test-heap-leaks/SKILL.md
Quality
Discovery
54%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is highly specific to a narrow project and problem domain (OpenClaw test memory issues), which gives it excellent distinctiveness and good trigger terms. However, it lacks an explicit 'Use when...' clause and reads more as a topic label than a description of concrete actions the skill performs, weakening its completeness and specificity.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when debugging memory leaks in OpenClaw tests, when Vitest runs out of memory, or when analyzing heap snapshots for test suites.'
Describe concrete actions the skill performs, e.g., 'Captures heap snapshots, compares RSS before/after test runs, identifies leaking allocations, and recommends Vitest configuration changes to prevent OOMs.'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (OpenClaw pnpm test memory issues) and some actions (investigate memory growth, OOMs, RSS spikes, heap snapshot deltas), but these are more like symptoms/topics than concrete actions the skill performs. | 2 / 3 |
| Completeness | Describes what (investigate memory issues) but has no explicit 'Use when...' clause or equivalent trigger guidance. Per the rubric, a missing 'Use when...' clause caps completeness at 2, and the 'what' is also somewhat weak, making this a 1. | 1 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would actually say: 'memory growth', 'Vitest OOMs', 'RSS spikes', 'heap snapshot deltas', 'pnpm test', and the project name 'OpenClaw'. These are specific terms a developer would use when encountering these issues. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive due to the specific project name 'OpenClaw', specific tooling (pnpm, Vitest), and narrow problem domain (test memory growth, OOMs). Very unlikely to conflict with other skills. | 3 / 3 |
| Total | | 9 / 12 Passed |
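
The completeness cap cited in the table can be sketched as a small function. This only illustrates the rule as the review states it (a missing 'Use when...' clause caps completeness at 2). The function name `capCompleteness`, the regular expression, and the revised description string are assumptions for illustration, not Tessl's actual scoring code.

```typescript
// Hypothetical sketch of the rubric rule quoted in the review: without a
// "Use when..." clause, the completeness score cannot exceed 2.
// `capCompleteness` and the regex are illustrative, not part of Tessl.
function capCompleteness(description: string, rawScore: number): number {
  const hasUseWhen = /\buse when\b/i.test(description);
  return hasUseWhen ? rawScore : Math.min(rawScore, 2);
}

// The skill's current description, as shown at the top of this review.
const current =
  "Investigate OpenClaw pnpm test memory growth, Vitest OOMs, RSS spikes, and heap snapshot deltas.";

// An illustrative revision that appends a trigger clause.
const revised =
  current + " Use when Vitest runs out of memory in OpenClaw tests.";

console.log(capCompleteness(current, 3)); // cap applies
console.log(capCompleteness(revised, 3)); // cap does not apply
```

A cap only sets an upper bound. The review scores completeness at 1 because the 'what' is also judged weak, so adding the clause alone would not necessarily raise the score to 3.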
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, highly actionable skill tailored to a specific codebase's memory debugging workflow. Its greatest strengths are the concrete commands, clear decision framework for classifying memory growth, and well-sequenced workflow with validation checkpoints. Minor weaknesses include some redundancy between the heuristics and workflow sections, and the content could benefit from splitting the runtime-fix validation into a separate file for better progressive disclosure.
Suggestions
Consider extracting the 'Validating runtime fixes' section into a separate referenced file (e.g., RUNTIME-LEAK-HARNESS.md) since it covers a distinct use case from the main test-memory workflow.
Trim the Heuristics section to remove points already stated in the workflow steps (e.g., the point about missing timings files is covered in step 4).
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is mostly efficient and domain-specific, but some sections are slightly verbose: the heuristics section restates points already covered in the workflow, and some bullet points could be tightened. However, it largely avoids explaining things Claude already knows and stays focused on repo-specific knowledge. | 2 / 3 |
| Actionability | The skill provides exact commands (pnpm invocations with specific env vars), concrete script paths, specific file paths for config/timings, exact CLI flags for the delta script, and clear decision criteria for classifying growth types. Guidance is copy-paste ready and specific to this codebase. | 3 / 3 |
| Workflow Clarity | The 5-step workflow is clearly sequenced with explicit validation checkpoints (step 2: wait for repeated snapshots, step 3: classify before fixing, step 5: verify with direct proof). It includes feedback loops, e.g., confirming with retainers/dominators before declaring root cause, and falling back to RSS if snapshot overhead is too high. The classification step (step 3) acts as a decision gate before the fix step. | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear sections (Workflow, Heuristics, Snapshot Comparison, Validating runtime fixes, Output Expectations), but it is a fairly long single file (100+ lines of substantive content). The runtime fix validation section could be a separate referenced file. No bundle files are provided, so there are no external references to evaluate, but the skill references scripts that presumably exist in the bundle directory. | 2 / 3 |
| Total | | 10 / 12 Passed |
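
The RSS fallback mentioned under Workflow Clarity can be illustrated with a minimal sketch that turns a series of RSS readings into per-step deltas. The `RssSample` and `RssDelta` types, the `rssDeltas` function, and the sample numbers are all hypothetical. The skill's own scripts and thresholds are not shown in this review and may work differently.

```typescript
// Hypothetical sketch: compute RSS growth between consecutive checkpoints.
// None of these names come from the skill under review.
interface RssSample {
  label: string;
  rssBytes: number;
}

interface RssDelta {
  from: string;
  to: string;
  deltaBytes: number;
}

function rssDeltas(samples: RssSample[]): RssDelta[] {
  const deltas: RssDelta[] = [];
  let prev: RssSample | undefined;
  for (const cur of samples) {
    if (prev !== undefined) {
      deltas.push({
        from: prev.label,
        to: cur.label,
        deltaBytes: cur.rssBytes - prev.rssBytes,
      });
    }
    prev = cur;
  }
  return deltas;
}

const MiB = 1024 * 1024;

// Made-up readings for illustration only, not measurements from OpenClaw.
const samples: RssSample[] = [
  { label: "start", rssBytes: 300 * MiB },
  { label: "after-file-1", rssBytes: 320 * MiB },
  { label: "after-file-2", rssBytes: 350 * MiB },
];

for (const d of rssDeltas(samples)) {
  console.log(`${d.from} -> ${d.to}: ${d.deltaBytes / MiB} MiB`);
}
```

In a Node.js process, readings of this kind could come from `process.memoryUsage().rss`. The sketch takes samples as plain input so it stays independent of how they are collected.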
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
65a805a