Captures learnings, errors, and corrections to enable continuous improvement. Use when: (1) A command or operation fails unexpectedly, (2) User corrects Claude ('No, that's wrong...', 'Actually...'), (3) User requests a capability that doesn't exist, (4) An external API or tool fails, (5) Claude realizes its knowledge is outdated or incorrect, (6) A better approach is discovered for a recurring task. Also review learnings before major tasks.
Overall score: 81

| Dimension | Result | Notes |
|---|---|---|
| Quality | 77% | Does it follow best practices? |
| Impact | Pending | No eval scenarios have been run |
| Validation | Passed | No known issues |
Optimize this skill with Tessl:

```
npx tessl skill review --optimize ./clawdbot/self-improving-agent/SKILL.md
```

Quality
Discovery
89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description with excellent trigger coverage and completeness. The 'Use when' clause is particularly well-crafted with six specific, natural scenarios. The main weakness is that the 'what' portion could be more specific about the concrete actions performed (e.g., where learnings are stored, what format they take).
Suggestions
Add more specific concrete actions to the 'what' portion, e.g., 'Logs errors to a learnings file, records user corrections, and indexes solutions for future reference' instead of the more abstract 'captures learnings, errors, and corrections'.
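One way the revised description could read in the SKILL.md frontmatter. This is a sketch only; the `name` value and field layout are assumed from the skill path shown above, not taken from the actual file:

```yaml
---
name: self-improving-agent
description: >
  Logs errors to a learnings file, records user corrections, and indexes
  solutions for future reference. Use when: (1) a command or operation fails
  unexpectedly, (2) the user corrects Claude ('No, that's wrong...',
  'Actually...'), (3) the user requests a capability that doesn't exist,
  (4) an external API or tool fails, (5) knowledge turns out to be outdated
  or incorrect, (6) a better approach is discovered for a recurring task.
  Also review learnings before major tasks.
---
```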
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain (learnings, errors, corrections) and the general action (captures, enables continuous improvement), but the specific concrete actions are somewhat vague — 'captures learnings' and 'review learnings' are not as concrete as listing specific operations like 'logs error messages to a file' or 'updates a knowledge base with corrections'. | 2 / 3 |
| Completeness | Clearly answers both 'what' (captures learnings, errors, and corrections for continuous improvement) and 'when' with an explicit, detailed 'Use when:' clause listing six specific trigger scenarios plus additional guidance to review learnings before major tasks. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger phrases users would actually say: 'No, that's wrong...', 'Actually...', 'command fails', 'API fails', 'better approach', 'outdated', 'incorrect'. These are realistic conversational patterns that would naturally trigger skill selection. | 3 / 3 |
| Distinctiveness / Conflict Risk | This skill occupies a clear niche — meta-learning and error correction — that is unlikely to conflict with task-specific skills. The trigger scenarios (user corrections, failed operations, outdated knowledge) are distinctive and wouldn't overlap with typical domain-specific skills. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation
64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid, actionable skill with well-defined record types and concrete templates that Claude can immediately use. Its main weaknesses are moderate verbosity in the repetitive template structures and a thin workflow section that lacks validation checkpoints for file operations. The safety boundaries section is a strong addition that provides clear guardrails.
Suggestions
Add a validation step to the workflow, e.g., after logging an entry, verify the file was written and the entry ID is unique within the file.
Consider condensing the three record type templates by showing a shared base structure once and only highlighting the fields that differ per type, reducing repetition.
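The suggested validation step could be sketched as follows. The file path, the entry layout, and the exact semantics of the `LRN-YYYYMMDD-XXX` id scheme are assumptions drawn from the scorecard below, not the skill's actual implementation:

```python
import re
from datetime import date
from pathlib import Path

# Ids follow the LRN-YYYYMMDD-XXX convention mentioned in the scorecard.
ID_PATTERN = re.compile(r"LRN-(\d{8})-(\d{3})")

def next_entry_id(text: str, today: str) -> str:
    """Generate the next id for today, unique within the file's existing ids."""
    used = {int(m.group(2)) for m in ID_PATTERN.finditer(text)
            if m.group(1) == today}
    return f"LRN-{today}-{max(used, default=0) + 1:03d}"

def log_learning(path: Path, body: str) -> str:
    """Append a learning entry, then re-read the file to verify the write."""
    text = path.read_text() if path.exists() else ""
    today = date.today().strftime("%Y%m%d")
    entry_id = next_entry_id(text, today)
    with path.open("a") as f:
        f.write(f"\n## {entry_id}\n\n{body}\n")
    # Validation checkpoint: confirm the entry landed and the id is unique.
    written = path.read_text()
    assert written.count(entry_id) == 1, f"duplicate or missing id {entry_id}"
    return entry_id
```

Re-reading the file after the append is cheap and catches both silent write failures and id collisions in one check.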
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably well-structured but includes some verbose template fields (e.g., full markdown templates with every metadata field spelled out) that could be tightened. The record type templates are somewhat repetitive across the three types, though each does carry distinct information. | 2 / 3 |
| Actionability | The skill provides concrete, copy-paste-ready markdown templates for each record type, specific file paths for storage, clear naming conventions (LRN-YYYYMMDD-XXX), and explicit field values (priority levels, status values, area categories). Claude can immediately act on this guidance. | 3 / 3 |
| Workflow Clarity | The workflow section lists steps but lacks explicit validation checkpoints. For a system that involves creating/updating files and promoting learnings, there's no verification step to confirm entries were written correctly or that status updates are consistent. The promotion rules are clear but the core workflow is thin. | 2 / 3 |
| Progressive Disclosure | All content is inline in a single file, which is borderline appropriate given the length (~100 lines). The three record type templates could potentially be split into a reference file to keep the main skill leaner, but the current organization with clear headers is functional. No external references are provided or signaled. | 2 / 3 |
| Total | | 9 / 12 Passed |
Validation
90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
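The flagged warning can typically be resolved by moving non-standard keys under `metadata`. A hypothetical before/after sketch; the report does not name the offending key, so `owner` here is invented for illustration:

```yaml
# Before: `owner` is not a recognized frontmatter key
---
name: self-improving-agent
description: Captures learnings, errors, and corrections...
owner: clawdbot
---

# After: unknown keys nested under `metadata`
---
name: self-improving-agent
description: Captures learnings, errors, and corrections...
metadata:
  owner: clawdbot
---
```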