Captures learnings, errors, corrections, and feature requests to enable continuous improvement. Use when: (1) User corrects Claude ('No, that's wrong...', 'Actually...'), (2) User requests a capability that doesn't exist, (3) Claude realizes its knowledge is outdated or incorrect, (4) A better approach is discovered for a recurring task, (5) Receiving a Handoff block from self-healing (a recurring verified heal at Recurrence-Count >= 3) to distill into a memory file or new skill. For ACTIVE runtime failures where the agent needs to apply and verify a fix mid-task, use `self-healing` instead (it files HEAL- entries with proof; self-improvement promotes accumulated patterns). Also review learnings before major tasks. For CI-only/headless learning capture, use self-improvement-ci.
82
75%
Does it follow best practices?
Impact
97%
2.48x
Average score across 3 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/self-improvement/SKILL.md
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that thoroughly covers what the skill does, when to use it, and critically, when NOT to use it by pointing to sibling skills. The numbered trigger scenarios provide clear, actionable guidance for skill selection, and the natural language examples ('No, that's wrong...', 'Actually...') make trigger matching intuitive. The description is detailed without being padded, and uses appropriate third-person voice throughout.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: captures learnings, errors, corrections, and feature requests. Also specifies distilling into memory files or new skills, and reviewing learnings before major tasks. | 3 / 3 |
| Completeness | Clearly answers both 'what' (captures learnings, errors, corrections, feature requests for continuous improvement) and 'when' with an explicit numbered list of five trigger scenarios plus boundary conditions distinguishing it from related skills like self-healing and self-improvement-ci. | 3 / 3 |
| Trigger Term Quality | Includes highly natural trigger phrases users would actually say: 'No, that's wrong...', 'Actually...', 'capability that doesn't exist', 'outdated or incorrect', 'better approach'. These are realistic conversational patterns that map well to user intent. | 3 / 3 |
| Distinctiveness Conflict Risk | Explicitly differentiates itself from 'self-healing' (active runtime failures) and 'self-improvement-ci' (CI-only/headless), creating clear boundaries. The specific trigger scenarios and the handoff mechanism from self-healing make it highly distinct. | 3 / 3 |
| Total | 12 / 12 Passed | |
Implementation
50%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is highly actionable with excellent concrete templates, commands, and examples, but is severely undermined by its verbosity. At 400+ lines, it includes extensive content that Claude doesn't need explained (detection trigger phrases, priority/area definitions, multi-agent setup details) and inlines content that should be in reference files. The core workflow of log → detect recurrence → promote is sound but buried under excessive detail.
Suggestions
Cut content by 50-60%: remove detection trigger phrases, priority/area tag tables, gitignore options, and multi-agent setup details — move these to reference files or drop entirely since Claude can infer them.
Move hook integration, skill extraction, and multi-agent support sections to separate reference files (e.g., references/hooks-setup.md, references/skill-extraction.md) and link from the main SKILL.md.
Add an explicit validation step after logging entries: e.g., 'Verify: grep for the new entry ID to confirm it was appended correctly and no duplicate IDs exist.'
Consolidate the three entry templates into a single compact format showing the shared structure once, with only the differing fields called out per type.
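The validation step suggested above (confirm the new entry was appended and that no entry ID appears twice) could be sketched roughly as follows. This is only an illustration: the review does not show the log format, so the heading shape `## [TYPE-YYYYMMDD-NNN]` and the helper name `verify_entry` are assumptions, not the skill's actual conventions.

```python
import re
from collections import Counter

# ASSUMPTION: each entry starts with a heading such as "## [LRN-20250114-001]".
# The real skill may use a different ID shape; adjust the pattern to match.
ENTRY_HEADING = re.compile(r"^## \[((?:LRN|ERR|FEAT)-\d{8}-\d{3})\]", re.MULTILINE)


def verify_entry(log_text: str, new_id: str) -> list[str]:
    """Return a list of problems found; an empty list means the log looks fine."""
    # Count only entry headings, so cross-references in entry bodies are ignored.
    counts = Counter(ENTRY_HEADING.findall(log_text))
    problems = []
    if counts[new_id] == 0:
        problems.append(f"{new_id} was not appended")
    duplicates = sorted(entry_id for entry_id, n in counts.items() if n > 1)
    if duplicates:
        problems.append("duplicate IDs: " + ", ".join(duplicates))
    return problems
```

Matching headings rather than every occurrence of an ID is a deliberate choice here, since an entry body may legitimately mention another entry's ID.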
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~400+ lines. Explains concepts Claude already knows (what corrections look like, what feature requests are, priority definitions, area tag definitions). Includes installation instructions, gitignore options, multi-agent setup, hook configuration, and skill extraction workflows that bloat the file massively. Detection triggers listing phrases like 'No, that's not right...' are unnecessary for Claude. | 1 / 3 |
| Actionability | Provides fully concrete, copy-paste-ready templates for every entry type (LRN, ERR, FEAT), executable bash commands for setup and review, complete JSON configurations for hooks, and specific grep commands for status checks. The guidance is highly specific and immediately usable. | 3 / 3 |
| Workflow Clarity | The quick reference table provides good routing logic, and individual workflows (logging, promoting, resolving) are clear. However, the overall flow between logging → detecting recurrence → promoting lacks explicit validation checkpoints. The simplify-and-harden ingestion workflow has good sequencing but the main logging workflow doesn't verify entries were correctly formatted or deduplicated before committing. | 2 / 3 |
| Progressive Disclosure | References to external files exist (references/hooks-setup.md, references/openclaw-integration.md, assets/SKILL-TEMPLATE.md) which is good progressive disclosure. However, no bundle files were provided to verify these exist, and the main SKILL.md still contains enormous amounts of inline content (hook configuration, multi-agent setup, skill extraction, gitignore options) that should be in reference files. The document is monolithic despite having some references. | 2 / 3 |
| Total | 8 / 12 Passed | |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (577 lines); consider splitting into references/ and linking | Warning |
| Total | 10 / 11 Passed | |
f6c5d7b