Instinct-based learning system that observes sessions via hooks, creates atomic instincts with confidence scoring, and evolves them into skills/commands/agents.
26% (Does it follow best practices?)
Impact: 89% (1.58x average score across 6 eval scenarios)
Advisory: Suggest reviewing before use
Optimize this skill with Tessl:
`npx tessl skill review --optimize ./docs/zh-TW/skills/continuous-learning-v2/SKILL.md`

Quality
Discovery
17%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is overly abstract and jargon-heavy, reading more like an internal system architecture summary than a skill description meant to help Claude select the right tool. It lacks natural trigger terms users would say, has no explicit 'when to use' guidance, and the technical terminology ('atomic instincts', 'confidence scoring', 'hooks') obscures rather than clarifies the skill's purpose.
Suggestions
- Add a 'Use when...' clause that specifies concrete scenarios, e.g., 'Use when the user wants to automatically learn patterns from sessions and generate reusable skills or commands.'
- Replace jargon with natural language trigger terms users might say, such as 'learn from usage', 'auto-generate skills', 'session patterns', 'create automation from behavior'.
- Describe concrete user-facing actions in plain language, e.g., 'Monitors coding sessions to identify repeated patterns, then generates reusable skills, CLI commands, or agent workflows based on observed behavior.'
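Applied together, the suggestions above might produce frontmatter along the following lines. This is an illustrative sketch, not text from the skill itself; the `name` is inferred from the reviewed path, and the description simply combines the review's own suggested phrasings:

```yaml
# Hypothetical rewrite of the skill's frontmatter description.
# Wording is assembled from the review's suggestions, not the actual skill.
name: continuous-learning-v2
description: >
  Monitors coding sessions to identify repeated patterns, then generates
  reusable skills, CLI commands, or agent workflows from observed behavior.
  Use when the user wants to learn from usage, auto-generate skills, or
  create automation from session patterns.
```

A description in this shape gives the agent both a concrete 'what' and an explicit 'Use when...' trigger, addressing the two lowest-scoring dimensions below.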
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names some actions like 'observes sessions via hooks', 'creates atomic instincts with confidence scoring', and 'evolves them into skills/commands/agents', but these are somewhat abstract and jargon-heavy rather than concrete user-facing actions. | 2 / 3 |
| Completeness | Provides a vague 'what' but completely lacks any 'when should Claude use it' guidance. There is no 'Use when...' clause or equivalent explicit trigger guidance, which per the rubric caps completeness at 2, and the weak 'what' brings it to 1. | 1 / 3 |
| Trigger Term Quality | Uses highly technical jargon like 'atomic instincts', 'confidence scoring', 'hooks' that users would almost never naturally say. No natural trigger terms a user would use when needing this functionality. | 1 / 3 |
| Distinctiveness / Conflict Risk | The concept of 'instinct-based learning' is somewhat unique, but the mention of 'skills/commands/agents' is broad enough to potentially overlap with other meta-skills or automation tools. The jargon provides some distinctiveness but not through clear, well-defined boundaries. | 2 / 3 |
| Total | | 6 / 12 Passed |
Implementation
35%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill describes an ambitious instinct-based learning system but suffers from excessive verbosity and incomplete actionability. It spends many tokens on conceptual explanations, comparisons, and philosophy that Claude doesn't need, while failing to provide the actual implementation code for critical components like the observation hook script and observer agent. The workflow lacks validation checkpoints for what is essentially a background process pipeline.
Suggestions
- Remove the v1 vs v2 comparison table, 'Why Hooks vs Skills' section, backward compatibility notes, and privacy section; these are explanatory content Claude doesn't need to execute the skill.
- Provide the actual implementation of observe.sh and start-observer.sh, or at minimum show the expected input/output format so Claude can create them.
- Add validation steps to the quick start: e.g., 'Verify hooks are working: check that observations.jsonl has entries after your next tool use' and error recovery guidance.
- Move the confidence scoring details, integration notes, and file structure reference into separate linked files to reduce the main SKILL.md to a lean overview.
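A validation checkpoint of the kind suggested above could be as small as the shell sketch below. The path to observations.jsonl is an assumption, since the review names the file but not its location; adjust `OBS_FILE` to match the skill's actual layout:

```shell
#!/bin/sh
# Hypothetical post-setup check that the observation hook is recording events.
# The default OBS_FILE path is an assumption; override it for your setup.
OBS_FILE="${OBS_FILE:-$HOME/.claude/continuous-learning/observations.jsonl}"

if [ -s "$OBS_FILE" ]; then
  # File exists and is non-empty: the hook pipeline is writing observations.
  echo "hooks OK: $(wc -l < "$OBS_FILE") observation(s) recorded"
else
  # No entries yet: either no tool use has occurred or the hook is not registered.
  echo "no observations yet; check the hook entry in settings.json" >&2
fi
```

Running a check like this right after the quick start turns a silent background pipeline into something the user can verify in one command.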
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~200+ lines. It includes extensive explanatory content Claude doesn't need (why hooks vs skills, backward compatibility explanations, v1 vs v2 comparison tables, confidence scoring philosophy). The 'How it works' ASCII diagram, while visually appealing, adds significant token cost for information that could be conveyed more concisely. | 1 / 3 |
| Actionability | The quick start section provides concrete JSON config and bash commands that are copy-paste ready. However, the actual hook script (observe.sh), observer agent script (start-observer.sh), and slash commands are referenced but never defined or shown, so Claude wouldn't know how to create these critical components. The config.json is shown but there's no code for the actual observation/analysis logic. | 2 / 3 |
| Workflow Clarity | The ASCII flow diagram shows the conceptual pipeline clearly, and the quick start has numbered steps. However, there are no validation checkpoints (e.g., how to verify hooks are working, how to confirm observations are being recorded, what to do if the observer agent fails). For a system involving background processes and file manipulation, this is a significant gap. | 2 / 3 |
| Progressive Disclosure | The file structure section and references to external files (config.json, various directories) provide some structure. However, the SKILL.md itself is monolithic: the confidence scoring details, backward compatibility notes, privacy section, and v1-vs-v2 comparison could all be in separate referenced files. The external links at the bottom are helpful but the inline content is bloated. | 2 / 3 |
| Total | | 7 / 12 Passed |
Validation
90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
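The single warning above concerns unrecognized top-level frontmatter keys. One way to resolve it, per the check's own advice, is to nest non-standard fields under a metadata block. The sketch below uses a hypothetical `version` key as the offending field, since the review does not name which key triggered the warning:

```yaml
# Before (sketch): a non-standard top-level key trips frontmatter_unknown_keys.
#   version: 2
# After (sketch): non-standard fields moved under metadata.
name: continuous-learning-v2
description: Observes sessions and evolves learned patterns into skills.
metadata:
  version: 2
```

Clearing this warning would bring validation to a full pass without touching the skill body.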