Curates insights from reflections and critiques into CLAUDE.md using Agentic Context Engineering
Impact — Pending: no eval scenarios have been run.
Validation — Passed: no known issues.

Optimize this skill with Tessl:

`npx tessl skill review --optimize ./plugins/reflexion/skills/memorize/SKILL.md`

Quality — 17%

Does it follow best practices?
Discovery — 7%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description relies heavily on abstract buzzwords ('Agentic Context Engineering', 'curates insights') without specifying concrete actions or when the skill should be triggered. It lacks a 'Use when...' clause and natural user-facing keywords, making it difficult for Claude to reliably select this skill from a pool of alternatives.
Suggestions
Replace vague language with concrete actions, e.g., 'Extracts key lessons from session reflections and critique logs, then appends structured entries to CLAUDE.md preferences and guidelines sections.'
Add an explicit 'Use when...' clause with natural trigger terms, e.g., 'Use when the user asks to update CLAUDE.md, save learnings, record preferences, or incorporate feedback from past sessions.'
Remove or define jargon like 'Agentic Context Engineering' — users will not use this phrase, and it adds no discriminative value for skill selection.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague, buzzword-heavy language like 'curates insights' and 'Agentic Context Engineering' without listing concrete actions. It does not specify what actions are performed (e.g., appending entries, summarizing feedback, updating sections). | 1 / 3 |
| Completeness | The description vaguely addresses 'what' (curates insights into CLAUDE.md) but provides no 'when' clause or explicit trigger guidance. There is no 'Use when...' or equivalent, which per the rubric caps completeness at 2 at best, but the 'what' is also too vague to earn a 2. | 1 / 3 |
| Trigger Term Quality | The terms 'reflections', 'critiques', and 'Agentic Context Engineering' are not natural keywords a user would say. A user is unlikely to request 'curating insights' or mention 'Agentic Context Engineering' in a prompt. 'CLAUDE.md' is somewhat specific but insufficient on its own. | 1 / 3 |
| Distinctiveness / Conflict Risk | The mention of 'CLAUDE.md' provides some specificity that narrows the domain, but 'reflections and critiques' and 'insights' are broad enough to overlap with other skills related to documentation, note-taking, or self-improvement workflows. | 2 / 3 |
| Total | | 5 / 12 — Passed |
Implementation — 27%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is a comprehensive but overly verbose conceptual framework for memory consolidation. Its main weakness is extreme verbosity — it explains many concepts Claude already understands (what curation means, what anti-patterns are, why quality matters) and includes aspirational sections like 'Expected Outcomes' that consume tokens without adding actionable value. The workflow structure is reasonable but would benefit from concrete validation steps and splitting content across multiple files.
Suggestions
Cut content by at least 50%: remove the <role>/<task>/<context> XML preamble, 'Expected Outcomes' section, 'Memory Anti-Patterns to Avoid' examples, and 'Implementation Notes' — Claude already knows these concepts.
Make validation steps concrete: replace abstract quality gates like 'Coherence Check' and 'Actionability Test' with specific, executable checks (e.g., 'grep for duplicate headings', 'verify no bullet exceeds 2 lines').
Split into multiple files: move the CLAUDE.md section template, curation rules, and transformation examples into separate reference files, keeping SKILL.md as a concise overview with clear pointers.
Replace fictional CLI flags (--dry-run, --max=5, --section) with actual implementation or remove them — they suggest functionality that doesn't exist and mislead the agent.
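The "make validation steps concrete" suggestion could be sketched as a short script. This is a minimal sketch, not part of the skill: the function name, the 160-character limit (a rough proxy for "no bullet exceeds 2 lines"), and the sample text are all illustrative assumptions.

```python
import re
from collections import Counter

def check_claude_md(text: str, max_bullet_chars: int = 160) -> list[str]:
    """Return concrete rule violations for a curated CLAUDE.md (illustrative checks)."""
    problems = []

    # Check 1: no duplicate headings.
    headings = re.findall(r"^#{1,6} .*$", text, flags=re.M)
    for heading, count in Counter(headings).items():
        if count > 1:
            problems.append(f"duplicate heading: {heading!r} x{count}")

    # Check 2: no overlong bullets (character count as a proxy for line count).
    for line in text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(("- ", "* ")) and len(stripped) > max_bullet_chars:
            problems.append(f"overlong bullet: {stripped[:40]}...")

    return problems

# Illustrative run against a tiny sample with a duplicated heading.
sample = "# Preferences\n- Prefer Map over Object for keyed lookups\n# Preferences\n"
print(check_claude_md(sample))
```

Checks like these give the Validate phase pass/fail semantics instead of abstract gates such as 'Coherence Check'.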
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~250+ lines. Extensively explains concepts Claude already knows (what curation is, what anti-patterns are, what 'good code' means). The role/task/context XML preamble, 'Expected Outcomes' section, 'Memory Anti-Patterns to Avoid' examples like 'Write good code (not actionable)', and 'Implementation Notes' are all padding that Claude doesn't need. The content could be reduced by 60-70% without losing actionable guidance. | 1 / 3 |
| Actionability | Provides some concrete guidance like the CLAUDE.md section structure, the transformation example (Map vs Object), and the curation rules. However, much of the content is descriptive rather than executable: there are no real bash commands that work, the usage CLI flags are aspirational/fictional, and the workflow is more of a conceptual framework than copy-paste-ready instructions. | 2 / 3 |
| Workflow Clarity | The four-phase workflow (Harvest → Curate → Update → Validate) is clearly sequenced and logically ordered. However, validation steps are described abstractly ('Coherence Check', 'Actionability Test') without concrete verification commands or measurable criteria. The feedback loop for conflicting bullets is mentioned but not operationalized with clear decision rules. | 2 / 3 |
| Progressive Disclosure | Monolithic wall of text with no bundle files to reference. All content, from high-level overview to detailed templates, quality indicators, implementation notes, and expected outcomes, is crammed into a single file. Content like the CLAUDE.md section templates, quality gate checklists, and example transformations could be split into separate reference files. | 1 / 3 |
| Total | | 6 / 12 — Passed |
Validation — 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 Passed |
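Per the warning above, unrecognized top-level keys can be nested under `metadata`. A sketch of what that might look like — the description follows the Discovery suggestions, and the keys shown under `metadata` are hypothetical, not the skill's actual frontmatter:

```yaml
---
name: memorize
description: >-
  Extracts key lessons from session reflections and critique logs, then appends
  structured entries to CLAUDE.md. Use when the user asks to update CLAUDE.md,
  save learnings, record preferences, or incorporate feedback from past sessions.
metadata:
  # Previously top-level unknown keys, moved here (illustrative names).
  version: "0.1.0"
  maintainer: example
---
```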