Use after resolving a bug, failed task, or unexpected agent behavior to improve the pipeline skills, agents, hooks, or scripts that contributed to the problem. Also proactively suggest improvements when recurring patterns or inefficiencies are observed.
81
72%
Does it follow best practices?
Impact
93%
1.25x
Average score across 3 eval scenarios
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./.claude/skills/improvement-loop/SKILL.md
Quality
Discovery
67%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description has good completeness with clear 'what' and 'when' clauses, which is its strongest aspect. However, the actions described are somewhat vague ('improve' rather than specific concrete actions), and the trigger terms, while relevant, could overlap with debugging or CI/CD skills. The description would benefit from more specific actions and more distinctive terminology.
Suggestions
Replace vague verbs like 'improve' with concrete actions such as 'update skill definitions', 'refactor hook logic', 'add error handling to scripts', or 'document failure patterns'.
Add more distinctive trigger terms to reduce conflict risk, such as 'postmortem', 'retrospective', 'pipeline refinement', or 'agent self-improvement' to clearly differentiate from general debugging or CI/CD skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names a domain (improving pipeline skills, agents, hooks, scripts) and some actions (resolve, improve, suggest improvements), but the actions are somewhat vague: 'improve' is not a concrete action like 'refactor', 'update configuration', or 'add error handling'. | 2 / 3 |
| Completeness | The description clearly answers both 'what' (improve pipeline skills, agents, hooks, or scripts) and 'when' (after resolving a bug, failed task, or unexpected agent behavior; also when recurring patterns or inefficiencies are observed). The 'Use after...' and 'Also proactively suggest...' clauses serve as explicit trigger guidance. | 3 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'bug', 'failed task', 'agent behavior', 'pipeline', 'hooks', 'scripts', and 'recurring patterns', but misses common user phrasings like 'postmortem', 'root cause', 'fix', 'debug', 'retrospective', or 'lesson learned'. | 2 / 3 |
| Distinctiveness Conflict Risk | The scope is somewhat specific to post-incident improvement of pipeline components, but terms like 'bug', 'failed task', and 'improve scripts' could overlap with debugging skills, CI/CD skills, or general code improvement skills. | 2 / 3 |
| Total | | 9 / 12 Passed |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-crafted process skill with excellent workflow clarity and actionability — the gate check, five-step cycle, routing table, and anti-drift guardrails provide genuinely useful structure for a complex meta-task. Its main weakness is verbosity: the core message is repeated in multiple forms, and the graphviz diagrams, while visually appealing, consume significant tokens for information that could be conveyed more efficiently. The monolithic structure would benefit from splitting some sections into referenced files.
Suggestions
Reduce repetition of the 'fix first, improve last' principle — state it once prominently and reference it rather than restating in Overview, Gate, Proactive Detection, Red Flags, and Key Insight sections.
Replace the graphviz dot diagrams with compact bullet-point decision trees to save tokens while preserving clarity.
Consider splitting the batching and proactive detection sections into a referenced supplementary file, keeping SKILL.md focused on the core five-step cycle.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably well-structured but verbose for its purpose. The graphviz diagrams add visual clarity but consume tokens for what could be simple bullet lists. Several sections repeat the same core message ('fix first, improve last') multiple times, and the batching/red flags sections overlap with earlier content. Some tables are efficient, but overall it could be tightened by ~30%. | 2 / 3 |
| Actionability | The skill provides highly concrete, actionable guidance: specific git commit message formats, exact conversation templates for asking users, a clear routing table mapping change types to tools/agents, and a detailed five-step cycle with specific verification methods per change type. The classification table and routing table are particularly executable. | 3 / 3 |
| Workflow Clarity | The workflow is exceptionally clear with an explicit mandatory gate check before any improvement work, a well-sequenced five-step cycle, verification steps mapped to each change type, and explicit feedback loops (fix → verify → re-validate). The anti-drift section adds guardrails against common failure modes. Destructive/risky operations (pipeline edits) have appropriate validation checkpoints. | 3 / 3 |
| Progressive Disclosure | The skill references external skills (writing-skills, writing-agents) and agents (cc-orchestration-writer, bash-script-craftsman) appropriately, but all content is inline in a single long file with no references to supplementary documents. The routing table hints at external resources but doesn't link to them. For a skill of this length (~200+ lines), some content (e.g., the batching section, proactive detection templates) could be split into referenced files. | 2 / 3 |
| Total | | 10 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation: 11 / 11 Passed
Validation for skill structure
No warnings or errors.
0d67646