Skill description under review: "Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes"

Overall score: 61%
Impact: Pending. No eval scenarios have been run.

Quality (does it follow best practices?): Passed. No known issues.
Discovery: 14%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description only specifies when to use the skill but entirely omits what the skill actually does, making it incomplete and vague. The trigger terms cover some common scenarios but miss many natural variations. Without concrete actions or a clear niche, this skill would be difficult to distinguish from other debugging or troubleshooting skills.
Suggestions:

- Add concrete actions describing what the skill does, e.g., 'Systematically diagnoses root causes by analyzing stack traces, reproducing issues, and isolating failing components'.
- Expand trigger terms to include common variations like 'error', 'exception', 'crash', 'debug', 'broken', 'not working', 'failing tests'.
- Clarify the skill's distinct approach or methodology to differentiate it from general debugging skills, e.g., 'Follows a structured root-cause analysis process before proposing any code changes'.
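Taken together, these suggestions might yield a description like the following sketch (this assumes the skill is defined in a SKILL.md file with YAML frontmatter; the skill name and exact wording are illustrative, not prescribed by this review):

```yaml
---
name: root-cause-debugging
# Hypothetical rewrite combining the three suggestions above:
# concrete actions, broader trigger terms, and a distinct methodology.
description: >
  Systematically diagnoses root causes of bugs, errors, exceptions,
  crashes, and failing tests by reading stack traces, reproducing the
  issue, and isolating the failing component. Follows a structured
  root-cause analysis process before proposing any fix. Use when code
  is broken, not working, behaving unexpectedly, or a test is failing.
---
```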
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description does not list any concrete actions or capabilities. It says nothing about what the skill actually does—no verbs like 'analyze', 'diagnose', 'trace', or 'inspect'. It only describes when to use it, not what it does. | 1 / 3 |
| Completeness | The description answers 'when' (encountering bugs, test failures, unexpected behavior) but completely omits 'what' the skill does. There is no indication of the actions or methodology the skill provides. | 1 / 3 |
| Trigger Term Quality | It includes some natural trigger terms like 'bug', 'test failure', and 'unexpected behavior' that users might mention. However, it misses common variations like 'error', 'crash', 'exception', 'broken', 'not working', 'debug', or 'failing tests'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The description is extremely broad—any debugging, troubleshooting, or error-handling skill could match this. Without specifying what approach or actions it takes, it would conflict with many other skills that deal with bugs or errors. | 1 / 3 |
| Total | | 5 / 12 (Passed) |
Implementation: 54%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill has excellent workflow structure with clear phases, gates, and escalation paths, and good progressive disclosure to supporting files. However, it is severely bloated with motivational content, rationalizations tables, and process philosophy that Claude doesn't need. Much of the content describes general debugging wisdom rather than providing concrete, executable techniques.
Suggestions:

- Cut the content by 50-60%: remove the 'Common Rationalizations' table, 'Red Flags' list, 'your human partner's Signals' section, 'Real-World Impact' statistics, and all motivational framing ('Random fixes waste time', 'Symptom fixes are failure').
- Replace abstract instructions like 'Read error messages carefully' and 'Form single hypothesis' with concrete examples showing specific tool commands (e.g., `git log --oneline -10`, `grep -rn 'error_code'`, specific debugger commands).
- Consolidate the 'When to Use' and 'Don't skip when' sections into a single brief trigger line, since the skill description already states when to use it.
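As a sketch of what the second suggestion means by concrete tool commands: the repository, file name, and error string `TimeoutError` below are all hypothetical, and a throwaway repo is created first so the commands run as written.

```shell
# Hypothetical diagnostic session illustrating concrete commands.
set -eu

# Sandbox setup so the example is self-contained (not part of the
# suggested skill content itself).
tmp=$(mktemp -d) && cd "$tmp"
git init -q
echo 'raise TimeoutError("connect timed out")' > app.py
git add app.py
git -c user.name=dev -c user.email=dev@example.com commit -qm 'add app'

# 1. What changed recently? Show the last 10 commits, one line each.
git log --oneline -10

# 2. Where does the observed error string appear in the codebase?
grep -rn 'TimeoutError' .
```

Instructions at this level of specificity give an agent something executable at each phase, instead of abstract advice like 'read error messages carefully'.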
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at roughly 250+ lines for what is essentially a debugging methodology Claude already knows. The 'Common Rationalizations' table, 'Red Flags' list, 'your human partner's Signals' section, and motivational statements ('Random fixes waste time') are all unnecessary padding. The 'Real-World Impact' section with made-up statistics adds no actionable value. | 1 / 3 |
| Actionability | The multi-component diagnostic example with bash commands is concrete and useful, but most of the skill is abstract process guidance ('Read error messages carefully', 'Form single hypothesis') rather than executable techniques. The phases describe what to do conceptually but lack specific tooling commands or code patterns for most steps. | 2 / 3 |
| Workflow Clarity | The four-phase workflow is clearly sequenced with explicit gates between phases ('MUST complete each phase before proceeding'), includes validation checkpoints (Phase 3 verify step, Phase 4 test-first requirement), and has a clear feedback loop for failed fixes, including an escalation path at 3+ failures. | 3 / 3 |
| Progressive Disclosure | Appropriately references supporting techniques in separate files (root-cause-tracing.md, defense-in-depth.md, condition-based-waiting.md) and related skills, all one level deep with clear signaling. The quick reference table provides a good overview for navigation. | 3 / 3 |
| Total | | 9 / 12 (Passed) |
Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Skill structure validation: 11 / 11 checks passed. No warnings or errors.