Enforces systematic root-cause investigation before any fix attempt, using a four-phase process (investigate, analyze patterns, hypothesize, implement). Use when encountering bugs, test failures, build errors, unexpected behavior, performance problems, or integration issues — especially when under pressure, when a "quick fix" seems obvious, or when previous fix attempts have failed. DO NOT TRIGGER for new feature development or refactoring without a known defect.
- Impact: Pending (no eval scenarios have been run)
- Validation: Passed (no known issues)
Optimize this skill with Tessl: `npx tessl skill review --optimize ./skills/debugging/SKILL.md`

## Quality

### Discovery

Score: 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly communicates a specific debugging methodology, provides comprehensive trigger terms covering common debugging scenarios, and explicitly defines both inclusion and exclusion criteria. The four-phase process naming adds specificity, and the 'DO NOT TRIGGER' clause is a strong differentiator that reduces conflict risk with other development-related skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions via the four-phase process (investigate, analyze patterns, hypothesize, implement) and clearly describes the methodology as 'systematic root-cause investigation before any fix attempt.' | 3 / 3 |
| Completeness | Clearly answers both 'what' (enforces systematic root-cause investigation using a four-phase process) and 'when' (explicit 'Use when...' clause with multiple trigger scenarios, plus a 'DO NOT TRIGGER' exclusion clause for additional clarity). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would encounter: 'bugs', 'test failures', 'build errors', 'unexpected behavior', 'performance problems', 'integration issues', 'quick fix', 'previous fix attempts have failed.' These are terms users naturally use when describing debugging scenarios. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive: focuses specifically on root-cause debugging methodology rather than general coding or fixing. The explicit exclusion of 'new feature development or refactoring without a known defect' further reduces conflict risk with other skills. | 3 / 3 |
| **Total** | | **12 / 12 (Passed)** |
### Implementation

Score: 54%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill has excellent workflow structure with clear phases, gates, validation checkpoints, and escalation paths. However, it is significantly overlong, repeating the same 'don't skip the process' message in at least five different ways (When to Use, Red Flags, Common Rationalizations, Human Partner Signals, Quick Reference). Actionability is moderate: the multi-component diagnostic example is strong, but most phases offer general checklists rather than concrete, executable techniques.
#### Suggestions

- Cut the 'Common Rationalizations' table, 'Red Flags' list, and 'Human Partner Signals' section: they all say the same thing ('follow the process') and waste ~60 lines of context window. A single 2-line reminder suffices.
- Collapse the 'When to Use' and 'Don't skip when' sections into a single brief trigger list; the description metadata already covers activation criteria.
- Add more concrete, executable examples to Phases 2 and 3 (e.g., specific git commands for checking recent changes, or a template for writing down hypotheses in code comments or a scratch file).
- Remove the 'Real-World Impact' section: Claude doesn't need unverifiable statistics for execution.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~250+ lines. Extensive sections on rationalizations, red flags, 'human partner signals,' and motivational content ('systematic is faster than thrashing') are things Claude already knows or doesn't need repeated multiple ways. The 'When to Use' and 'Don't skip when' sections largely restate the same point, and the 'Common Rationalizations' table and 'Red Flags' section overlap significantly. | 1 / 3 |
| Actionability | The four-phase process provides a structured approach with some concrete examples (the multi-layer diagnostic bash script is good), but most guidance is procedural or philosophical rather than executable. Phases 2-4 are largely checklists of general advice ('Find working examples,' 'Form single hypothesis') rather than concrete, copy-paste-ready commands or code patterns. | 2 / 3 |
| Workflow Clarity | The four-phase workflow is clearly sequenced with explicit gates between phases ('MUST complete each phase before proceeding'). Phase 4 includes validation steps (create failing test, verify fix, escalation after 3 failures), and there are clear feedback loops (return to Phase 1 if hypothesis fails, question architecture after 3+ failures). | 3 / 3 |
| Progressive Disclosure | References to supporting techniques (root-cause-tracing.md, defense-in-depth.md, condition-based-waiting.md) and related skills (kit:tdd, kit:verify) are clearly signaled and one level deep. The quick reference table provides a good overview, and detailed content is appropriately organized into phases. | 3 / 3 |
| **Total** | | **9 / 12 (Passed)** |
### Validation

Score: 100% (11 / 11 checks passed)

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation for skill structure: no warnings or errors.