Skill description under review: "Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes"

Overall score: 61%
Impact: Pending. No eval scenarios have been run.

Quality (does it follow best practices?): Passed. No known issues.
Discovery: 14%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description only specifies when to use the skill but entirely omits what the skill actually does, making it incomplete and vague. The trigger terms cover some common scenarios but miss many natural variations. Without concrete actions or a clear niche, this skill would be difficult to distinguish from other debugging or troubleshooting skills.
Suggestions:

- Add concrete actions describing what the skill does, e.g., 'Systematically diagnoses root causes by analyzing stack traces, reproducing issues, and isolating failing components'.
- Expand trigger terms to include common variations like 'error', 'exception', 'crash', 'debug', 'broken', 'not working', 'failing tests'.
- Clarify the skill's distinct approach or methodology to differentiate it from general debugging skills, e.g., 'Follows a structured root-cause analysis process before proposing any code changes'.
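Taken together, these suggestions might yield a description like the following sketch (this assumes the skill is defined in a SKILL.md file with YAML frontmatter; the skill name and exact wording are illustrative, not prescribed by this review):

```yaml
---
name: root-cause-debugging
# Hypothetical rewrite combining the three suggestions above:
# concrete actions, broader trigger terms, and a distinct methodology.
description: >
  Systematically diagnoses root causes of bugs, errors, exceptions,
  crashes, and failing tests by reading stack traces, reproducing the
  issue, and isolating the failing component. Follows a structured
  root-cause analysis process before proposing any fix. Use when code
  is broken, not working, behaving unexpectedly, or a test is failing.
---
```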
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description does not list any concrete actions or capabilities. It says nothing about what the skill actually does—no verbs like 'analyze', 'diagnose', 'trace', or 'inspect'. It only describes when to use it, not what it does. | 1 / 3 |
| Completeness | The description answers 'when' (encountering bugs, test failures, unexpected behavior) but completely omits 'what' the skill does. There is no indication of the actions or methodology the skill provides. | 1 / 3 |
| Trigger Term Quality | It includes some natural trigger terms like 'bug', 'test failure', and 'unexpected behavior' that users might mention. However, it misses common variations like 'error', 'crash', 'exception', 'broken', 'not working', 'debug', or 'failing tests'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The description is extremely broad—any debugging, troubleshooting, or error-handling skill could match this. Without specifying what approach or actions it takes, it would conflict with many other skills that deal with bugs or errors. | 1 / 3 |
| Total | | 5 / 12 (Passed) |
Implementation: 54%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill has excellent workflow structure with clear phases, gates, and escalation paths, and good progressive disclosure to supporting files. However, it is severely bloated with motivational content, rationalizations tables, and process philosophy that Claude doesn't need. Much of the content describes general debugging wisdom rather than providing concrete, executable techniques.
Suggestions:

- Cut the content by 50-60%: remove the 'Common Rationalizations' table, 'Red Flags' list, 'your human partner's Signals' section, 'Real-World Impact' statistics, and all motivational framing ('Random fixes waste time', 'Symptom fixes are failure').
- Replace abstract instructions like 'Read error messages carefully' and 'Form single hypothesis' with concrete examples showing specific tool commands (e.g., `git log --oneline -10`, `grep -rn 'error_code'`, specific debugger commands).
- Consolidate the 'When to Use' and 'Don't skip when' sections into a single brief trigger line, since the skill description already states when to use it.
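As a sketch of what the second suggestion means by concrete tool commands: the repository, file name, and error string `TimeoutError` below are all hypothetical, and a throwaway repo is created first so the commands run as written.

```shell
# Hypothetical diagnostic session illustrating concrete commands.
set -eu

# Sandbox setup so the example is self-contained (not part of the
# suggested skill content itself).
tmp=$(mktemp -d) && cd "$tmp"
git init -q
echo 'raise TimeoutError("connect timed out")' > app.py
git add app.py
git -c user.name=dev -c user.email=dev@example.com commit -qm 'add app'

# 1. What changed recently? Show the last 10 commits, one line each.
git log --oneline -10

# 2. Where does the observed error string appear in the codebase?
grep -rn 'TimeoutError' .
```

Instructions at this level of specificity give an agent something executable at each phase, instead of abstract advice like 'read error messages carefully'.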
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at roughly 250+ lines for what is essentially a debugging methodology Claude already knows. The 'Common Rationalizations' table, 'Red Flags' list, 'your human partner's Signals' section, and motivational statements ('Random fixes waste time') are all unnecessary padding. The 'Real-World Impact' section with made-up statistics adds no actionable value. | 1 / 3 |
| Actionability | The multi-component diagnostic example with bash commands is concrete and useful, but most of the skill is abstract process guidance ('Read error messages carefully', 'Form single hypothesis') rather than executable techniques. The phases describe what to do conceptually but lack specific tooling commands or code patterns for most steps. | 2 / 3 |
| Workflow Clarity | The four-phase workflow is clearly sequenced with explicit gates between phases ('MUST complete each phase before proceeding'), includes validation checkpoints (Phase 3 verify step, Phase 4 test-first requirement), and has a clear feedback loop for failed fixes, including an escalation path at 3+ failures. | 3 / 3 |
| Progressive Disclosure | Appropriately references supporting techniques in separate files (root-cause-tracing.md, defense-in-depth.md, condition-based-waiting.md) and related skills, all one level deep with clear signaling. The quick reference table provides a good overview for navigation. | 3 / 3 |
| Total | | 9 / 12 (Passed) |
Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Skill structure validation: 11 / 11 checks passed. No warnings or errors.