debugging-strategies

Master systematic debugging techniques, profiling tools, and root cause analysis to efficiently track down bugs across any codebase or technology stack. Use when investigating bugs, performance issues, or unexpected behavior.

0.98x

Quality

—

Does it follow best practices?

Impact

84%

0.98x

Average score across 6 eval scenarios

Security

Passed

No known issues

Quality

Content

27%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The content is an encyclopedic debugging reference packed with executable examples but is far too verbose and re-teaches concepts Claude already knows. It is a monolithic wall of text that should be split into reference files with a lean overview.

Suggestions

Cut the conceptual padding Claude already knows (Scientific Method, Rubber Duck Debugging, the 'Quick Debugging Checklist' of obvious items) and keep only the non-obvious, actionable guidance to reach a leaner token budget.

Split the per-language tooling (JS/TS, Python, Go) and advanced techniques into separate reference files (e.g. references/tools-js.md, references/tools-python.md) and keep SKILL.md as a concise overview with one-level-deep links, improving progressive disclosure.

Replace the open-question checklists inside each process phase with concrete steps and explicit validation/feedback checkpoints (e.g. 'reproduce fails -> broaden reproduction conditions -> retry') to lift workflow clarity.
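The feedback loop named in the last suggestion ('reproduce fails -> broaden reproduction conditions -> retry') could be sketched roughly as follows. This is only an illustration of the pattern, not code from the reviewed skill: `Conditions`, `reproduce_with_feedback`, and the `workers` setting are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Conditions:
    """Hypothetical description of one reproduction attempt."""
    label: str
    settings: dict


def reproduce_with_feedback(
    attempt: Callable[[Conditions], bool],
    candidates: Iterable[Conditions],
) -> Optional[Conditions]:
    """Try progressively broader conditions until the bug reproduces.

    `attempt` returns True when the bug was observed under the given
    conditions. Returns the first reproducing conditions, or None when
    every candidate failed (the cue to go back and gather more information).
    """
    for conditions in candidates:
        if attempt(conditions):
            return conditions  # checkpoint passed: proceed to the next phase
        # checkpoint failed: fall through to the next, broader candidate
    return None


if __name__ == "__main__":
    # Illustrative stand-in: the bug only appears with more than one worker.
    def fake_attempt(c: Conditions) -> bool:
        return c.settings.get("workers", 1) > 1

    ladder = [
        Conditions("minimal", {"workers": 1}),
        Conditions("production-like", {"workers": 4}),
    ]
    found = reproduce_with_feedback(fake_attempt, ladder)
    print(found.label if found else "not reproduced")
```

The point of the sketch is that each phase ends in an explicit pass/fail checkpoint with a defined next action, which is what the Workflow Clarity dimension asks for in place of open questions.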

Dimension	Reasoning	Score
Conciseness	The body re-explains concepts Claude already knows (the Scientific Method, Rubber Duck Debugging, 'Reproduce consistently', basic console.log/pdb usage, git bisect, and an obvious checklist of 'Spelling errors', 'Null/undefined values', 'Array index off-by-one'), padded across ~528 lines, matching the score-1 anchor of verbosity explaining concepts Claude knows.	1 / 3
Actionability	It includes genuinely executable code for JS/TS, Python, and Go plus real commands (git bisect, dlv, pprof), but large portions are abstract markdown question-lists ('What could be causing it?', 'What's the actual behavior?') rather than instruction, placing it at the score-2 'some concrete guidance but incomplete' anchor.	2 / 3
Workflow Clarity	A 4-phase 'Systematic Debugging Process' (Reproduce, Gather Information, Form Hypothesis, Test & Verify) provides a sequence, but each phase is a list of open questions rather than steps with explicit checkpoints or feedback loops, matching the score-2 'sequence present but checkpoints missing' anchor.	2 / 3
Progressive Disclosure	No bundle files exist and the entire 528-line body is a single monolithic file with all per-language tooling and advanced techniques inline, far exceeding the under-50-line threshold that would excuse the absence of references; this matches the score-1 'monolithic wall of text' anchor.	1 / 3
	Total	6 / 12 Passed

Description

77%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is strong on completeness and specificity, with a clear Use-when trigger and concrete capabilities. It is held back by generic breadth ('any codebase or technology stack') and limited trigger-term variation.

Dimension	Reasoning	Score
Specificity	Lists multiple concrete actions: 'Master systematic debugging techniques, profiling tools, and root cause analysis to efficiently track down bugs', matching the score-3 anchor of multiple specific concrete actions.	3 / 3
Completeness	It explicitly answers both what ('Master systematic debugging techniques, profiling tools, and root cause analysis...') and when via an explicit 'Use when investigating bugs, performance issues, or unexpected behavior' clause, matching the score-3 anchor.	3 / 3
Trigger Term Quality	'investigating bugs, performance issues, or unexpected behavior' are natural terms, but coverage is thin and misses common variations like error, crash, stack trace, or memory leak that users would say, so it sits at the score-2 'some relevant keywords but missing common variations' anchor.	2 / 3
Distinctiveness Conflict Risk	The phrase 'across any codebase or technology stack' is broad, and debugging is a universal activity that could overlap with many code-oriented skills, so it lands at the score-2 'somewhat specific but could still overlap' anchor rather than a clearly distinct niche.	2 / 3
	Total	10 / 12 Passed
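The Trigger Term Quality finding can be checked mechanically. The sketch below does a plain case-insensitive substring search of the skill's description for the four variations the review names; `missing_terms` is a hypothetical helper, and substring matching is an assumption that may differ from how the reviewer actually scores this dimension.

```python
# The skill description as quoted at the top of this review.
DESCRIPTION = (
    "Master systematic debugging techniques, profiling tools, and root cause "
    "analysis to efficiently track down bugs across any codebase or technology "
    "stack. Use when investigating bugs, performance issues, or unexpected "
    "behavior."
)

# Variations the review lists as missing from the description.
CANDIDATE_TERMS = ["error", "crash", "stack trace", "memory leak"]


def missing_terms(description: str, terms: list[str]) -> list[str]:
    """Return the terms that do not occur in the description (case-insensitive)."""
    text = description.lower()
    return [term for term in terms if term.lower() not in text]


print(missing_terms(DESCRIPTION, CANDIDATE_TERMS))
# → ['error', 'crash', 'stack trace', 'memory leak']
```

All four terms come back as missing, which matches the reviewer's observation that coverage is thin.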

Validation

93%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 15 / 16 Passed

Validation for skill structure

Criteria	Description	Result
skill_md_line_count	SKILL.md is long (528 lines); consider splitting into references/ and linking	Warning

	Total	15 / 16 Passed
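A check like skill_md_line_count could be approximated locally before submitting a skill. The sketch below is an assumption-laden illustration: `check_line_count` is a hypothetical helper, and the limit of 500 lines in the usage example is a placeholder, since the review does not state the validator's actual threshold.

```python
def check_line_count(text: str, max_lines: int) -> str:
    """Return "Warning" when the text has more than max_lines lines, else "Passed"."""
    count = len(text.splitlines())
    return "Warning" if count > max_lines else "Passed"


if __name__ == "__main__":
    # Stand-in for a 528-line SKILL.md; the 500-line limit is an assumed value.
    sample = "line\n" * 528
    print(check_line_count(sample, max_lines=500))
```

To run it against a real file, read the file's contents and pass them in as `text`, with whatever limit the validator actually enforces.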

Repository: wshobson/agents
Commit: 5cc2549

Reviewed: about 2 hours ago
