systematic-debugging

Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes

Install with Tessl CLI

npx tessl i github:obra/superpowers --skill systematic-debugging

Overall score

64%

Review — 50%

Does it follow best practices?

If you maintain this skill, you can automatically optimize it with the Tessl CLI to improve its score:

npx tessl skill review --optimize ./path/to/skill

Validation — 14 / 16 Passed

Validation for skill structure

Discovery

15%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description fails to explain what the skill actually does, focusing only on when to use it. The trigger conditions are overly broad ('any bug, test failure, or unexpected behavior'), making the skill likely to conflict with other debugging-related skills. Without knowing the skill's concrete actions, Claude cannot make an informed decision about when to select it.

Suggestions

Add concrete actions describing what the skill does (e.g., 'Systematically diagnoses root causes by analyzing stack traces, reproducing issues, and isolating variables').

Narrow the scope to reduce conflict risk: specify what type of debugging approach this represents (e.g., 'binary search debugging', 'log analysis', 'hypothesis-driven investigation').

Expand trigger terms with natural variations users would say: 'error', 'crash', 'not working', 'broken', 'failing tests', 'exception'.
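
As a rough illustration of the last suggestion, the Python sketch below checks which of the suggested trigger terms already appear in the skill's current description. The helper name `missing_trigger_terms` and the plain substring match are assumptions made for this example; this is not how the Tessl reviewer scores descriptions.

```python
# Hypothetical helper for illustration only; not part of the Tessl CLI or the skill.

# The skill's current description, as shown at the top of this page.
DESCRIPTION = (
    "Use when encountering any bug, test failure, or unexpected behavior, "
    "before proposing fixes"
)

# Trigger terms suggested by the review.
SUGGESTED_TERMS = [
    "error",
    "crash",
    "not working",
    "broken",
    "failing tests",
    "exception",
]


def missing_trigger_terms(description, terms):
    """Return the terms that do not occur in the description.

    Uses a case-insensitive substring match, which is a simplification:
    it would also count a term that appears inside a longer word.
    """
    lowered = description.lower()
    return [term for term in terms if term.lower() not in lowered]


if __name__ == "__main__":
    for term in missing_trigger_terms(DESCRIPTION, SUGGESTED_TERMS):
        print(term)
```

Running the same check against a revised description gives a quick view of which suggested terms are still absent.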

Dimension	Reasoning	Score
Specificity	The description uses vague language like 'any bug, test failure, or unexpected behavior' without describing concrete actions. It doesn't specify what the skill actually does - only when to use it.	1 / 3
Completeness	The description only addresses 'when' to use the skill but completely omits 'what' the skill does. There's no indication of the actual capabilities or actions performed.	1 / 3
Trigger Term Quality	Contains some natural keywords users might say ('bug', 'test failure', 'unexpected behavior'), but these are fairly generic debugging terms that could apply to many contexts. Missing specific variations like 'error', 'crash', 'failing tests', 'broken'.	2 / 3
Distinctiveness Conflict Risk	Extremely generic scope covering 'any bug' or 'unexpected behavior' would conflict with virtually any debugging, testing, or troubleshooting skill. No clear niche is established.	1 / 3
	Total	5 / 12 Passed

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, well-structured debugging skill with excellent workflow clarity and actionability. The four-phase approach with explicit validation checkpoints and the '3+ fixes = architectural problem' heuristic are particularly valuable. Minor verbosity in the red flags and rationalizations sections could be consolidated without losing clarity.

Dimension	Reasoning	Score
Conciseness	The skill is comprehensive but includes some redundancy (multiple tables restating similar concepts, repeated 'STOP' warnings). The rationalization table and red flags section overlap significantly. Could be tightened while preserving clarity.	2 / 3
Actionability	Provides concrete, executable guidance with specific bash examples for diagnostic instrumentation, clear phase-by-phase instructions, and explicit decision criteria (e.g., '≥3 fixes = question architecture'). The multi-layer debugging example is copy-paste ready.	3 / 3
Workflow Clarity	Excellent multi-step workflow with explicit phases, a clear success criteria table, validation checkpoints ('MUST complete each phase before proceeding'), and feedback loops ('Didn't work? Form NEW hypothesis'). The 3+ fixes threshold provides a clear escalation path.	3 / 3
Progressive Disclosure	Well-structured with clear overview, phases broken into digestible sections, and appropriate references to supporting techniques (root-cause-tracing.md, defense-in-depth.md) and related skills. Navigation is straightforward with one-level-deep references.	3 / 3
	Total	11 / 12 Passed

Validation

88%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Criteria	Description	Result
metadata_version	'metadata' field is not a dictionary	Warning
license_field	'license' field is missing	Warning

	Total	14 / 16 Passed
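
Both warnings above concern fields in the skill's frontmatter. The Python sketch below shows what equivalent checks could look like once the frontmatter has been parsed into a dictionary. The function `check_frontmatter`, the choice to warn only when `metadata` is present but not a dictionary, and all sample values (including the license string) are assumptions for illustration; the actual Tessl validator may behave differently.

```python
# Hypothetical checks mirroring the two warnings in the table above.
# This is not the Tessl validator; names and sample values are illustrative.

def check_frontmatter(frontmatter):
    """Return a list of warning strings for an already-parsed frontmatter dict."""
    warnings = []

    # Assumption: warn only when 'metadata' is present but is not a dictionary.
    if "metadata" in frontmatter and not isinstance(frontmatter["metadata"], dict):
        warnings.append("metadata_version: 'metadata' field is not a dictionary")

    # Warn when no 'license' key is present at all.
    if "license" not in frontmatter:
        warnings.append("license_field: 'license' field is missing")

    return warnings


if __name__ == "__main__":
    # Sample frontmatter that would trigger both warnings (values are made up).
    flagged = {"name": "systematic-debugging", "metadata": "1.0"}
    # Sample frontmatter that would pass both checks (values are made up).
    clean = {
        "name": "systematic-debugging",
        "metadata": {"version": "1.0"},
        "license": "EXAMPLE-LICENSE",
    }
    print(check_frontmatter(flagged))
    print(check_frontmatter(clean))
```

Under these assumptions, turning `metadata` into a key-value mapping and adding a `license` field would clear both warnings.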

Reviewed: about 1 month ago
