Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes
Install with Tessl CLI
npx tessl i github:obra/superpowers --skill systematic-debugging
Overall score: 64%
Does it follow best practices?
If you maintain this skill, you can automatically optimize it using the tessl CLI to improve its score:
npx tessl skill review --optimize ./path/to/skill
Discovery: 15%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description fails to explain what the skill actually does, focusing only on when to use it. The trigger conditions are overly broad ('any bug, test failure, or unexpected behavior'), making the skill likely to conflict with other debugging-related skills. Without knowing the skill's concrete actions, Claude cannot make an informed decision about when to select it.
Suggestions
Add concrete actions describing what the skill does (e.g., 'Systematically diagnoses root causes by analyzing stack traces, reproducing issues, and isolating variables').
Narrow the scope to reduce conflict risk: specify what type of debugging approach this represents (e.g., 'binary search debugging', 'log analysis', 'hypothesis-driven investigation').
Expand trigger terms with natural variations users would say: 'error', 'crash', 'not working', 'broken', 'failing tests', 'exception'.
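As a rough illustration of the third suggestion, the sketch below reports which of the suggested trigger terms a description lacks. The helper name `missing_trigger_terms` and the revised description are hypothetical, and the naive substring match is an assumption for illustration only; it is not how Tessl scores discovery.

```python
# Hypothetical helper: list the suggested trigger terms a description lacks.
# The term list comes from the suggestion above; the matching rule is an assumption.
SUGGESTED_TERMS = ["error", "crash", "not working", "broken", "failing tests", "exception"]


def missing_trigger_terms(description, terms=SUGGESTED_TERMS):
    """Return the terms that do not occur in the description (case-insensitive substring match)."""
    text = description.lower()
    return [term for term in terms if term not in text]


# The skill's current description, as quoted at the top of this page.
current = (
    "Use when encountering any bug, test failure, or unexpected behavior, "
    "before proposing fixes"
)

# A hypothetical revision that adds concrete actions and more trigger terms.
revised = (
    "Systematically diagnoses root causes by analyzing stack traces, reproducing "
    "issues, and isolating variables. Use when encountering an error, crash, "
    "exception, failing tests, or broken behavior, before proposing fixes."
)

print(missing_trigger_terms(current))
print(missing_trigger_terms(revised))
```

Substring matching is deliberately simple here; a real check would likely need word boundaries and synonyms.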
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague language like 'any bug, test failure, or unexpected behavior' without describing concrete actions. It doesn't specify what the skill actually does - only when to use it. | 1 / 3 |
| Completeness | The description only addresses 'when' to use the skill but completely omits 'what' the skill does. There's no indication of the actual capabilities or actions performed. | 1 / 3 |
| Trigger Term Quality | Contains some natural keywords users might say ('bug', 'test failure', 'unexpected behavior'), but these are fairly generic debugging terms that could apply to many contexts. Missing specific variations like 'error', 'crash', 'failing tests', 'broken'. | 2 / 3 |
| Distinctiveness Conflict Risk | Extremely generic scope covering 'any bug' or 'unexpected behavior' would conflict with virtually any debugging, testing, or troubleshooting skill. No clear niche is established. | 1 / 3 |
| Total | | 5 / 12 Passed |
Implementation: 85%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, well-structured debugging skill with excellent workflow clarity and actionability. The four-phase approach with explicit validation checkpoints and the '3+ fixes = architectural problem' heuristic are particularly valuable. Minor verbosity in the red flags and rationalizations sections could be consolidated without losing clarity.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is comprehensive but includes some redundancy (multiple tables restating similar concepts, repeated 'STOP' warnings). The rationalization table and red flags section overlap significantly. Could be tightened while preserving clarity. | 2 / 3 |
| Actionability | Provides concrete, executable guidance with specific bash examples for diagnostic instrumentation, clear phase-by-phase instructions, and explicit decision criteria (e.g., '≥3 fixes = question architecture'). The multi-layer debugging example is copy-paste ready. | 3 / 3 |
| Workflow Clarity | Excellent multi-step workflow with explicit phases, clear success criteria table, validation checkpoints ('MUST complete each phase before proceeding'), and feedback loops ('Didn't work? Form NEW hypothesis'). The 3+ fixes threshold provides clear escalation path. | 3 / 3 |
| Progressive Disclosure | Well-structured with clear overview, phases broken into digestible sections, and appropriate references to supporting techniques (root-cause-tracing.md, defense-in-depth.md) and related skills. Navigation is straightforward with one-level-deep references. | 3 / 3 |
| Total | | 11 / 12 Passed |
Validation: 88%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 14 / 16 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata' field is not a dictionary | Warning |
| license_field | 'license' field is missing | Warning |
| Total | 14 / 16 Passed | |
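The two warnings above can be illustrated with a minimal sketch, assuming the skill's frontmatter has already been parsed into a Python dictionary. The function `check_frontmatter` is hypothetical and is not Tessl's validator; treating a missing `metadata` field the same as a non-dictionary one is also an assumption, and the `license` and `metadata` values shown are placeholders rather than the skill's real values.

```python
# Hypothetical re-creation of the two warnings in the table above.
# Assumption: frontmatter is already parsed into a dict; this is not Tessl's code.
def check_frontmatter(frontmatter):
    """Return the names of the checks that would raise a warning."""
    warnings = []
    # 'metadata' should be a dictionary (a missing field is treated as a warning too).
    if not isinstance(frontmatter.get("metadata"), dict):
        warnings.append("metadata_version")
    # 'license' should be present and non-empty.
    if not frontmatter.get("license"):
        warnings.append("license_field")
    return warnings


# Frontmatter with only a name and description triggers both warnings.
before = {
    "name": "systematic-debugging",
    "description": "Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes",
}

# Placeholder values for illustration only.
after = dict(before, license="PLACEHOLDER-LICENSE", metadata={"version": "0.0.0"})

print(check_frontmatter(before))
print(check_frontmatter(after))
```

Under these assumptions, adding a `license` field and making `metadata` a dictionary would clear both warnings.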
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.