
systematic-debugging

Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes

38

Quality

35%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/systematic-debugging/SKILL.md

Quality

Content

47%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill has a well-structured four-phase workflow with clear sequencing and escalation paths, which is its strongest aspect. However, it is significantly over-verbose, repeating the core message ('don't guess, find root cause') in at least five different sections (Iron Law, Red Flags, Rationalizations, Partner Signals, Common Rationalizations). Most guidance is abstract process description rather than concrete, executable techniques, with the multi-component diagnostic bash example being a notable exception.

Suggestions

Cut the Red Flags, Common Rationalizations, and Partner Signals sections — they all repeat the same message. A single 3-line 'Anti-patterns' note would suffice.

Remove motivational/philosophical content ('Random fixes waste time', 'Symptom fixes are failure', Real-World Impact statistics) — Claude doesn't need persuading, it needs instructions.

Add concrete executable examples for Phase 2 (e.g., a git diff command sequence) and Phase 3 (e.g., a minimal test change pattern), not just the Phase 1 bash example.

Move the 'When to Use' and 'When Process Reveals No Root Cause' sections to a supporting file to keep the main skill lean and focused on the four phases.
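The third suggestion names a git diff command sequence without showing one. As a rough sketch of what such a Phase 2 example might look like, the script below builds a throwaway repository with one known-good commit and one broken commit, then narrows down the change between them. The repository, the file name `price.rb`, and its contents are hypothetical and do not come from the reviewed skill; only standard git subcommands are used.

```shell
#!/bin/sh
# Hypothetical sketch: compare the last known-good commit with the broken one.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
# Identity is set locally so the commits work on a machine with no git config.
git config user.name "Example"
git config user.email "example@example.com"

# Commit 1: known-good version of an invented file.
printf 'def total(price, qty)\n  price * qty\nend\n' > price.rb
git add price.rb
git commit -q -m "Add total"

# Commit 2: the change that introduced the bug.
printf 'def total(price, qty)\n  price + qty\nend\n' > price.rb
git commit -q -am "Refactor total"

# Step 1: list recent commits to pick the good and bad endpoints.
git log --oneline -n 5

# Step 2: see which files changed between the endpoints.
changed_files=$(git diff --name-only HEAD~1 HEAD)
echo "$changed_files"

# Step 3: read the line-level change in the suspect file.
diff_output=$(git diff HEAD~1 HEAD -- price.rb)
echo "$diff_output"
```

In a real repository only the three numbered steps would be needed, with `HEAD~1` replaced by whichever commit is known to work.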

Dimension | Reasoning | Score

Conciseness

Extremely verbose for what it teaches. Much of the content is motivational/philosophical ('Random fixes waste time', 'Symptom fixes are failure'), repeats the same point many times (red flags, rationalizations, common excuses all say 'stop guessing'), and includes things Claude already knows (scientific method, 'read error messages carefully'). The rationalizations table, red flags section, and 'your human partner's signals' section are largely redundant with each other. Could be cut by 60%+ without losing actionable content.

1 / 3

Actionability

The multi-component diagnostic example with bash commands is concrete and useful. However, most of the skill is abstract process description ('Form Single Hypothesis', 'State clearly', 'Be specific') rather than executable guidance. The phases describe what to do conceptually but lack concrete code examples for most steps. The bash diagnostic example is the one bright spot.

2 / 3

Workflow Clarity

The four-phase workflow is clearly sequenced with explicit gates between phases ('MUST complete each phase before proceeding'). Phase 4 includes a concrete escalation path (3+ fixes → question architecture), feedback loops (hypothesis fails → new hypothesis → return to Phase 1), and validation checkpoints (verify before continuing, create failing test). The quick reference table provides a clear summary.

3 / 3

Progressive Disclosure

References to supporting files (root-cause-tracing.md, defense-in-depth.md, condition-based-waiting.md) and related skills are well-signaled at the bottom. However, the main SKILL.md itself is monolithic — the rationalizations table, red flags, and partner signals sections could be in a separate reference file. No bundle files were provided to verify referenced paths exist.

2 / 3

Total

8 / 12

Passed

Description

22%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description only addresses when to use the skill but entirely omits what the skill does, making it fundamentally incomplete. While it includes some useful trigger terms (bug, test failure, unexpected behavior), the lack of any concrete actions or capabilities means Claude cannot make an informed decision about whether this skill is appropriate. The description reads more like a usage note than a proper skill description.

Suggestions

Add concrete actions describing what the skill does, e.g., 'Systematically diagnoses root causes by analyzing error messages, tracing code paths, and reproducing issues' before the 'Use when...' clause.

Expand trigger terms to include common variations like 'error', 'crash', 'debug', 'not working', 'broken', 'exception', 'stack trace'.

Restructure to follow the pattern: '[What it does]. Use when [triggers].' to ensure both halves are clearly addressed.
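The suggestions above can be checked mechanically. The sketch below reports which of the suggested trigger terms a description lacks, and runs that check on the current description and on one possible rewrite. The `missing_terms` helper and the candidate description are hypothetical: the candidate is assembled from the wording of the suggestions, not taken from the skill.

```shell
#!/bin/sh
# Hypothetical check: which suggested trigger terms does a description lack?

# Prints each missing term on its own line; prints nothing if all are present.
missing_terms() {
  description=$1
  for term in "error" "crash" "debug" "not working" "broken" "exception" "stack trace"; do
    # -i: case-insensitive, -F: match the term literally, -q: no output.
    if ! printf '%s\n' "$description" | grep -qiF -e "$term"; then
      printf '%s\n' "$term"
    fi
  done
}

# The description as it currently appears on this page.
current="Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes"

# An invented rewrite following '[What it does]. Use when [triggers].'
candidate="Systematically diagnoses root causes by analyzing error messages and stack traces, tracing code paths, and reproducing issues. Use when encountering a bug, crash, exception, test failure, or unexpected behavior, before proposing fixes."

missing_current=$(missing_terms "$current")
missing_candidate=$(missing_terms "$candidate")

echo "Missing from current description:"
echo "$missing_current"
echo "Missing from candidate description:"
echo "$missing_candidate"
```

Literal substring matching is a deliberate simplification: it treats "stack traces" as covering "stack trace" but does not recognize synonyms, so the output is a prompt for review and not a score.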

Dimension | Reasoning | Score

Specificity

The description does not list any concrete actions or capabilities. It says nothing about what the skill actually does — no verbs like 'analyze', 'diagnose', 'trace', etc. It only describes when to use it.

1 / 3

Completeness

The description answers 'when' (before proposing fixes for bugs/failures) but completely omits 'what' — there is no indication of what the skill actually does. This is the inverse of the typical problem but equally incomplete.

1 / 3

Trigger Term Quality

It includes some natural trigger terms like 'bug', 'test failure', and 'unexpected behavior' that users might naturally mention. However, it misses common variations like 'error', 'crash', 'broken', 'not working', 'debug', 'stack trace', etc.

2 / 3

Distinctiveness Conflict Risk

The trigger terms 'bug', 'test failure', and 'unexpected behavior' are fairly broad and could overlap with debugging, testing, or error-handling skills. The 'before proposing fixes' qualifier adds some distinctiveness but without knowing what the skill does, it's hard to differentiate it clearly.

2 / 3

Total

6 / 12

Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository
lucianghinda/superpowers-ruby
Reviewed
