Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes
Impact: Pending (no eval scenarios have been run).
Status: Passed (no known issues).
Optimize this skill with Tessl:
`npx tessl skill review --optimize ./.opencode/skills/systematic-debugging/SKILL.md`

Quality
Discovery
14%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description only specifies when to use the skill but entirely omits what the skill actually does, making it incomplete and vague. The trigger terms cover some common scenarios but miss many natural variations. Without concrete actions or a clear niche, this skill would be difficult to distinguish from other debugging or troubleshooting skills.
Suggestions
Add concrete actions describing what the skill does, e.g., 'Systematically diagnoses root causes by analyzing stack traces, reproducing issues, and isolating failing components'.
Expand trigger terms to include common variations like 'error', 'exception', 'crash', 'debug', 'not working', 'broken', 'failing tests'.
Clarify the skill's distinct niche—what makes this debugging approach different from general troubleshooting? E.g., 'Applies a structured root-cause analysis methodology before proposing fixes'.
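Putting these suggestions together, an improved `SKILL.md` description might look like the sketch below. The frontmatter field names and the wording are illustrative assumptions, not content from the skill itself:

```yaml
# Hypothetical SKILL.md frontmatter; exact field names depend on the skill format in use
name: systematic-debugging
description: >
  Systematically diagnoses root causes before proposing fixes: reproduces the
  issue, reads stack traces, isolates the failing component, and verifies the
  fix with a failing-then-passing test. Use when encountering a bug, error,
  exception, crash, test failure, broken or not-working behavior, or any
  other unexpected behavior.
```

Note how this version states concrete actions (reproduce, isolate, verify), covers the expanded trigger vocabulary, and names the niche (root-cause analysis before fixes) in one description.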
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description does not list any concrete actions or capabilities. It says nothing about what the skill actually does—no verbs like 'analyze', 'diagnose', 'trace', or 'inspect'. It only describes when to use it, not what it does. | 1 / 3 |
| Completeness | The description answers 'when' (encountering bugs, test failures, unexpected behavior) but completely omits 'what' the skill does. There is no indication of the actions or methodology the skill provides. | 1 / 3 |
| Trigger Term Quality | It includes some natural trigger terms like 'bug', 'test failure', and 'unexpected behavior' that users might mention. However, it misses common variations like 'error', 'crash', 'exception', 'broken', 'not working', 'debug', or 'failing tests'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The description is extremely broad—any debugging, troubleshooting, or error-handling skill could match this. Without specifying what approach or actions it takes, it would conflict with many other skills that deal with bugs or errors. | 1 / 3 |
| Total | | 5 / 12 (Passed) |
Implementation
54%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill has excellent workflow structure with clear phases, gates, and escalation paths, and good progressive disclosure to supporting materials. However, it is significantly over-verbose, spending many tokens on motivational content, a rationalizations table, and advice Claude already knows (like 'read error messages' and 'don't skip past errors'). Actionability is moderate: the multi-component diagnostic example is strong, but most of the guidance remains abstract process description.
Suggestions
Cut motivational/persuasion content (Common Rationalizations table, Real-World Impact statistics, 'your human partner's Signals' section) — Claude doesn't need to be convinced to follow its own skill instructions.
Remove obvious advice like 'Read Error Messages Carefully' and 'Don't skip past errors' — Claude already knows this. Focus tokens on the non-obvious techniques like the multi-component diagnostic instrumentation pattern.
Add more concrete, executable examples similar to the bash diagnostic example — e.g., specific git commands for 'Check Recent Changes', specific patterns for creating minimal reproduction test cases.
Consolidate the Red Flags list and Common Rationalizations table into a single brief 'Anti-patterns' section with 3-4 key items instead of 20+ overlapping bullet points.
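As a concrete illustration of the third suggestion, a 'Check Recent Changes' step could ship with executable git commands like the sketch below. The repository setup here is synthetic, purely so the example is self-contained; in the real skill these inspection commands would run against the user's repository:

```shell
set -e

# Synthetic setup: a throwaway repo with a working commit and a regressing one.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email debug@example.com
git config user.name "debug"
echo "def add(a, b): return a + b" > calc.py
git add calc.py && git commit -qm "initial working version"
echo "def add(a, b): return a - b" > calc.py
git add calc.py && git commit -qm "refactor arithmetic"

# Check Recent Changes: which commits touched the failing file?
git log --oneline -5 -- calc.py

# What exactly changed in the most recent commit to that file?
git diff HEAD~1 HEAD -- calc.py
```

Running the two inspection commands immediately surfaces the suspect commit ("refactor arithmetic") and the exact line that changed, turning the abstract instruction into a repeatable procedure. For longer histories, `git bisect` extends the same idea.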
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose for a debugging methodology skill. Much of this content restates things Claude already knows (scientific method, 'read error messages carefully', 'don't skip past errors'). The 'Common Rationalizations' table, 'Red Flags' list, 'your human partner's Signals' section, and 'Real-World Impact' statistics are motivational padding rather than actionable guidance. The skill could be cut by 60%+ without losing useful information. | 1 / 3 |
| Actionability | The multi-component diagnostic instrumentation example with bash commands is concrete and useful. However, most of the skill is abstract process description ('read error messages carefully', 'form single hypothesis') rather than executable guidance. The phases describe what to do conceptually but lack specific tools, commands, or code patterns for most steps. | 2 / 3 |
| Workflow Clarity | The four-phase workflow is clearly sequenced with explicit gates between phases ('MUST complete each phase before proceeding'). Phase 4 includes validation checkpoints (create failing test, verify fix, stop-and-reassess after 3 failures). The escalation path from failed fixes to architectural questioning is a well-defined feedback loop. | 3 / 3 |
| Progressive Disclosure | References to supporting techniques (root-cause-tracing.md, defense-in-depth.md, condition-based-waiting.md) and related skills (test-driven-development, verification-before-completion) are clearly signaled and one level deep. The main content serves as an overview with appropriate pointers to detailed materials. | 3 / 3 |
| Total | | 9 / 12 (Passed) |
Validation
100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
11 / 11 checks passed.
Validation for skill structure: no warnings or errors.