Rosetta debugging skill for errors, test failures, and unexpected behavior. Use proactively when encountering any issue. Ensures root cause investigation before attempting fixes.
Overall: 64% — Passed
No eval scenarios have been run · No known issues
Optimize this skill with Tessl:
npx tessl skill review --optimize ./instructions/r2/core/skills/debugging/SKILL.md

Quality
Discovery — 67%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description provides a reasonable overview of a debugging skill with an explicit 'Use when' clause, which is good for completeness. However, it lacks specific concrete actions (what debugging techniques it applies) and uses overly broad trigger language ('any issue') that could cause conflicts with other skills. The trigger terms cover common but not comprehensive debugging-related vocabulary.
Suggestions
- Add more specific concrete actions the skill performs, e.g., 'Analyzes stack traces, inspects logs, reproduces failures, and bisects code changes to identify root causes'.
- Expand trigger terms to include natural user language like 'bug', 'crash', 'exception', 'broken', 'not working', 'stack trace', 'failing tests'.
- Narrow the 'when' clause from 'any issue' to more specific scenarios to reduce conflict risk with other skills, e.g., 'Use when encountering runtime errors, failing tests, or when code produces unexpected output'.
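Taken together, these suggestions could yield a description along the following lines. This is a hypothetical sketch of revised SKILL.md frontmatter, not the skill's actual content:

```yaml
# Hypothetical revision incorporating the three suggestions above
description: >
  Rosetta debugging skill. Analyzes stack traces, inspects logs, reproduces
  failures, and bisects code changes to identify root causes before attempting
  fixes. Use when encountering runtime errors, exceptions, crashes, failing
  tests, or code that produces unexpected output.
```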
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (debugging) and some actions ('root cause investigation before attempting fixes'), but the capabilities listed ('errors, test failures, unexpected behavior') are broad categories rather than specific concrete actions like 'analyze stack traces, inspect variable state, trace execution flow'. | 2 / 3 |
| Completeness | Clearly answers both what ('debugging skill for errors, test failures, and unexpected behavior' with 'root cause investigation before attempting fixes') and when ('Use proactively when encountering any issue'), with an explicit trigger clause. | 3 / 3 |
| Trigger Term Quality | Includes some relevant keywords like 'errors', 'test failures', and 'unexpected behavior' that users might naturally encounter, but misses common variations like 'bug', 'crash', 'exception', 'stack trace', 'broken', 'not working', or 'failing'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The term 'Rosetta' adds some distinctiveness, but 'any issue' is extremely broad and could overlap with general coding assistance, testing, or troubleshooting skills. The scope of 'errors, test failures, and unexpected behavior' covers a very wide range that could conflict with more specialized skills. | 2 / 3 |
| Total | | 9 / 12 — Passed |
Implementation — 62%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid debugging methodology skill with a well-structured sequential workflow and good validation checkpoints. Its main weaknesses are the lack of concrete executable examples (no code snippets showing diagnostic logging, hypothesis documentation, or test creation) and some redundancy between sections (best_practices repeating phase content, validation checklist partially duplicating phase steps). The skill would benefit from concrete examples and trimming repeated guidance.
Suggestions
- Add a concrete example showing a debugging scenario end-to-end: sample error → hypothesis → diagnostic logging code → root cause → fix → test.
- Remove the 'best_practices' section, as its three points are already covered in phases 1 and 3, reducing redundancy.
- Provide a concrete example of diagnostic logging at component boundaries (e.g., a code snippet showing boundary tracing).
- Link or reference the 'load-context skill' explicitly so the dependency is navigable.
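To make the third suggestion concrete: boundary tracing of the kind the review asks for might look like the following minimal sketch. The `parse_total` component and the log format are hypothetical illustrations, not part of the skill under review:

```python
import functools
import logging

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log = logging.getLogger("boundary")

def trace_boundary(func):
    """Log arguments and return value at a component boundary."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.debug("enter %s args=%r kwargs=%r", func.__name__, args, kwargs)
        result = func(*args, **kwargs)
        log.debug("exit %s -> %r", func.__name__, result)
        return result
    return wrapper

@trace_boundary
def parse_total(fields):
    # Hypothetical component under investigation
    return sum(int(f) for f in fields)

parse_total(["1", "2", "3"])  # logs enter/exit at the boundary; returns 6
```

Placed at each suspect component boundary, this kind of tracing narrows the search for where expected and actual values diverge.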
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Mostly efficient but includes some unnecessary elements like the 'role' tag, a 'best_practices' section that largely repeats earlier content, and some verbose phrasing. The validation checklist partially duplicates the phases. However, it doesn't over-explain concepts Claude already knows. | 2 / 3 |
| Actionability | Provides a clear methodology with specific steps (git diff, diagnostic logging, sequence diagrams), but lacks concrete executable examples — no code snippets, no specific commands beyond 'git diff', no example of what a hypothesis statement looks like, no example of diagnostic logging. It's instructional rather than copy-paste ready. | 2 / 3 |
| Workflow Clarity | Clear four-phase sequential workflow (investigation → pattern analysis → hypothesis testing → implementation) with explicit validation checkpoints. Includes feedback loops (if fix fails, form new hypothesis; if 3+ fixes fail, stop and reassess architecture). The validation checklist provides a clear completion gate. | 3 / 3 |
| Progressive Disclosure | Content is reasonably structured with phases and sections, but everything is in a single file with no references to external resources. The mention of 'load-context skill' suggests dependencies but doesn't link to them. For a debugging skill of this complexity, the inline approach is acceptable, but the reference to external skills without links is a gap. | 2 / 3 |
| Total | | 9 / 12 — Passed |
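The four-phase loop and its feedback rules described under Workflow Clarity can be sketched in code. The function names and the `RuntimeError` signal are hypothetical stand-ins for the skill's phases, not part of the reviewed skill itself:

```python
MAX_FAILED_FIXES = 3

def debug_cycle(reproduce, investigate, form_hypothesis, apply_fix):
    """Sketch of the review's four-phase workflow with its feedback loops.

    reproduce() returns True while the bug is still observable.
    """
    failed_fixes = 0
    evidence = investigate()                    # Phases 1-2: gather and analyze evidence
    while True:
        hypothesis = form_hypothesis(evidence)  # Phase 3: hypothesis testing
        apply_fix(hypothesis)                   # Phase 4: implementation
        if not reproduce():                     # fix verified: bug no longer reproduces
            return hypothesis
        failed_fixes += 1
        if failed_fixes >= MAX_FAILED_FIXES:    # 3+ failed fixes: stop and reassess
            raise RuntimeError("3+ fixes failed: stop and reassess architecture")
        evidence = investigate()                # fix failed: return to investigation
```

The explicit `failed_fixes` counter encodes the review's "if 3+ fixes fail, stop and reassess architecture" checkpoint as a hard exit rather than an open-ended retry loop.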
Validation — 90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure

| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 — Passed |