Interactive hypothesis-driven debugging with documented exploration, understanding evolution, and analysis-assisted correction.
Quality: 19%. Does it follow best practices?
Impact: Pending. No eval scenarios have been run.
Passed. No known issues.
Optimize this skill with Tessl:
`npx tessl skill review --optimize ./.codex/skills/debug-with-file/SKILL.md`

Quality
Discovery
0%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description relies heavily on abstract, academic-sounding phrases without providing concrete actions, natural trigger terms, or explicit usage guidance. It fails to communicate what specific tasks the skill performs or when Claude should select it, making it nearly useless for skill selection among multiple options.
Suggestions
Replace abstract phrases with concrete actions, e.g., 'Systematically diagnoses code bugs by analyzing error messages, inspecting stack traces, forming hypotheses, and testing fixes.'
Add an explicit 'Use when...' clause with natural trigger terms like 'debug', 'fix bug', 'error', 'not working', 'crash', 'unexpected behavior'.
Clarify the distinctive niche — specify what makes this different from general code fixing, e.g., 'Use for complex, multi-step debugging that requires iterative investigation rather than simple one-line fixes.'
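Putting these suggestions together, the frontmatter description might look something like this (a hypothetical rewrite for illustration, not the skill's actual metadata):

```yaml
# Hypothetical SKILL.md frontmatter incorporating the suggestions above
name: debug-with-file
description: >
  Systematically diagnoses code bugs by analyzing error messages, inspecting
  stack traces, forming hypotheses, and testing fixes, recording each step in
  a debug log. Use when the user says "debug", "fix bug", "error", "crash",
  or "not working", and the problem needs iterative, multi-step investigation
  rather than a simple one-line fix.
```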
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses abstract, buzzword-heavy language like 'hypothesis-driven debugging', 'documented exploration', 'understanding evolution', and 'analysis-assisted correction' without listing any concrete actions. No specific operations (e.g., 'set breakpoints', 'analyze stack traces', 'inspect variables') are mentioned. | 1 / 3 |
| Completeness | The description vaguely addresses 'what' with abstract concepts but provides no explicit 'when' clause or trigger guidance. There is no 'Use when...' or equivalent, and the 'what' itself is too vague to be useful. | 1 / 3 |
| Trigger Term Quality | The terms used are academic/jargon-heavy ('hypothesis-driven', 'understanding evolution', 'analysis-assisted correction') rather than natural keywords a user would say. A user would say 'debug', 'fix bug', 'error', 'crash', 'not working' — none of which appear here. | 1 / 3 |
| Distinctiveness / Conflict Risk | The description is so vague that 'debugging' could overlap with any code-related skill, and the modifiers ('hypothesis-driven', 'documented exploration') don't narrow the scope to a clear niche. It's unclear what distinguishes this from general coding assistance or error fixing. | 1 / 3 |
| Total | | 4 / 12 Passed |
Implementation
39%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill has excellent workflow design with clear sequencing, mode detection, feedback loops, and error handling, but suffers severely from verbosity and lack of progressive disclosure. The content is roughly 4-5x longer than necessary, with duplicated workflow descriptions, inline templates that should be separate files, and pseudocode masquerading as executable code. The core debugging methodology is sound but buried under excessive detail.
Suggestions
Extract the understanding.md template, instrumentation templates, and NDJSON format documentation into separate bundle files referenced from the main SKILL.md to reduce length by ~60%.
Remove the duplicated 'Iteration Flow' section since it restates the 'Execution Process' section; keep only the more detailed version.
Replace pseudocode helper functions (extractErrorKeywords, analyzeSearchResults, evaluateEvidence) with either actual implementations or remove them and describe the expected behavior in 1-2 sentences each.
Remove the Chinese comment in Step 0, and consolidate the project-root detection into a single code block, dropping the explanation of what git rev-parse does.
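As an example of the third suggestion, the undefined extractErrorKeywords() helper could be replaced with a real implementation along these lines (a sketch only; the stop-word list and length cutoff are illustrative assumptions, not the skill's actual logic):

```javascript
// Sketch of a concrete extractErrorKeywords(): pull identifier-like tokens
// from an error message and drop common noise words. The stop-word list and
// length cutoff here are assumptions for illustration.
const STOP_WORDS = new Set(["error", "the", "of", "at", "in", "is", "not", "cannot"]);

function extractErrorKeywords(message) {
  const tokens = message
    .split(/[^A-Za-z0-9_$.]+/)                        // split on non-identifier characters
    .filter((tok) => tok.length > 2)                  // drop short fragments
    .filter((tok) => !STOP_WORDS.has(tok.toLowerCase()));
  return [...new Set(tokens)];                        // de-duplicate, preserve order
}
```

With this sketch, extractErrorKeywords("TypeError: Cannot read properties of undefined") yields tokens such as "TypeError", "read", "properties", and "undefined", while noise words like "of" and "Cannot" are filtered out.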
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~400+ lines. Massive amounts of template code that is pseudocode/illustrative rather than executable, repeated workflow descriptions (the iteration flow section largely duplicates the execution process section), and the understanding.md template is shown twice. Contains explanations Claude doesn't need (e.g., what NDJSON is, basic groupBy operations). The Chinese comment in Step 0 adds unnecessary language mixing. | 1 / 3 |
| Actionability | Provides code templates and structured formats (NDJSON schema, hypothesis JSON), but most code is pseudocode with placeholder functions like `extractErrorKeywords()`, `analyzeSearchResults()`, `evaluateEvidence()`, and `removeDebugRegions()` that are never defined. The instrumentation templates (Python/JS) are the most concrete and executable parts, but the core logic relies on undefined helper functions. | 2 / 3 |
| Workflow Clarity | The multi-step workflow is clearly sequenced with explicit mode detection (explore/analyze/continue), validation checkpoints (hypothesis evaluation, verification after fix), feedback loops (fix doesn't work → iterate, all rejected → new hypotheses, >5 iterations → escalate), and an error handling table. The ASCII tree diagrams effectively communicate the flow. | 3 / 3 |
| Progressive Disclosure | Monolithic wall of text with no references to external files despite being well over 400 lines. The understanding.md template, consolidation rules, debug log format, and instrumentation templates could all be separate reference files. Everything is inlined, making the skill extremely long and hard to navigate. No bundle files are provided to offload content. | 1 / 3 |
| Total | | 7 / 12 Passed |
Validation
81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (622 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 Passed |