Interactive hypothesis-driven debugging with documented exploration, understanding evolution, and analysis-assisted correction.
19%

Does it follow best practices?

Impact: Pending — no eval scenarios have been run.
Passed — no known issues.

Optimize this skill with Tessl:

    npx tessl skill review --optimize ./.codex/skills/debug-with-file/SKILL.md

Quality
Discovery
0%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is overly abstract and jargon-heavy, failing to communicate concrete actions, specific use cases, or trigger conditions. It reads more like an academic concept than a practical skill description. Claude would struggle to distinguish this from any other debugging-related skill or know when to select it.
Suggestions
Replace abstract phrases with concrete actions, e.g., 'Sets breakpoints, inspects variables, traces execution paths, and tests hypotheses to isolate bugs in code'.
Add an explicit 'Use when...' clause with natural trigger terms, e.g., 'Use when the user asks for help debugging code, finding bugs, troubleshooting errors, or diagnosing unexpected behavior'.
Specify the domain or technology scope (e.g., Python debugging, web app debugging) to reduce conflict risk with other debugging-related skills.
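Applied together, these suggestions might produce frontmatter along these lines (a hedged sketch: the skill name is taken from the review command's path, and the description wording is illustrative, not the skill's actual content):

```yaml
name: debug-with-file
description: >
  Sets breakpoints, inspects variables, traces execution paths, and tests
  hypotheses to isolate bugs, logging each debugging session to a file.
  Use when the user asks for help debugging code, finding bugs,
  troubleshooting errors, or diagnosing unexpected behavior.
```

A description in this shape answers both the "what" (concrete actions) and the "when" (explicit trigger terms) that the dimensions below score.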
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses abstract, buzzword-heavy language like 'hypothesis-driven debugging', 'documented exploration', 'understanding evolution', and 'analysis-assisted correction' without listing any concrete actions. No specific operations or tools are mentioned. | 1 / 3 |
| Completeness | The 'what' is vague and abstract, and there is no 'when' clause or explicit trigger guidance at all. There is no 'Use when...' or equivalent guidance for Claude to know when to select this skill. | 1 / 3 |
| Trigger Term Quality | The terms used are academic and jargon-heavy ('hypothesis-driven', 'understanding evolution', 'analysis-assisted correction') and unlikely to match natural user queries. 'Debugging' is the only natural keyword, but it's buried in abstract phrasing. | 1 / 3 |
| Distinctiveness / Conflict Risk | The description is so vague that it could overlap with any debugging, troubleshooting, or code analysis skill. 'Debugging' alone is extremely broad, and the modifiers don't narrow the scope to a clear niche. | 1 / 3 |
| Total | | 4 / 12 Passed |
Implementation
39%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill has a well-thought-out debugging workflow with clear sequencing and good error handling, but suffers severely from verbosity and repetition. The same flow is described three times in different formats, templates are duplicated, and many code examples are pseudocode stubs rather than executable code. The entire content should be split across multiple files with SKILL.md serving as a concise overview.
Suggestions
Reduce content by 60%+ by eliminating duplicate flow descriptions (the execution process, iteration flow, and implementation details all describe the same workflow) and moving templates/format specs to separate reference files.
Replace pseudocode stubs (extractErrorKeywords, analyzeSearchResults, evaluateEvidence, generateHypotheses) with actual executable implementations or remove them entirely if Claude is expected to implement them contextually.
Split into SKILL.md (overview + quick reference) with references to separate files: TEMPLATES.md (understanding.md template, instrumentation templates), FORMAT.md (NDJSON schema, hypotheses.json schema), and CONSOLIDATION.md (consolidation rules and examples).
Remove the mixed Chinese/English comments in Step 0 and the 'When to Use' / 'Key Features' sections which explain things Claude already knows from the skill's own context.
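To illustrate the second suggestion, a pseudocode stub like `extractErrorKeywords` can be replaced with a small executable helper. This is a minimal Python sketch, not the skill's actual logic: the stopword list and regex heuristics are assumptions, and only the function's purpose (pulling search-worthy tokens from an error message) comes from the review.

```python
import re

def extract_error_keywords(message: str, max_keywords: int = 5) -> list[str]:
    """Pull search-worthy tokens from an error message.

    A sketch: keeps exception class names and quoted identifiers first,
    then longer words, dropping common filler terms. Heuristics are
    illustrative assumptions, not the skill's actual implementation.
    """
    stopwords = {"error", "the", "in", "at", "line", "file", "is", "not", "has"}
    # CamelCase exception class names, e.g. ValueError, AttributeError
    classes = re.findall(r"\b[A-Z][a-z]+(?:[A-Z][a-z]+)+\b", message)
    # Identifiers quoted in the message, e.g. 'save' or "user_id"
    quoted = re.findall(r"['\"]([A-Za-z_][A-Za-z0-9_.]*)['\"]", message)
    # Remaining words of four or more characters, minus filler
    words = [w for w in re.findall(r"[A-Za-z_][A-Za-z0-9_]{3,}", message)
             if w.lower() not in stopwords]
    seen: set[str] = set()
    keywords: list[str] = []
    for token in classes + quoted + words:
        if token not in seen:
            seen.add(token)
            keywords.append(token)
    return keywords[:max_keywords]
```

Even a simple version like this gives Claude deterministic behavior to build on, rather than a stub that returns placeholder values.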
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~400+ lines. Massive amounts of template code that is pseudocode or illustrative rather than executable, redundant flow diagrams repeated multiple times (execution process, iteration flow), explanations of basic concepts Claude knows (what NDJSON is, how to parse JSON lines), and a 'When to Use' section that adds little value. The same workflow is described in at least three different places. | 1 / 3 |
| Actionability | Provides code templates for instrumentation (Python/JS) that are fairly concrete, and the session setup logic is detailed. However, many critical functions are pseudocode stubs (extractErrorKeywords, analyzeSearchResults, evaluateEvidence, removeDebugRegions, generateHypotheses) that just return placeholder values. The skill relies on undefined helper functions and a '$BUG' variable without clearly explaining the invocation mechanism. | 2 / 3 |
| Workflow Clarity | The multi-step workflow is clearly sequenced with explicit mode detection (explore/analyze/continue), validation checkpoints (hypothesis evaluation, verification after fix), feedback loops (fix doesn't work → return to analyze mode, all rejected → generate new hypotheses), and an error handling table covering edge cases like >5 iterations and empty logs. | 3 / 3 |
| Progressive Disclosure | Everything is crammed into a single monolithic file with no references to external documents. The understanding.md template is shown twice (once in Step 1.2 and again as a standalone template section). The session folder structure, NDJSON format, iteration flow, and consolidation rules could all be separate reference files. The content is a wall of text with heavy repetition. | 1 / 3 |
| Total | | 7 / 12 Passed |
Validation
81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (622 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 Passed |
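The frontmatter_unknown_keys warning can typically be cleared by nesting custom keys under the metadata block the warning itself mentions. A sketch, assuming the spec accepts an arbitrary metadata map; the key names and values here are hypothetical:

```yaml
name: debug-with-file
description: Interactive hypothesis-driven debugging with documented exploration.
metadata:
  version: "1.0.0"   # hypothetical key, previously at the top level
  author: example    # hypothetical key, previously at the top level
```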