diagnose

Disciplined diagnosis loop for hard bugs and performance regressions. Reproduce → minimise → hypothesise → instrument → fix → regression-test. Use when the user says "diagnose this" / "debug this", reports a bug, says something is broken/throwing/failing, or describes a performance regression.

Quality

81%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Quality

Content

62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured diagnostic methodology skill with excellent workflow clarity and phase gating. Its main weaknesses are the lack of executable code examples (relying on descriptive guidance rather than copy-paste commands) and some verbosity in explanatory asides. The referenced bundle files (hitl-loop.template.sh, etc.) are not provided, which weakens the progressive disclosure story.

Suggestions

Add at least 2-3 concrete, executable code/command examples — e.g., a sample bisection harness script, a curl-based feedback loop, or a tagged debug log pattern in a specific language.

Provide the referenced scripts/hitl-loop.template.sh file and consider extracting the 10 loop-construction techniques into a separate reference file to improve progressive disclosure.

Trim explanatory parentheticals in checklists (e.g., 'one user vs. all users, prod vs. dev') that Claude can infer from context.
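
As an illustration of the first suggestion, a bisection harness of the kind the review asks for might look roughly like the sketch below. It is not taken from the skill itself. The function name `first_bad`, the `is_bad` predicate, and the sample revision labels are all hypothetical, and the sketch assumes the revisions are ordered oldest to newest, the first is known good, the last is known bad, and the bug never disappears once introduced.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest revision for which is_bad() returns True.

    Assumptions (hypothetical harness, not the skill's own code):
    - revisions are ordered oldest to newest
    - revisions[0] is known good and revisions[-1] is known bad
    - once a revision is bad, every later revision is bad too
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return revisions[hi]


if __name__ == "__main__":
    # Hypothetical history. In a real loop, is_bad would check out the
    # revision and run the reproduction command, returning True on failure.
    history = ["a1", "b2", "c3", "d4", "e5"]
    print(first_bad(history, lambda rev: rev >= "d4"))
```

In a real repository the predicate would wrap whatever reproduction command the feedback loop already uses; for Git histories, `git bisect run` offers the same search driven by a script's exit code.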

Dimension	Reasoning	Score
Conciseness	The skill is well-written and mostly efficient, but some sections could be tightened. The Phase 0 checklist items include explanatory parentheticals that Claude doesn't need (e.g., 'one user vs. all users, prod vs. dev, recoverable vs. data-loss'). The Phase 1 list of 10 loop construction methods is thorough but some entries have unnecessary elaboration. Overall it respects Claude's intelligence but isn't maximally lean.	2 / 3
Actionability	The skill provides a highly structured methodology with concrete checklists and specific techniques, but lacks executable code examples. The feedback loop section lists approaches (curl scripts, Playwright, bisection) without providing copy-paste-ready commands or code snippets. The hypothesis format template is concrete and actionable, but most guidance remains at the 'what to do' level rather than 'here is the exact command'.	2 / 3
Workflow Clarity	Excellent multi-step workflow with six clearly sequenced phases, explicit gate conditions between phases ('Don't proceed until you reproduce', 'Don't proceed to Phase 2 until you have a loop'), validation checkpoints (Phase 2 confirmation checklist, Phase 6 cleanup checklist), and feedback loops (iterate on the loop itself, fix→test→verify cycle). The 'when you genuinely cannot build a loop' escape hatch is a strong error-recovery pattern.	3 / 3
Progressive Disclosure	The content references external files (scripts/hitl-loop.template.sh, CONTEXT.md, docs/adr/, ATTRIBUTION.md, improve-codebase-architecture) but no bundle files are provided to support these references. The skill itself is a single long document (~150 lines) that could benefit from splitting detailed technique lists (e.g., the 10 loop construction methods) into a reference file. The structure within the file is good with clear phase headers.	2 / 3
	Total	9 / 12 Passed

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that clearly communicates both the methodology (a structured diagnosis loop) and when to use it (debugging, bug reports, failures, performance regressions). It uses third person voice, includes natural trigger terms, and is concise without being vague. The explicit step-by-step process and comprehensive trigger phrases make it highly effective for skill selection.

Dimension	Reasoning	Score
Specificity	Lists a concrete, multi-step methodology: 'Reproduce → minimise → hypothesise → instrument → fix → regression-test.' These are specific, actionable steps that clearly describe what the skill does.	3 / 3
Completeness	Clearly answers both 'what' (disciplined diagnosis loop with explicit steps) and 'when' (explicit 'Use when...' clause listing multiple trigger phrases and scenarios).	3 / 3
Trigger Term Quality	Includes a strong set of natural trigger terms users would actually say: 'diagnose this', 'debug this', 'bug', 'broken', 'throwing', 'failing', 'performance regression'. These cover common variations of how users describe problems.	3 / 3
Distinctiveness Conflict Risk	The focus on a structured diagnosis loop for hard bugs and performance regressions is a clear niche. The specific methodology and trigger terms distinguish it from general coding or testing skills.	3 / 3
	Total	12 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: belchman/claude-skills
Commit: be88d6c

Reviewed: about 1 month ago
