diagnose

Disciplined diagnosis loop for hard bugs and performance regressions. Reproduce → minimise → hypothesise → instrument → fix → regression-test. Use when user says "diagnose this" / "debug this", reports a bug, says something is broken/throwing/failing, or describes a performance regression.

0.97x

Quality

100%

Does it follow best practices?

Impact

90%

0.97x

Average score across 3 eval scenarios

Securityby

Passed

No known issues

Quality

Content

100%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

A tightly written, highly actionable multi-phase diagnosis discipline with explicit validation checkpoints and a real, clearly signaled bundle script. It assumes Claude's competence and avoids filler, with no obvious weaknesses.

Dimension	Reasoning	Score
Conciseness	The body is dense with directives and sharp heuristics ("A 30-second flaky loop is barely better than no loop") and assumes Claude's competence without explaining known concepts, so it is not the padded level 2.	3 / 3
Actionability	Gives concrete, executable guidance — an ordered list of ten loop-construction strategies, specific commands (git bisect run, performance.now()), tagged-log convention [DEBUG-a4f2], and a real referenced script — beyond the incomplete level 2.	3 / 3
Workflow Clarity	Six sequenced phases with explicit progression gates ("Do not proceed to Phase 2 until you have a loop you believe in") and validation checklists/feedback loops, matching the explicit-checkpoints anchor rather than the checkpoint-gap level 2.	3 / 3
Progressive Disclosure	Content is well-organized into navigable Phase sections with the one bundle reference (scripts/hitl-loop.template.sh) clearly signaled and verified to exist one level deep, rather than the poorly-signaled level 2.	3 / 3
	Total	12 / 12 Passed

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A strong, third-person description that states concrete capabilities, gives broad natural trigger terms, and explicitly answers both what and when. It is distinct from general coding skills and risks no conflicts.

Dimension	Reasoning	Score
Specificity	Lists six concrete actions ("Reproduce → minimise → hypothesise → instrument → fix → regression-test") and names the domain ("hard bugs and performance regressions"), matching the multiple-specific-actions anchor rather than the partial level 2.	3 / 3
Completeness	Answers both what (the phased diagnosis loop) and when via an explicit "Use when…" clause, so it is not capped at 2 for a missing trigger clause.	3 / 3
Trigger Term Quality	Covers natural phrasings users would say — "diagnose this", "debug this", "reports a bug", "broken/throwing/failing", "performance regression" — broader than the some-keywords level 2.	3 / 3
Distinctiveness Conflict Risk	A clear niche (hard bugs and perf regressions) with distinct trigger phrasings unlikely to fire for unrelated skills, matching the clear-niche anchor rather than the overlapping level 2.	3 / 3
	Total	12 / 12 Passed

Validation

100%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 16 / 16 Passed

Validation for skill structure

No warnings or errors.

Repository: coder/agent-tty
Commit: fae02cb

Reviewed: 23 days ago

Table of Contents

Discovery Implementation Validation

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.