Diagnose gateway failures by reading daemon logs, session transcripts, Redis state, and OTEL telemetry. Full Telegram path triage: daemon process → Redis channel → command queue → pi session → model API → Telegram delivery. Use when: 'gateway broken', 'telegram not working', 'why is gateway down', 'gateway not responding', 'check gateway logs', 'what happened to gateway', 'gateway diagnose', 'gateway errors', 'review gateway logs', 'fallback activated', 'gateway stuck', or any request to understand why the gateway failed. Distinct from the gateway skill (operations) — this skill is diagnostic.
Overall score: 84

- Quality: 81% (does it follow best practices?)
- Impact: Pending (no eval scenarios have been run)
- Advisory: suggest reviewing before use
Quality
Discovery: 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that hits all the marks. It provides highly specific capabilities (reading daemon logs, Redis state, OTEL telemetry), includes a comprehensive set of natural trigger terms, explicitly addresses both what and when, and proactively distinguishes itself from a related gateway operations skill. The description is thorough without being padded, and uses proper third-person voice throughout.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: reading daemon logs, session transcripts, Redis state, OTEL telemetry, and describes the full triage path (daemon process → Redis channel → command queue → pi session → model API → Telegram delivery). | 3 / 3 |
| Completeness | Clearly answers both 'what' (diagnose gateway failures by reading logs, Redis state, telemetry, full Telegram path triage) and 'when' (explicit 'Use when:' clause with extensive trigger phrases). Also distinguishes itself from a related skill. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would say: 'gateway broken', 'telegram not working', 'why is gateway down', 'gateway not responding', 'check gateway logs', 'fallback activated', 'gateway stuck', etc. These are realistic phrases a user would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Explicitly distinguishes itself from the gateway operations skill by stating 'Distinct from the gateway skill (operations) — this skill is diagnostic.' The focus on diagnosis, logs, and telemetry creates a clear niche that is unlikely to conflict with other skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation: 62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is an extremely thorough and actionable diagnostic skill with excellent workflow structure and concrete commands at every step. However, it suffers significantly from verbosity — the sheer volume of inline content (known failure scenarios, architecture details, OTEL event catalogs) makes it a poor fit for context window efficiency. The content would benefit greatly from splitting detailed failure scenarios and reference material into separate files while keeping the core diagnostic workflow lean.
Suggestions
- Extract the Known Failure Scenarios section (scenarios 0-7 with sub-variants) into a separate FAILURES.md reference file, keeping only a brief summary table in the main skill
- Move Architecture Reference, Key Code, ADR anchors, and Fallback Controller State into a separate REFERENCE.md file linked from the main skill
- Trim the Layer 1 CLI Status interpretation section; the detailed explanations of each status field (interruptibility, operator-tracing, channel-health, channel-healing) are excessively verbose and could be a linked reference
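The suggested restructuring can be sketched as a small scratch-directory demo. The file names FAILURES.md and REFERENCE.md come from the suggestions above; the skill's real content and name are not shown in this report, so the bodies below are placeholders:

```shell
# Sketch of the suggested split: the main skill keeps the lean workflow
# and links out to the heavy reference material (run in a scratch dir).
mkdir -p gateway-diagnose && cd gateway-diagnose

# Lean main skill: workflow plus links, no inline reference dumps.
cat > SKILL.md <<'EOF'
# Gateway Diagnosis
Work through the layers in order; stop at the first failure.
- Known failure scenarios: see [FAILURES.md](FAILURES.md)
- Architecture, key code, ADR anchors: see [REFERENCE.md](REFERENCE.md)
EOF

# Detailed scenarios move wholesale into their own file.
cat > FAILURES.md <<'EOF'
# Known Failure Scenarios
Scenarios 0-7 (with sub-variants 2a-2c, 4a-4c) in full detail.
EOF

cat > REFERENCE.md <<'EOF'
# Architecture Reference
Diagrams, key code listings, ADR anchors, fallback controller state.
EOF

grep -n 'md)' SKILL.md   # the lean skill only links to the detail files
```

This keeps the core diagnostic workflow in the agent's context window while the failure catalog loads only when a scenario actually matches.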
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | This skill is extremely verbose at ~400+ lines with extensive detail on every failure scenario, internal architecture, OTEL event names, and implementation specifics. Much of this (e.g., explaining what each OTEL event name means, detailed architecture diagrams, code file listings) could be in referenced files. The sheer volume of known failure scenarios with their sub-variants (2a, 2b, 2c, 4a, 4b, 4c) makes this a wall of operational knowledge that competes heavily with context window. | 1 / 3 |
| Actionability | Every diagnostic layer has concrete, copy-paste-ready bash commands. Error patterns are mapped to specific meanings and fixes. The CLI commands, kubectl commands, curl tests, and log inspection commands are all fully executable and specific. | 3 / 3 |
| Workflow Clarity | The diagnostic procedure is clearly sequenced as numbered layers (Layer -1 through Layer 8) with explicit 'stop at the first failure' instruction. Each layer has validation criteria and failure patterns. The workflow includes feedback loops (e.g., watchdog auto-aborts then self-restarts, validate → fix → re-validate patterns). | 3 / 3 |
| Progressive Disclosure | The skill references related skills at the bottom and starts with CLI commands before manual steps, which is good progressive disclosure. However, the massive Known Failure Scenarios section, Fallback Controller State, Architecture Reference, and Key Code sections are all inline when they could be split into referenced files. The monolithic nature of the content undermines discoverability. | 2 / 3 |
| Total | | 9 / 12 Passed |
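The 'stop at the first failure' sequencing praised above can be sketched as a small shell helper. The labels and checks are stubbed here; in the real skill each layer would run its own command (e.g. a process check for the daemon, a Redis ping for the channel):

```shell
# triage: run layered checks in order and stop at the first failing layer.
triage() {
  while [ "$#" -ge 2 ]; do
    label=$1
    cmd=$2
    shift 2
    if sh -c "$cmd" >/dev/null 2>&1; then
      printf 'OK   %s\n' "$label"
    else
      printf 'FAIL %s -- diagnose this layer before going deeper\n' "$label"
      return 1
    fi
  done
}

# Stubbed demo: the Redis layer "fails", so later layers are never checked.
triage \
  "daemon process" true \
  "Redis channel"  false \
  "command queue"  true || echo "triage stopped at the failing layer"
```

The early return is what keeps the workflow unambiguous: an agent never reports a downstream symptom while an upstream layer is already broken.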
Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 Passed |
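The frontmatter_unknown_keys warning is typically resolved by nesting non-spec keys under a metadata map, as the check's own message suggests. A minimal sketch; the skill name and the offending key are not given in this report, so both are illustrative:

```yaml
name: gateway-diagnose            # hypothetical; the report does not give the name
description: Diagnose gateway failures from daemon logs, Redis state, and OTEL telemetry.
metadata:
  author: ops-team                # formerly an unknown top-level key (illustrative)
```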