debug

Investigate stuck runs and execution failures by tracing Symphony and Codex logs with issue/session identifiers; use when runs stall, retry repeatedly, or fail unexpectedly.

1.00x

Quality

88%

Does it follow best practices?

Impact

83%

1.00x

Average score across 3 eval scenarios

Security

Passed

No known issues

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid debugging skill with highly actionable commands and clear workflow sequences for tracing stuck runs through logs. Its main weakness is redundancy: the Quick Triage, Investigation Flow, and Reading Codex Session Logs sections cover overlapping ground, inflating token cost without proportional value. Consolidating these into a single unified workflow would improve both conciseness and progressive disclosure.

Suggestions

Merge 'Quick Triage', 'Investigation Flow', and 'Reading Codex Session Logs' into a single unified workflow to eliminate redundant steps and reduce token usage.

Remove the duplicated reminder about pairing session findings with issue_identifier/issue_id — state it once in a prominent location.

Dimension	Reasoning	Score
Conciseness	The content is mostly efficient but has some redundancy. The 'Reading Codex Session Logs' section substantially overlaps with the 'Investigation Flow' and 'Quick Triage' sections, repeating the same lifecycle pattern and session tracing steps. The final reminder to 'pair session findings with issue_identifier/issue_id' is stated twice.	2 / 3
Actionability	Provides concrete, copy-paste-ready rg commands with specific log patterns, field names, and file paths. The investigation steps reference exact log message strings (e.g., 'Issue stalled ... restarting with backoff', 'Codex session failed') making it immediately executable.	3 / 3
Workflow Clarity	The Quick Triage and Investigation Flow sections provide clear, numbered sequences with explicit classification steps and validation (step 4: validate scope, step 5: capture evidence). The workflow includes feedback loops for narrowing searches and distinguishing failure classes.	3 / 3
Progressive Disclosure	The content references `elixir/docs/logging.md` appropriately for conventions, but the skill itself is somewhat monolithic with three overlapping workflow sections (Quick Triage, Investigation Flow, Reading Codex Session Logs) that could be consolidated. No bundle files are provided, so the single-file approach is acceptable but the internal organization could be tighter.	2 / 3
	Total	10 / 12 Passed
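
To illustrate the kind of tracing the Actionability row describes, here is a minimal Python sketch of filtering log lines by issue identifier and sorting them into failure classes. It is not taken from the skill, which uses `rg` commands. The log line layout, the sample lines, the `ABC-1` identifiers, and the names `PATTERNS` and `classify_lines` are all assumptions made for illustration; only the message strings 'Issue stalled ... restarting with backoff' and 'Codex session failed' and the field name `issue_identifier` come from the review. Because the first message is elided in the source, the sketch matches only its two known ends.

```python
import re

# Failure classes keyed by the message strings quoted in the review.
# The middle of the "Issue stalled ... restarting with backoff" message is
# elided in the source, so only its known start and end are matched.
PATTERNS = {
    "stalled": re.compile(r"Issue stalled.*restarting with backoff"),
    "session_failed": re.compile(r"Codex session failed"),
}


def classify_lines(lines, issue_identifier):
    """Return (label, line) pairs for lines that mention the identifier.

    Matching is a plain substring test, so an identifier such as "ABC-1"
    would also match "ABC-10"; tighten this for real logs.
    """
    hits = []
    for line in lines:
        if issue_identifier not in line:
            continue
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((label, line.rstrip("\n")))
                break
    return hits


# Hypothetical log lines; the real Symphony log format may differ.
sample = [
    "issue_identifier=ABC-1 Issue stalled for a while; restarting with backoff",
    "issue_identifier=ABC-2 Codex session failed",
    "issue_identifier=ABC-1 Codex session failed",
    "issue_identifier=ABC-1 unrelated message",
]

hits = classify_lines(sample, "ABC-1")
for label, line in hits:
    print(label, "|", line)
```

Keeping the identifier filter separate from the classification step mirrors the review's point about pairing session findings with `issue_identifier`/`issue_id`: every classified line stays tied to the issue it belongs to.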

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a well-crafted skill description that clearly defines its scope, uses concrete actions, and includes explicit trigger conditions. It names specific tools (Symphony, Codex), specific artifacts (logs, issue/session identifiers), and specific failure scenarios, making it highly distinguishable and easy for Claude to select appropriately.

Dimension	Reasoning	Score
Specificity	Lists multiple concrete actions: 'investigate stuck runs', 'execution failures', 'tracing Symphony and Codex logs', 'issue/session identifiers'. These are specific, actionable capabilities rather than vague language.	3 / 3
Completeness	Clearly answers both what ('Investigate stuck runs and execution failures by tracing Symphony and Codex logs with issue/session identifiers') and when ('use when runs stall, retry repeatedly, or fail unexpectedly') with explicit trigger conditions.	3 / 3
Trigger Term Quality	Includes strong natural trigger terms users would say: 'stuck runs', 'execution failures', 'stall', 'retry repeatedly', 'fail unexpectedly', 'Symphony', 'Codex logs'. These cover multiple natural phrasings a user might use when encountering these issues.	3 / 3
Distinctiveness / Conflict Risk	Highly distinctive with domain-specific terms like 'Symphony', 'Codex logs', 'issue/session identifiers', and specific failure modes ('stuck runs', 'retry repeatedly'). Unlikely to conflict with other skills due to the narrow, well-defined niche.	3 / 3
	Total	12 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: openai/symphony
Commit: 4cbe3a9

Reviewed: 26 days ago
