
pantheon-ai/troubleshoot

Search-first troubleshooting with a diagnostic phase — use when an error, bug, or unexpected behaviour is reported.

Score: 75

Quality: 75%
Does it follow best practices?

Impact: Pending
No eval scenarios have been run

Security by Snyk: Advisory
Suggest reviewing before use


Quality

Discovery

72%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description excels at trigger term coverage and completeness by clearly stating when to use the skill and providing explicit trigger keywords. However, it is weak on specificity—it describes a methodology ('search-first troubleshooting with diagnostic phase') rather than concrete actions the skill performs. Adding specific capabilities would make this significantly stronger.

Suggestions

Add specific concrete actions the skill performs, e.g., 'Analyzes error messages, searches codebase for root causes, checks logs, suggests fixes, and verifies solutions.'

Clarify what 'search-first troubleshooting with diagnostic phase' means in practice—list the diagnostic steps or outputs the skill provides.
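A revision along these lines might look like the following hypothetical frontmatter. The action list is illustrative wording built from the suggestions above and the trigger terms noted later in this review, not the skill's actual description:

```yaml
# Hypothetical revised frontmatter description (illustrative only)
description: >
  Search-first troubleshooting with a diagnostic phase: analyzes error
  messages and stack traces, searches the codebase and web for root causes,
  checks logs, suggests fixes, and verifies solutions. Use when an error,
  bug, crash, exception, or unexpected behaviour is reported, or when
  something is broken, failing, or not working.
```

This keeps the existing trigger-term coverage while answering the "what" with concrete capabilities.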

Dimension / Reasoning / Score

Specificity

The description does not list any concrete actions or capabilities. It mentions 'search-first troubleshooting with diagnostic phase', which is a vague methodology description rather than specific actions like 'analyze stack traces, check logs, identify root causes'.

1 / 3

Completeness

It explicitly answers both 'what' (search-first troubleshooting with diagnostic phase) and 'when' (when user reports an error, bug, or something not working), with explicit trigger terms listed. The 'what' is thin but present, and the 'when' is clearly stated.

3 / 3

Trigger Term Quality

Excellent coverage of natural trigger terms users would actually say: 'debug, error, broken, not working, failing, crash, exception'. These are highly natural and cover common variations of how users report problems.

3 / 3

Distinctiveness Conflict Risk

The debugging/troubleshooting domain is fairly broad and could overlap with language-specific debugging skills, testing skills, or code review skills. However, the explicit trigger terms help narrow it somewhat.

2 / 3

Total: 9 / 12

Passed

Implementation

70%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured troubleshooting skill with excellent workflow clarity and progressive disclosure. Its main weaknesses are verbosity in the anti-patterns/philosophy/when-to-use sections (which explain reasoning Claude already possesses) and a lack of concrete, executable examples — the usage examples are pseudocode comments rather than actionable demonstrations. The duplicated references section also wastes tokens.

Suggestions

Remove or significantly condense the 'When to Use', 'When Not to Use', 'Anti-Patterns', and 'Philosophy' sections — these explain reasoning Claude already understands and consume significant tokens.

Replace the pseudocode usage examples with concrete, executable examples showing actual commands, search queries, or diagnostic outputs that Claude would produce.

Consolidate the duplicated 'Refs' and 'References' sections into a single section to save tokens.
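To illustrate the second suggestion, a concrete usage example might show real input and output instead of pseudocode comments. The snippet below is a hypothetical sketch: it extracts an exception from a captured run log and builds the kind of query the skill's 'WebSearch: [error] [stack] [framework]' template describes. The log contents and variable names are invented for illustration.

```python
import re

# Illustrative captured output from a failing run (hypothetical log)
run_log = """INFO starting worker
Traceback (most recent call last):
  File "app.py", line 12, in <module>
ValueError: bad input
"""

# Extract the exception type and message from the traceback
match = re.search(r"^(\w+Error): (.+)$", run_log, re.MULTILINE)
error_type, message = match.groups()

# Expand the skill's 'WebSearch: [error] [stack] [framework]' template
# into an actual search query
query = f"{error_type} {message} python"
print(query)  # -> ValueError bad input python
```

An example at this level of concreteness shows Claude exactly what a diagnostic step produces, rather than leaving the step abstract.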

Dimension / Reasoning / Score

Conciseness

The skill is mostly efficient with good use of shorthand and structured steps, but includes some redundancy: the 'When to Use'/'When Not to Use'/'Anti-Patterns' sections are verbose and explain reasoning Claude already understands. The Philosophy section and Anti-Patterns explanations with 'Why:' annotations add bulk. The Refs section is duplicated (listed twice with slightly different formatting).

2 / 3

Actionability

The workflow steps provide structured guidance but lack concrete executable examples. The 'Usage Examples' section shows only comments/pseudocode rather than actual commands or code. Key steps like 'WebSearch: [error] [stack] [framework]' are templates rather than executable instructions. The AskUserQuestion guard is concrete and actionable, but most diagnostic steps remain abstract.

2 / 3

Workflow Clarity

The workflow is clearly sequenced (0-6) with explicit validation checkpoints: search before diagnosing, reproduce before theorizing, confirm root cause before persisting, and the OODA loop has clear exit conditions. The completion checklist and mandatory persist step with preconditions demonstrate good feedback loops. The AskUserQuestion guard adds an important error-recovery mechanism.

3 / 3

Progressive Disclosure

The skill provides a clear overview with well-signaled one-level-deep references to diagnose.md, search-multi-source.md, and reference.md. Step 3 explicitly defers to 'references/protocols/diagnose.md for details.' The References section clearly describes what each linked file contains. Content is appropriately split between the overview and detailed reference files.

3 / 3

Total: 10 / 12

Passed
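The OODA-style loop with explicit exit conditions praised under Workflow Clarity can be sketched as follows. This is a hypothetical illustration of a bounded observe/act loop, not the skill's actual implementation; all function names are invented:

```python
# Hypothetical sketch of a bounded diagnostic loop with explicit exit
# conditions: exit on confirmed root cause, or escalate when the
# iteration budget is exhausted.
MAX_ITERATIONS = 5

def diagnose(observe, act, confirmed):
    """Run observe/act cycles until the root cause is confirmed or the budget runs out."""
    for attempt in range(1, MAX_ITERATIONS + 1):
        evidence = observe()                 # observe: gather fresh evidence
        if confirmed(evidence):              # exit condition 1: root cause confirmed
            return {"status": "confirmed", "attempts": attempt, "evidence": evidence}
        act(evidence)                        # act: run the next diagnostic step
    return {"status": "escalate", "attempts": MAX_ITERATIONS}  # exit condition 2: budget spent

# Usage with stub callbacks: the root cause is confirmed on the third observation.
readings = iter(["no repro", "partial trace", "null config value"])
result = diagnose(
    observe=lambda: next(readings),
    act=lambda e: None,
    confirmed=lambda e: e == "null config value",
)
print(result["status"], result["attempts"])  # -> confirmed 3
```

The key property the review highlights is that both exits are explicit: the loop cannot run unbounded, and "confirmed" is checked before any further action is taken.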

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

Criteria / Description / Result

allowed_tools_field: 'allowed-tools' contains unusual tool name(s). Result: Warning

frontmatter_unknown_keys: Unknown frontmatter key(s) found; consider removing or moving to metadata. Result: Warning

Total: 9 / 11

Passed
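Resolving the frontmatter_unknown_keys warning typically means moving unrecognized top-level keys under a metadata block. The sketch below is hypothetical; the key name shown is invented and the skill's real frontmatter is not reproduced here:

```yaml
# Before (hypothetical): an unknown top-level key triggers the warning
name: troubleshoot
custom-owner: pantheon-ai   # unknown frontmatter key

# After: unrecognized keys nested under metadata
name: troubleshoot
metadata:
  custom-owner: pantheon-ai
```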

Reviewed
