pantheon-ai/probe

Run a safe-to-fail experiment for Complex domain problems where cause-and-effect is only visible in retrospect.

Quality

88%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Quality

Discovery

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a well-structured skill description that excels in completeness and distinctiveness. It clearly defines its niche within a Cynefin-style framework, includes explicit trigger terms and boundary exclusions, and follows the 'Use when / NOT for' pattern effectively. The main weakness is that the specific actions described are somewhat abstract and domain-jargon-heavy ('foreground qualify → background probe → sense result'), which may not be immediately clear to all users.

Dimension: Specificity
Score: 2 / 3
Reasoning: The description names the domain (Complex domain problems) and describes a two-phase process (foreground qualify → background probe → sense result), but the concrete actions are somewhat abstract and framework-specific rather than listing multiple clearly understandable actions.

Dimension: Completeness
Score: 3 / 3
Reasoning: Clearly answers both 'what' (safe-to-fail experiment for Complex domain problems with a two-phase process) and 'when' (explicit 'Use when:' clause with trigger terms, plus 'NOT for' exclusions). The explicit trigger guidance and boundary conditions are well-defined.

Dimension: Trigger Term Quality
Score: 3 / 3
Reasoning: Includes good natural trigger terms: 'probe', 'safe-to-fail', 'test hypothesis', 'experiment with hypothesis', 'Complex domain with hypothesis'. Also includes explicit negative triggers distinguishing from brainstorm and investigate skills, which helps with routing.

Dimension: Distinctiveness Conflict Risk
Score: 3 / 3
Reasoning: Highly distinctive with clear niche (safe-to-fail experiments in Complex domain). The explicit 'NOT for' clauses referencing other skills (brainstorm, investigate) directly reduce conflict risk and make this clearly distinguishable.

Total: 11 / 12 (Passed)

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-crafted skill with excellent workflow clarity, strong actionability, and good progressive disclosure. The two-phase structure with explicit gates and handoff templates is particularly strong. The main weakness is moderate verbosity in the anti-patterns and when-to-use sections, which could be tightened without losing clarity.

Suggestions

Condense the anti-patterns section by removing the 'Why:' explanations; the anti-pattern statements are self-explanatory and Claude can infer the reasoning.

Consider moving 'When to Use' and 'When Not to Use' into a reference file since the skill description already covers routing logic, or compress them into a compact table.

Dimension: Conciseness
Score: 2 / 3
Reasoning: The skill is mostly efficient and avoids explaining basic concepts Claude would know, but the anti-patterns section is somewhat verbose with 'Why:' explanations that restate what's already obvious from the anti-pattern itself. The 'When to Use' and 'When Not to Use' sections also add bulk that could be trimmed since the description already covers routing.

Dimension: Actionability
Score: 3 / 3
Reasoning: The skill provides highly concrete, step-by-step guidance with specific output formats, gate conditions, file naming conventions (with collision handling), token budgets, and a clear handoff table with template references. The usage examples include concrete commands and measurable criteria.

Dimension: Workflow Clarity
Score: 3 / 3
Reasoning: The two-phase workflow is clearly sequenced with explicit entry and exit gates, validation checkpoints (Phase 1 must complete before Phase 2, confirm/refute criteria must be defined before execution), feedback loops (loop back to 1.1-1.4 on non-confirm), and a structured result classification table with clear transition paths.

Dimension: Progressive Disclosure
Score: 3 / 3
Reasoning: Content is well-structured with a clear overview in the main file and one-level-deep references to specific handoff templates and reference material. The References section at the bottom provides clear navigation with descriptive labels for each linked file.

Total: 11 / 12 (Passed)
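
The gated two-phase workflow praised under Workflow Clarity can be illustrated with a minimal Python sketch. It is not the skill's actual implementation: the `Probe` class, the `run_probe` function, the `qualify` and `execute` callbacks, and the result labels "confirm", "refute", and "inconclusive" are all hypothetical names chosen for illustration. The sketch only shows the three properties the review mentions: Phase 1 must finish before Phase 2, confirm/refute criteria must exist before execution, and any non-confirm result loops back to Phase 1.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Probe:
    """Hypothetical record of one safe-to-fail experiment."""
    hypothesis: str
    confirm_if: Optional[str] = None  # measurable criterion that confirms
    refute_if: Optional[str] = None   # measurable criterion that refutes

    def qualified(self) -> bool:
        # Exit gate for Phase 1: hypothesis and both criteria are defined.
        return bool(self.hypothesis and self.confirm_if and self.refute_if)


def run_probe(
    probe: Probe,
    qualify: Callable[[Probe], Probe],
    execute: Callable[[Probe], str],
    max_rounds: int = 3,
) -> List[str]:
    """Run Phase 1 then Phase 2, looping back on any non-confirm result."""
    history: List[str] = []
    for _ in range(max_rounds):
        probe = qualify(probe)  # Phase 1: (re)qualify the probe
        if not probe.qualified():
            raise ValueError("Phase 1 incomplete: define confirm/refute criteria first")
        result = execute(probe)  # Phase 2: run the experiment
        history.append(result)
        if result == "confirm":
            break  # only a confirm ends the loop early
    return history
```

A caller would pass its own `qualify` and `execute` functions; the returned list records the classification of each round, so the transition path stays inspectable.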

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 passed

Validation for skill structure

Criterion: allowed_tools_field
Result: Warning
Description: 'allowed-tools' contains unusual tool name(s)

Criterion: frontmatter_unknown_keys
Result: Warning
Description: Unknown frontmatter key(s) found; consider removing or moving to metadata

Total: 9 / 11 (Passed)
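
A check like frontmatter_unknown_keys could be approximated along the following lines. This is a rough sketch, not the validator used for this review: the set of known keys is an assumption, the sample frontmatter is invented for illustration, and the parser only recognizes top-level `key: value` lines rather than full YAML.

```python
# Assumed set of recognized top-level frontmatter keys (not taken from the review).
ASSUMED_KNOWN_KEYS = {"name", "description", "license", "allowed-tools", "metadata"}


def frontmatter_keys(text: str) -> list[str]:
    """Return top-level keys from a leading '---' delimited frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        # Top-level keys only: no indentation, not a comment or list item.
        if line and not line[0].isspace() and not line.startswith(("#", "-")) and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return keys


def unknown_keys(text: str, known: set[str] = ASSUMED_KNOWN_KEYS) -> list[str]:
    """List frontmatter keys that are not in the assumed known set."""
    return [key for key in frontmatter_keys(text) if key not in known]


# Hypothetical skill file used only to exercise the check.
SAMPLE = """---
name: probe
description: Run a safe-to-fail experiment.
owner: someone
metadata:
  version: "1"
---
Skill body goes here.
"""

print(unknown_keys(SAMPLE))  # prints ['owner']
```

Moving a flagged key such as the hypothetical `owner` under `metadata` would clear the warning in this sketch, which mirrors the remediation the validator suggests.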
