antithesis-triage

Triage Antithesis test reports to understand what happened in a run: look up runs, check status, investigate failed properties (assertions), view metadata, download logs, inspect findings, and examine environmental details. Load after a run completes or when investigating a failure.

75

Quality

92%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that clearly identifies the tool (Antithesis), lists specific concrete actions, and provides explicit trigger conditions. It uses proper third-person voice and covers both the 'what' and 'when' dimensions effectively, making it easy for Claude to select this skill appropriately from a large pool.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: look up runs, check status, investigate failed properties (assertions), view metadata, download logs, inspect findings, and examine environmental details. | 3 / 3 |
| Completeness | Clearly answers both what ('Triage Antithesis test reports... look up runs, check status, investigate failed properties...') and when ('Load after a run completes or when investigating a failure'), with explicit trigger guidance. | 3 / 3 |
| Trigger Term Quality | Includes natural keywords users would say: 'test reports', 'run', 'failed properties', 'assertions', 'logs', 'findings', 'failure', 'triage', 'Antithesis'. These cover the domain well and match how users would describe investigating test failures. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive due to the specific product name 'Antithesis' and the focused domain of test report triage. Unlikely to conflict with generic testing or logging skills due to the specific tool reference and detailed action list. | 3 / 3 |
| Total | | 12 / 12 (Passed) |

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, well-structured triage skill with excellent workflow clarity and progressive disclosure. The actionability is high with concrete commands and specific field names throughout. The main weakness is moderate verbosity — the follow-up skills section and some repeated guidance about log importance could be tightened to save tokens.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is mostly efficient and avoids explaining concepts Claude already knows, but there is some verbosity: the preflight section is somewhat lengthy, the 'Suggesting follow-up skills' section includes detailed descriptions of four other skills that could be more concise, and some guidance is repeated (e.g., the importance of logs is stated multiple times). | 2 / 3 |
| Actionability | The skill provides concrete, executable commands throughout (e.g., `snouty runs --json properties`, `snouty runs --json events ${RUN_ID} ${PROPERTY_NAME}`, `snouty runs --json build-logs ${RUN_ID}`). Steps are specific with exact flags, field names to check, and clear decision points. The guidance is copy-paste ready and leaves little ambiguity about what to do. | 3 / 3 |
| Workflow Clarity | Multiple workflows are clearly sequenced with explicit validation checkpoints: the preflight checklist stops at the first failure, the triage workflow checks for null triage_report before proceeding, incomplete runs have a distinct diagnostic path, and the 'Investigate failed properties' workflow includes feedback loops (downloading additional logs, comparing passing vs failing examples). The self-review section adds a final validation checkpoint. | 3 / 3 |
| Progressive Disclosure | The skill is well-structured as an overview that delegates detailed content to reference files (run-discovery.md, run-info.md, properties.md, logs.md) at exactly the point they're needed. References are one level deep, clearly signaled, and the skill explicitly instructs not to read them all up front. Navigation is intuitive. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
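
As an illustration of the kind of validation checkpoint praised under Workflow Clarity, the null check on `triage_report` might look roughly like the sketch below. Only the field name `triage_report` comes from the review; the shape of the run record, the function name `next_step`, the returned labels, and the idea that a null report routes to the incomplete-run path are all assumptions made for this example, not the skill's actual implementation.

```python
import json


def next_step(run_json: str) -> str:
    """Pick a triage path for a run record supplied as JSON text.

    Hypothetical helper: the record layout and the returned labels are
    assumptions for illustration only.
    """
    run = json.loads(run_json)
    # A missing key and an explicit JSON null are treated the same way:
    # there is no report to triage yet, so do not proceed with triage.
    if run.get("triage_report") is None:
        return "diagnose-incomplete-run"
    return "triage-report"


if __name__ == "__main__":
    print(next_step('{"triage_report": null}'))
    print(next_step('{"triage_report": {"id": "example"}}'))
```

Treating an absent key like an explicit null keeps the check conservative: triage proceeds only when a report is actually present.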

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository
antithesishq/antithesis-skills
Reviewed
