antithesis-triage

Triage Antithesis test reports to understand what happened in a run: look up runs, check status, investigate failed properties (assertions), view metadata, download logs, inspect findings, and examine environmental details. Load after a run completes or when investigating a failure.

Quality

—

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Advisory

Suggest reviewing before use

Quality

Content

100%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The body is a well-structured, actionable triage guide: concrete snouty/jq commands, clearly sequenced workflows with validation gates and error recovery, and clean one-level-deep progressive disclosure into real reference files. It assumes Claude's competence and avoids concept over-explanation.

Dimension	Reasoning	Score
Conciseness	The body is lean and action-dense — concrete commands and workflow steps with no explanation of concepts Claude already knows; the few repetitions (e.g. null triage_report handling) serve distinct workflows and earn their place.	3 / 3
Actionability	Provides copy-paste-ready executable commands throughout — 'snouty doctor --json', 'snouty runs --json show ${RUN_ID}', 'snouty runs --json events ${RUN_ID} ${PROPERTY_NAME}' plus a jq pipeline — with real variable names and exact field paths.	3 / 3
Workflow Clarity	Multiple workflows are clearly sequenced with numbered steps and explicit validation checkpoints — the Preflight 'Proceed only when ok is true', the null-report stop in Triage step 2, and a final Self-Review checklist — plus error-recovery (retry with larger -n).	3 / 3
Progressive Disclosure	SKILL.md is an overview that signals each reference at the point of need, tells Claude not to load them all up front, and the four cited files (logs.md, properties.md, run-discovery.md, run-info.md) all exist and stay one level deep (properties.md→logs.md is a peer reference, not nested).	3 / 3
	Total	12 / 12 Passed

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is concise yet specific: it lists concrete triage actions, includes an explicit 'Load when' trigger clause, and is anchored to a distinct Antithesis niche. It does not pad with buzzwords or over-claims.

Dimension	Reasoning	Score
Specificity	Enumerates multiple concrete actions — 'look up runs, check status, investigate failed properties (assertions), view metadata, download logs, inspect findings, and examine environmental details' — matching the multiple-specific-actions anchor.	3 / 3
Completeness	The colon clause answers 'what' (the enumerated actions) and 'Load after a run completes or when investigating a failure' is an explicit 'when' trigger, satisfying both halves.	3 / 3
Trigger Term Quality	Includes natural triage phrases a user would actually say — 'investigating a failure', 'what happened in a run', 'test reports', 'failed properties', 'logs' — giving good coverage of natural trigger terms.	3 / 3
Distinctiveness Conflict Risk	Scoped to 'Antithesis test reports' and 'failed properties (assertions)', a clear niche with triggers unlikely to fire for unrelated skills.	3 / 3
	Total	12 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 16 / 16 Passed

Validation for skill structure

No warnings or errors.

Repository: antithesishq/antithesis-skills
Commit: 9b75328

Reviewed: about 9 hours ago
