antithesis-triage

Triage Antithesis test reports to understand what happened in a run: look up runs, check status, investigate failed properties (assertions), view metadata, download logs, inspect findings, and examine environmental details. Load after a run completes or when investigating a failure.

88

Quality

85%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

Security by Snyk

Advisory

Suggest reviewing before use

Quality

Discovery

85%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that clearly states specific capabilities and gives explicit trigger conditions. Its main weakness is trigger-term coverage: 'Antithesis' makes the skill distinctive, but users may describe their needs in more general testing or debugging vocabulary that the description does not capture. Overall, it communicates both what the skill does and when to use it.

Suggestions

Add more natural trigger terms users might say, such as 'test failure', 'test results', 'debug test', 'why did the test fail', or 'test run analysis' to improve discoverability.
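As a rough illustration of the suggestion above, the sketch below checks which of the suggested trigger terms already appear in the skill's description. It assumes plain case-insensitive substring matching, which is only a stand-in; how the reviewer or an agent actually matches terms is not stated on this page, and `missing_terms` is a hypothetical helper name.

```python
# Description text copied from the skill listing on this page.
DESCRIPTION = (
    "Triage Antithesis test reports to understand what happened in a run: "
    "look up runs, check status, investigate failed properties (assertions), "
    "view metadata, download logs, inspect findings, and examine environmental "
    "details. Load after a run completes or when investigating a failure."
)

# Trigger terms proposed in the review's suggestion.
SUGGESTED_TERMS = [
    "test failure",
    "test results",
    "debug test",
    "why did the test fail",
    "test run analysis",
]


def missing_terms(description: str, terms: list[str]) -> list[str]:
    """Return the terms that do not occur in the description.

    Assumption: case-insensitive substring matching. A real discovery
    mechanism may match differently (for example, semantically).
    """
    lowered = description.lower()
    return [term for term in terms if term.lower() not in lowered]


if __name__ == "__main__":
    for term in missing_terms(DESCRIPTION, SUGGESTED_TERMS):
        print(f"not covered: {term}")
```

Under this simple matching rule, none of the five suggested phrases occurs in the current description, which is consistent with the 2 / 3 score for trigger term quality.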

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: look up runs, check status, investigate failed properties (assertions), view metadata, download logs, inspect findings, and examine environmental details. | 3 / 3 |
| Completeness | Clearly answers both what ('Triage Antithesis test reports... look up runs, check status, investigate failed properties...') and when ('Load after a run completes or when investigating a failure') with explicit trigger guidance. | 3 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'test reports', 'run', 'failed properties', 'logs', 'findings', but 'Antithesis' is a specific product name that limits natural keyword coverage. Missing common variations users might say like 'test failure', 'test results', 'debug', 'why did the test fail'. | 2 / 3 |
| Distinctiveness Conflict Risk | Very distinct niche: specifically targets Antithesis test report triage, which is a specific product/tool. The combination of 'Antithesis' with test report investigation actions makes it highly unlikely to conflict with other skills. | 3 / 3 |
| Total | | 11 / 12 (Passed) |

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, highly actionable skill that provides clear executable commands, well-sequenced workflows with validation checkpoints, and excellent progressive disclosure through reference files. Its main weakness is moderate verbosity: some instructions are repeated across sections (the parallel execution warning and the cascade sample size warning), and a few sections could be tightened. Overall, it is a strong skill that effectively guides Claude through a complex browser-based triage workflow.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is fairly long but most content earns its place: session management, navigation patterns, and workflow steps are all necessary. However, there's some repetition (e.g., 'never run agent-browser calls in parallel' appears twice, the cascade verification section restates 'do not generalize from a small sample' multiple times, and some general guidance bullets repeat earlier instructions). | 2 / 3 |
| Actionability | The skill provides concrete, executable shell commands throughout: session creation, runtime injection, method calls, wait patterns, and error handling all have copy-paste ready code. The workflows give specific method names and step-by-step sequences with exact function signatures. | 3 / 3 |
| Workflow Clarity | Multi-step workflows are clearly sequenced with explicit validation checkpoints: authentication checks after navigation, waitForReady() with error checking, runtime injection verification with retry logic, and a self-review checklist at the end. The navigation-and-loading pattern is a well-defined feedback loop (wait → check URL → inject → wait for ready → check error). | 3 / 3 |
| Progressive Disclosure | The skill is an excellent overview that delegates detailed tasks to clearly-named reference files (setup-auth.md, run-discovery.md, properties.md, logs.md, error-reports.md) at exactly the points where they're needed. The explicit instruction 'Do NOT read them all up front — only read a reference file when you are told to' is ideal progressive disclosure. References are one level deep and well-signaled. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
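The navigation-and-loading feedback loop praised under Workflow Clarity (wait → check URL → inject → wait for ready → check error) can be sketched as below. This is not the skill's code: the skill itself uses shell commands that this page does not reproduce, so every name here (`PageSession`, its four hooks, `load_report`, `expected_prefix`, `max_inject_attempts`) is a hypothetical stand-in chosen only to show the order of the checkpoints.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PageSession:
    """Hypothetical hooks standing in for the skill's real browser commands."""
    wait_for_navigation: Callable[[], None]
    current_url: Callable[[], str]
    inject_runtime: Callable[[], bool]           # True once injection is verified
    wait_for_ready: Callable[[], Optional[str]]  # error message, or None on success


def load_report(session: PageSession, expected_prefix: str,
                max_inject_attempts: int = 3) -> None:
    """Run the checkpoints in order and stop at the first one that fails."""
    # 1. Wait for navigation to settle.
    session.wait_for_navigation()

    # 2. Check the URL; landing elsewhere may indicate an authentication redirect.
    url = session.current_url()
    if not url.startswith(expected_prefix):
        raise RuntimeError(f"unexpected URL after navigation: {url}")

    # 3. Inject the runtime, retrying until it is verified or attempts run out.
    for _ in range(max_inject_attempts):
        if session.inject_runtime():
            break
    else:
        raise RuntimeError("runtime injection could not be verified")

    # 4. Wait for the page to report ready, then 5. check for a reported error.
    error = session.wait_for_ready()
    if error is not None:
        raise RuntimeError(f"page reported an error: {error}")
```

Running the steps strictly one after another also matches the skill's reported rule against issuing browser calls in parallel.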

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: antithesishq/antithesis-skills (Reviewed)
