Triage Antithesis test reports to understand what happened in a run: look up runs, check status, investigate failed properties (assertions), view metadata, download logs, inspect findings, and examine environmental details. Load after a run completes or when investigating a failure.
Use this skill to analyze Antithesis test runs.
Reference files: This skill's references/ directory contains detailed guides for specific tasks. Do NOT read them all up front — only read a reference file when you are told to. Each reference file is mentioned by name at the point where it is needed.
Check the prerequisites first. Tell the user if any of the following applies:
snouty is not installed. See https://raw.githubusercontent.com/antithesishq/snouty/refs/heads/main/README.md for installation options.
snouty is not at least version 0.5.0. Use snouty --version to find the version.
jq is not installed. See https://jqlang.org/download/ for installation options.
snouty runs may be hidden from snouty --help but is still supported. If you need to confirm availability or see subcommand syntax, snouty runs --help still works.
This version of the triage skill talks to Antithesis through the snouty API (snouty runs ...). Several things can prevent that from working. Walk down the checklist below in order; it runs from the most likely cause (auth not exported) to the least likely (tenant missing API access). Stop at the first failing check and surface only that cause to the user; do not list the others. Do not attempt real triage work until every check passes.
You can skip preflight if you already know (from context or memory) that the user's setup is working.
Is $ANTITHESIS_API_KEY set? This is the most common cause of failure. If unset, tell the user to export it (e.g., export ANTITHESIS_API_KEY=<their key>) and stop. Do not speculate about other causes.
Does snouty doctor pass? Run snouty doctor. If it reports any failure, relay that failure to the user and stop. Today this mostly re-confirms environment variables, but it is the canonical place where key validity, network reachability, and API-access checks will be surfaced as they are added — so always run it here.
Can the API actually be reached? Probe with snouty runs list -n 1. If this errors, the most likely remaining cause is that the tenant is not yet enrolled in the snouty API rollout. Tell the user to either:
agent-browser-triage branch:
https://github.com/antithesishq/antithesis-skills/tree/agent-browser-triage
This fallback is temporary and will be removed once the rollout completes.
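The prerequisite checks and the preflight checklist above could be scripted roughly as follows. This is a minimal sketch, not part of the skill itself. The function names and messages are hypothetical, and the version parsing assumes that the output of snouty --version contains a dotted version such as 0.5.0. Only snouty --version, snouty doctor, and snouty runs list -n 1 are taken from the text above.

```shell
# Returns 0 when dotted version $1 is >= $2 (numeric major.minor.patch compare).
version_at_least() {
  awk -v have="$1" -v want="$2" 'BEGIN {
    split(have, h, "."); split(want, w, ".")
    for (i = 1; i <= 3; i++) {
      if (h[i] + 0 > w[i] + 0) exit 0
      if (h[i] + 0 < w[i] + 0) exit 1
    }
    exit 0
  }'
}

# Prerequisites: print the first problem found and return 1.
check_prerequisites() {
  if ! command -v snouty >/dev/null 2>&1; then
    echo "snouty is not installed"; return 1
  fi
  # ASSUMPTION: the version output contains something like 0.5.0.
  have=$(snouty --version 2>/dev/null | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
  if ! version_at_least "${have:-0.0.0}" 0.5.0; then
    echo "snouty ${have:-unknown} is older than 0.5.0"; return 1
  fi
  if ! command -v jq >/dev/null 2>&1; then
    echo "jq is not installed"; return 1
  fi
  return 0
}

# API preflight: most likely cause first, stop at the first failing check.
api_preflight() {
  if [ -z "${ANTITHESIS_API_KEY:-}" ]; then
    echo "ANTITHESIS_API_KEY is not set; export it and retry"; return 1
  fi
  if ! snouty doctor; then
    echo "snouty doctor reported a failure (see its output above)"; return 1
  fi
  if ! snouty runs list -n 1 >/dev/null; then
    echo "snouty runs list failed; the tenant may not be enrolled in the snouty API rollout"; return 1
  fi
  echo "preflight ok"
}
```

Running `check_prerequisites && api_preflight` prints only the first problem it finds, which matches the instruction to surface a single cause to the user.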
Before starting, collect the following from the user:
Tenant Name (required) — You must know the tenant name. Check the $ANTITHESIS_TENANT environment variable. Ask the user if you do not have evidence for the tenant name.
What they want to know — Are they interested in all failures in a specific run? Are they investigating a specific failure? Are they getting a general overview? Comparing runs? This determines which workflow to follow.
Your main method to obtain information is to use the snouty runs <OPTION> command with the --json option. The --json option returns line-delimited JSON. The fields in the JSON depend on the option you are using. The same command without --json returns fewer fields in a tabular form better suited for human consumption.
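Because --json output is line-delimited, each line can be handled as its own JSON value. The helpers below are a small sketch of two generic ways to explore such a stream with jq. The function names are hypothetical, and nothing here depends on specific field names, since those vary by subcommand.

```shell
# Count the records in a line-delimited JSON stream read from stdin.
# jq -s (slurp) collects every input value into one array before counting.
count_records() {
  jq -s 'length'
}

# Print the field names of the first record, one per line, in their original order.
# Useful for learning which fields a given subcommand returns.
first_record_keys() {
  head -n 1 | jq -r 'keys_unsorted[]'
}
```

For example, following the command pattern used in this document, `snouty runs --json list -n 5 | first_record_keys` would show the fields available on a run record.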
You will need to know the RUN_ID. Read references/run-discovery.md to learn how to obtain the run_id.
Read references/run-discovery.md to get a list of recent runs. Then summarize them in a report.
To look up a specific run (report), read references/run-info.md. Then continue with other workflows as needed.
If the run has a status of "incomplete", refer to the Diagnose incomplete run section below.
Read references/run-info.md to load information on a run.
If links.triage_report is null/absent in the run-info output, no triage report was generated for this run (typical for cancelled runs and some unknown/starting states). Report that the run is not triageable: the properties and logs endpoints will return 404 for these runs.
Read references/properties.md to load properties.
Read references/properties.md - use snouty runs --json properties to extract properties with their examples and learn how to download logs
Read references/logs.md to learn how to understand logs
For each property to investigate:
a. Pick the first failing example
b. Find the moment. If the counterexample has no moment field (telemetry / meta properties — see references/properties.md), report the counterexample value as the evidence and skip steps c–e.
c. Download the example's log using snouty runs --json logs $RUN_ID $INPUT_HASH $VTIME. Make sure vtime does not get rounded. input_hash and vtime should match exactly what is contained in the example's moment structure.
d. Analyze the downloaded log locally
e. If you aren't certain what caused the issue, consider downloading logs from other counterexamples and examples for the same property. Compare each occurrence and try to see if there are any similarities or differences that might explain the failure cause. Logs from passing examples can be useful to compare against to find differences between success and failure cases.
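Step c can be wrapped in a small helper so the moment values reach snouty unchanged. This is a sketch under stated assumptions: the function name and the output file naming are hypothetical, and the caller is expected to paste input_hash and vtime exactly as they appear in the example's moment structure.

```shell
# Download the log for one example moment and save it to a file.
# All values are passed as quoted strings so the shell never reformats them.
# Do not route vtime through shell arithmetic or printf %f, which could round it.
download_example_log() {
  run_id=$1
  input_hash=$2
  vtime=$3
  # Hypothetical naming scheme; pass a fourth argument to override it.
  out_file=${4:-"log-${input_hash}-${vtime}.jsonl"}
  snouty runs --json logs "$run_id" "$input_hash" "$vtime" > "$out_file" || return 1
  echo "$out_file"
}
```

The function prints the path of the saved file, so a later step can analyze it locally, for example `log_file=$(download_example_log "$RUN_ID" "$INPUT_HASH" "$VTIME")`.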
When searching for additional logs for property failures, first use:
snouty runs --json events ${RUN_ID} ${PROPERTY_NAME}
This returns SOME but not necessarily ALL cases of the property passing or failing in the run. PROPERTY_NAME should match the "name" field in the property data you are investigating. Match the "hit" and "condition" fields against the examples or counterexamples you are trying to find more of. Note that the examples and counterexamples you already know about will likely be in the list returned. Check the moment of the property returned from snouty runs events against the moment in the examples or counterexamples you have already downloaded.
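Matching on "hit" and "condition" can be done with a jq filter. The sketch below assumes each event is one JSON object per line with top-level "hit" and "condition" fields. That layout is an assumption rather than something this document confirms, and the function name is hypothetical. The two arguments are JSON literals, so booleans are passed as true or false.

```shell
# Keep only the events whose "hit" and "condition" equal the given JSON literals.
# ASSUMPTION: "hit" and "condition" are top-level fields of each event object.
filter_events() {
  jq -c --argjson hit "$1" --argjson condition "$2" \
    'select(.hit == $hit and .condition == $condition)'
}
```

A possible use is `snouty runs --json events "$RUN_ID" "$PROPERTY_NAME" | filter_events true false`, after which the moment of each remaining event can be compared with the moments already downloaded.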
Important: Cross-reference the log against the source code of the system under test (SUT) and the workload if you have access to it.
Deeply investigate the failure to develop an understanding of the timeline of events which led up to and potentially caused it.
Report your findings.
Important: The property status and assertion text alone are not sufficient — the logs provide the actual runtime context needed to understand the failure.
If the "status" of a specific run is "incomplete", there may be an error log to examine. The steps to triage an "incomplete" status run:
Run snouty runs --json show ${RUN_ID}
Look for a failure_moment structure in the returned JSON. If present, use the input_hash and vtime from failure_moment to download a log in accordance with the instructions in references/logs.md.
Run snouty runs --json build-logs ${RUN_ID} and look for errors in the build.
Note on links.triage_report: For incomplete runs, the per-property properties and logs endpoints typically return 404, but links.triage_report and failure_moment may still be populated in show. Report what show actually contains; do not claim the triage_report link is absent unless that field is null. The triage workflow for incomplete runs is failure_moment + build-logs, regardless of whether a report URL exists.
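The incomplete-run steps above can be sketched as a single function. The function name and output file names are hypothetical. The sketch also assumes that show emits one JSON object with failure_moment at the top level, which this document does not spell out.

```shell
# Triage sketch for a run whose status is "incomplete".
# ASSUMPTION: `show` returns a single JSON object with a top-level failure_moment.
triage_incomplete() {
  run_id=$1
  show_json=$(snouty runs --json show "$run_id") || return 1

  # `// empty` produces no output when the field is missing or null.
  # Caveat: if vtime is a long JSON number, older jq releases may round it;
  # in that case copy the value from the raw JSON instead.
  input_hash=$(printf '%s' "$show_json" | jq -r '.failure_moment.input_hash // empty')
  vtime=$(printf '%s' "$show_json" | jq -r '.failure_moment.vtime // empty')

  if [ -n "$input_hash" ] && [ -n "$vtime" ]; then
    snouty runs --json logs "$run_id" "$input_hash" "$vtime" > failure-log.jsonl
    echo "failure log saved to failure-log.jsonl"
  else
    echo "no failure_moment in show output"
  fi

  # Build logs are fetched either way, since the build may hold the error.
  snouty runs --json build-logs "$run_id" > build-log.jsonl
  echo "build log saved to build-log.jsonl"
}
```

Both files can then be read locally; the failure log follows the guidance in references/logs.md.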
End your triage summary with a short "next steps" section that names follow-up skills or additional triage workflows when they would help. Treat these as suggestions for the user, not actions to take automatically — let the user (or their orchestration) decide when to invoke them. If the user has explicitly asked to chain skills (e.g. "run triage and then debug any failing property"), follow their instructions and proceed with the chain.
Skills worth suggesting based on what triage uncovers:
antithesis-research: map the codebase to surface better testable properties. Useful when failures point to systemic gaps in coverage rather than a discrete bug.
antithesis-workload: extend or refine assertions, test commands, and logs. Should be suggested whenever all properties are passing to incrementally improve the workload, or when properties are not reachable due to the workload being underpowered. This skill can also be used to add more logging or fix properties to assist with root cause analysis.
antithesis-debug: open the multiverse debugger on a specific failure to inspect container state at the failing moment or explore alternate histories. Useful when log analysis alone hasn't explained the failure. Requires agent-browser and may require interactive login in a browser.
antithesis-query-logs: search across all timelines for ordering and causality questions ("did event A always precede failure B?", "do failures still occur without the preceding fault?"). Useful when a failure looks correlated with another event or fault, or when you've only inspected one history out of many. Requires agent-browser and may require interactive login in a browser.
Include the data the next skill will need (run_id, property name, moment, etc.) so the user can invoke it without re-discovering context.
Before declaring this skill complete, review your work against the criteria below. This skill's output is conversational (summaries, tables, analysis), so the review should happen in your current context. Re-read the guidance in this file, then systematically check each item below against the answers and analysis you produced.
Review criteria: