Analyze agent sessions against verifier checklists, detect friction points, and create structured verifiers from skills and docs. Produces per-session verdicts and aggregated quality reports.
A platform engineering team runs regular session analyses to understand how well their AI agents are following internal guidelines. They have the agent-quality tile installed in their project and want to make the analysis process repeatable. Today, running an analysis requires knowing the right sequence of commands, which scripts to call, and which flags to use — knowledge that's inconsistently applied across team members.
The team lead wants two things: a reusable shell script that any team member can run to kick off a Phase 1 compliance check on recent sessions, and a separate script for when someone has a specific concern to investigate (e.g. "check how agents handle authentication changes"). The scripts should be robust and follow the recommended patterns for using the analysis pipeline.
Produce two shell scripts:
analyze-recent.sh — Runs a standard Phase 1 analysis on the most recent sessions for the current project directory. Should handle the case where the running session itself might be captured.
analyze-search.sh — Takes a search query as an argument ($1) and first searches for matching sessions, then analyzes a small number of them. Should demonstrate the recommended workflow for investigating a specific concern.
Both scripts should be executable and include brief comments explaining what each section does. Also produce a README.md explaining when to use each script.
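The requested scripts might be skeletoned as follows. This is a minimal sketch only: `session-pipeline`, its subcommands, and flags such as `--skip-current` and `--from-search` are placeholder names standing in for whatever the analysis pipeline actually exposes, and the commands are echoed rather than executed so the structure can be reviewed before wiring in the real CLI.

```shell
#!/usr/bin/env bash
# Sketches of analyze-recent.sh and analyze-search.sh, combined here as
# two functions. "session-pipeline" is an assumed placeholder command;
# substitute the real pipeline invocation for your installation.
set -euo pipefail

analyze_recent() {
  # Phase 1 compliance check on the most recent sessions for the
  # current project directory. --skip-current is a hypothetical flag
  # representing "exclude the session that is running this script",
  # since an in-flight session would be captured incomplete.
  local limit="${1:-10}"
  echo "+ session-pipeline analyze --project $(pwd) --recent $limit --skip-current"
}

analyze_search() {
  # Search-first workflow for investigating a specific concern:
  # search for matching sessions, then analyze a small sample.
  if [ $# -lt 1 ]; then
    echo "usage: analyze-search.sh <search query>" >&2
    return 2
  fi
  local query="$1" limit="${2:-5}"
  echo "+ session-pipeline search \"$query\" --limit $limit"
  echo "+ session-pipeline analyze --from-search --limit $limit"
}

analyze_recent 10
analyze_search "authentication changes"
```

Splitting each script into a single function keeps the argument validation (`$# -lt 1` with a usage message) and the default limits (`${1:-10}`, `${2:-5}`) easy to test before the placeholder echoes are replaced with real pipeline calls.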