
try-tessl/agent-quality

Analyze agent sessions against verifier checklists, detect friction points, and create structured verifiers from skills and docs. Produces per-session verdicts and aggregated quality reports.


evals/scenario-2/task.md

Automate the Session Analysis Workflow

Problem Description

A platform engineering team runs regular session analyses to understand how well their AI agents follow internal guidelines. They have the agent-quality tile installed in their project and want to make the analysis process repeatable. Today, running an analysis requires knowing the right sequence of commands, which scripts to call, and which flags to use, and team members apply that knowledge inconsistently.

The team lead wants two things: a reusable shell script that any team member can run to kick off a Phase 1 compliance check on recent sessions, and a separate script for when someone has a specific concern to investigate (e.g. "check how agents handle authentication changes"). The scripts should be robust and follow the recommended patterns for using the analysis pipeline.

Output Specification

Produce two shell scripts:

  1. analyze-recent.sh: Runs a standard Phase 1 analysis on the most recent sessions for the current project directory. It should handle the case where the currently running session is itself captured.

  2. analyze-search.sh: Takes a search query as its first argument ($1), searches for matching sessions, then analyzes a small number of them. It should demonstrate the recommended workflow for investigating a specific concern.

Both scripts should be executable and include brief comments explaining what each section does. Also produce a README.md explaining when to use each script.
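
As a rough illustration of the shape analyze-search.sh could take, the sketch below shows only the generic parts: strict shell options, validation of the query in $1, and a search-then-analyze sequence. The functions search_sessions and analyze_sessions are hypothetical placeholders, not commands from the agent-quality tile, and the limit of 3 sessions is an assumed value for "a small number". The real commands and flags should come from the tile's own documentation.

```shell
#!/bin/sh
# Sketch of the control flow for analyze-search.sh.
# ASSUMPTION: search_sessions and analyze_sessions are hypothetical stand-ins
# for the commands documented by the agent-quality tile. They only print text.
set -eu

# Hypothetical placeholder: would search sessions matching the query.
search_sessions() {
  printf 'would search sessions for: %s\n' "$1"
}

# Hypothetical placeholder: would analyze a limited number of matches.
analyze_sessions() {
  printf 'would analyze up to %s matching sessions\n' "$1"
}

main() {
  # Reject a missing or empty search query before doing any work.
  if [ "$#" -lt 1 ] || [ -z "$1" ]; then
    echo "usage: analyze-search.sh <search query>" >&2
    return 2
  fi
  query="$1"
  max_sessions=3   # assumed value for "a small number"

  # Search first, then analyze only a few of the matches.
  search_sessions "$query"
  analyze_sessions "$max_sessions"
}

# Demo invocation using the example concern from the problem description.
main "authentication changes"
```

In a real script the last line would be main "$@" so that the caller's query is passed through, and the placeholder functions would be replaced with the documented pipeline commands.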
