Analyze agent sessions against verifier checklists, detect friction points, and create structured verifiers from skills and docs. Produces per-session verdicts and aggregated quality reports.
Understand how your agents are actually performing. Analyze session logs against structured verifiers, detect friction points, and create new verifiers from your skills and docs.
```shell
tessl install try-tessl/agent-quality
```

The tile collects agent session logs from Claude Code, Codex, Gemini, and Cursor, normalizes them, and dispatches LLM judges to evaluate each session against the verifier checklists you define. It also detects friction, such as moments where agents struggled, backtracked, or wasted time, and correlates those findings with verifier results.
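The collect/normalize/dispatch flow described above can be sketched in a few lines. This is a minimal illustration, not the tile's actual implementation: the `Session`, `normalize`, and `dispatch_judge` names are hypothetical, and the `judge` callable stands in for what would really be an LLM call.

```python
from dataclasses import dataclass

@dataclass
class Session:
    agent: str
    events: list

def normalize(raw: dict) -> Session:
    # Hypothetical normalizer: different agents log turns under different keys.
    events = raw.get("messages") or raw.get("turns") or []
    return Session(agent=raw.get("agent", "unknown"), events=events)

def dispatch_judge(session: Session, checklist: list, judge) -> dict:
    # judge(session, item) -> bool; in the real tile this is an LLM judge call.
    results = {item: judge(session, item) for item in checklist}
    return {"agent": session.agent, "passed": all(results.values()), "results": results}

raw_log = {"agent": "claude-code", "messages": [{"role": "user", "text": "fix the bug"}]}
checklist = ["Ran the test suite before finishing", "Did not edit unrelated files"]
verdict = dispatch_judge(normalize(raw_log), checklist, judge=lambda s, item: True)
print(verdict["passed"])  # True
```

The stubbed judge always passes; the point is the shape of the pipeline, where each checklist item yields one pass/fail result and the session verdict is the conjunction of all items.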
Verifiers are structured pass/fail checklists that encode what "good" looks like for your agents. You can extract them from existing skills, CLAUDE.md/AGENTS.md rules, or write them from scratch.
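To make the "structured pass/fail checklist" idea concrete, here is one plausible shape for a verifier and a simple scoring function. The field names (`id`, `source`, `criteria`) and the `score` helper are illustrative assumptions, not the tile's documented schema.

```python
# Hypothetical verifier: an id, the instruction source it was extracted from,
# and a list of pass/fail criteria for a judge to score.
verifier = {
    "id": "code-review-basics",
    "source": "CLAUDE.md",
    "criteria": [
        {"id": "tests-run", "text": "Agent ran the test suite before declaring done"},
        {"id": "no-scope-creep", "text": "Agent did not modify files outside the task"},
    ],
}

def score(judgments: dict, verifier: dict) -> float:
    # Fraction of criteria the judge marked as passed.
    total = len(verifier["criteria"])
    passed = sum(1 for c in verifier["criteria"] if judgments.get(c["id"], False))
    return passed / total

print(score({"tests-run": True, "no-scope-creep": False}, verifier))  # 0.5
```

Keeping each criterion atomic and binary is what lets per-session verdicts aggregate cleanly into percentage scores across many sessions.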
| Skill | Description |
|---|---|
| `analyze-sessions` | Run the analysis pipeline: collect logs, discover verifiers, dispatch judges, and produce per-session verdicts with aggregated reports |
| `create-verifiers` | Create structured verifiers from skills, docs, rules, or any instruction source; produces checklist-based criteria that judges score against |
| `review-friction` | Detect friction points in sessions (errors, backtracking, repeated failures) and classify them by type and impact |
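The friction detection that `review-friction` performs can be approximated with simple heuristics over normalized session events. This is a sketch under assumed event fields (`error`, `kind`, `file`) and assumed type/impact labels, not the skill's real classifier.

```python
from collections import Counter

def detect_friction(events: list) -> list:
    # Hypothetical heuristics: repeated identical errors and churn on one file.
    findings = []
    errors = Counter(e["error"] for e in events if e.get("error"))
    for msg, count in errors.items():
        if count >= 2:
            findings.append({"type": "repeated-failure", "impact": "high", "detail": msg})
    edits = Counter(e["file"] for e in events if e.get("kind") == "edit")
    for path, count in edits.items():
        if count >= 3:
            findings.append({"type": "backtracking", "impact": "medium", "detail": path})
    return findings

events = [
    {"error": "ModuleNotFoundError: requests"},
    {"error": "ModuleNotFoundError: requests"},
    {"kind": "edit", "file": "app.py"},
]
print(detect_friction(events))  # one repeated-failure finding
```

Because friction findings and verifier verdicts share the same normalized session, they can be joined per session, which is what makes the correlation step possible.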
Requirements:

- `python3` 3.9+ (standard library only)
- Claude Code CLI (`claude -p`)