Scan a repository to surface actionable findings about agent performance. Analyzes source code, git history, GitHub data, agent logs, and agent context, then synthesizes cross-referenced findings with targeted actions informed by Tessl product awareness. Supports incremental multi-developer contributions and produces a self-contained HTML report.
Examine agent conversation histories to find patterns where agents repeatedly struggle, users express frustration, or tasks require excessive iteration.
Scope: Focus on agent conversation logs and related metadata.
Read the shared reference files:
Resolving reference paths: The shared reference links above use relative paths (../../references/...) that work when this skill is read from its tile directory. If those paths do not resolve (e.g. when activated via a .claude/skills/ symlink), find the shared references at .tessl/tiles/*/agent-insight-experiment/references/ relative to the repository root. The log discovery guide (./references/log-discovery.md) is local to this skill and should resolve via the symlink.
Your report prefix is LOG (e.g., LOG-001, LOG-002).
Agent logs live in different locations depending on the tool. The goal is to find conversation transcripts for the current project/repository.
If the try-tessl/agent-quality plugin is installed, use its collection and normalization scripts — they handle multi-tool log discovery and produce a consistent format:

    # Check if audit-logs is installed
    AUDIT_SCRIPTS="$(find "$(pwd)/.tessl/tiles" "$HOME/.tessl/tiles" -path "*/audit-logs/skills/audit-logs/scripts/collect_logs.py" -print -quit 2>/dev/null)"
    if [ -n "$AUDIT_SCRIPTS" ]; then
      SCRIPTS_DIR="$(dirname "$AUDIT_SCRIPTS")"
      # Collect and normalize logs for this project
      uv run python3 "$SCRIPTS_DIR/collect_logs.py" --project-dir "$(pwd)"
      uv run python3 "$SCRIPTS_DIR/normalize_logs.py" --project-dir "$(pwd)"
    fi

If audit-logs isn't available, discover logs directly. See the log discovery guide for detailed paths and formats. The key locations to check:
- ~/.cursor/projects/*/agent-transcripts/*.jsonl
- ~/.claude/projects/*/conversations/ or ~/.claude/conversations/

Match project directories by looking for path fragments that match the current repo name or path.
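The path-fragment matching described above could be sketched as follows. This is a minimal illustration, not the skill's actual discovery logic: the function name `candidate_log_dirs` is hypothetical, and the assumption that a tool flattens the repo path with dashes may not hold for every harness.

```python
from pathlib import Path


def candidate_log_dirs(repo_path, roots):
    """Return directories under `roots` whose name contains a fragment of the repo path."""
    repo = Path(repo_path)
    # Fragments to look for: the bare repo name, and the full path flattened
    # with dashes (ASSUMPTION: tools differ in how they encode project paths).
    fragments = {repo.name.lower(), repo.as_posix().strip("/").replace("/", "-").lower()}
    fragments.discard("")  # an empty fragment would match every directory
    matches = []
    for root in roots:
        root = Path(root).expanduser()
        if not root.is_dir():
            continue  # this tool is not installed, or it has no logs yet
        for child in sorted(root.iterdir()):
            if child.is_dir() and any(f in child.name.lower() for f in fragments):
                matches.append(child)
    return matches
```

A call such as `candidate_log_dirs(Path.cwd(), ["~/.cursor/projects", "~/.claude/projects"])` would return the candidates to inspect by hand. Matching on the bare repo name can produce false positives when several checkouts share a name, so treat the result as a shortlist.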
Before diving into analysis, inventory what's available:

    # How many sessions exist for this project?
    # What date range do they cover?
    # Which tools were used (Cursor, Claude Code, other)?

Report these numbers in your scope metrics.
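One way to answer the three inventory questions is sketched below. It assumes one log file per session and uses file modification time as a stand-in for the session date; both are simplifications, and the helper names `tool_for` and `inventory` are hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path


def tool_for(path):
    """Guess the harness slug from the log path (ASSUMPTION: the path names the tool)."""
    parts = Path(path).parts
    if ".cursor" in parts:
        return "cursor"
    if ".claude" in parts:
        return "claude_code"
    return "other"  # placeholder slug for anything unrecognized


def inventory(session_files):
    """Summarize session count, date range, and tools for a list of log files."""
    files = [Path(f) for f in session_files]
    if not files:
        return {"sessions": 0, "first": None, "last": None, "tools": []}
    # Modification time approximates the session date; explicit timestamps
    # inside the log, where present, would be more accurate.
    stamps = [datetime.fromtimestamp(f.stat().st_mtime, tz=timezone.utc) for f in files]
    return {
        "sessions": len(files),
        "first": min(stamps).date().isoformat(),
        "last": max(stamps).date().isoformat(),
        "tools": sorted({tool_for(f) for f in files}),
    }
```

The `tools` list uses the same lowercase, underscore-separated slugs that scope.metrics.tools_represented expects, so it can be copied across directly.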
If there are many sessions (>30), prioritize:
For smaller sets, analyze all available sessions.
Scan conversations for these signals:
| Signal | What to grep for |
|---|---|
| Agent errors | Tool call failures, error messages, exceptions in output |
| Correction cycles | User messages containing "no", "wrong", "undo", "revert", "try again" |
| Repeated attempts | Same tool call or file edit appearing 3+ times |
| Task abandonment | Conversations ending without clear completion |
| Wrong approach | User redirecting: "instead", "don't use", "use X not Y" |
For each failure pattern, note: task attempted, what went wrong, files/modules involved, whether it was eventually resolved.
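Two of the signals in the table lend themselves to a mechanical first pass: correction phrases in user messages and the same tool call repeated 3+ times. The sketch below assumes a hypothetical normalized record shape (`role`, `text`, `tool`, `target`); real logs will need mapping into it first, and every hit still needs a human read for context.

```python
import re
from collections import Counter

# Phrases taken from the "Correction cycles" and "Wrong approach" rows.
CORRECTION = re.compile(
    r"\b(no|wrong|undo|revert|try again|instead|don't use)\b", re.IGNORECASE
)


def correction_messages(messages):
    """Return the text of user messages containing a correction or redirection phrase."""
    return [
        m["text"]
        for m in messages
        if m.get("role") == "user" and CORRECTION.search(m.get("text", ""))
    ]


def repeated_calls(messages, threshold=3):
    """Return (tool, target) pairs that occur at least `threshold` times."""
    counts = Counter(
        (m.get("tool"), m.get("target"))
        for m in messages
        if m.get("role") == "tool_call"  # HYPOTHETICAL role name for tool invocations
    )
    return {call: n for call, n in counts.items() if n >= threshold}
```

Word boundaries keep "no" from matching inside words such as "know" or "nothing", but a plain "no" can still be a harmless answer to a question, so the output is a list of candidates rather than confirmed failures.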
| Frustration level | Indicators |
|---|---|
| Mild correction | "No, use X not Y", "That's not right" |
| Escalating | Shorter messages, more directive language, "just do X" |
| Repeated instruction | User explaining the same thing 2+ times in one session |
| Abandonment | Session ends abruptly mid-task |
| Manual takeover | "I'll do it myself", user switching to manual edits |
These are the highest-impact insights — real pain points observable nowhere else.
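The "Repeated instruction" and "Manual takeover" rows can be approximated in code; the other rows (escalating tone, abrupt endings) generally need a human read. The following is a rough sketch only: the 0.8 similarity threshold and the phrase list are assumptions chosen for illustration, not values from the skill.

```python
from difflib import SequenceMatcher

# ASSUMPTION: a small, non-exhaustive phrase list for manual takeover.
TAKEOVER_PHRASES = ("i'll do it myself", "i will do it myself")


def normalize(text):
    """Lowercase, straighten curly apostrophes, and collapse whitespace."""
    return " ".join(text.lower().replace("\u2019", "'").split())


def has_manual_takeover(user_messages):
    """True if any user message contains a takeover phrase."""
    return any(p in normalize(m) for m in user_messages for p in TAKEOVER_PHRASES)


def repeated_instructions(user_messages, min_ratio=0.8):
    """Return index pairs of user messages that are near-duplicates of each other."""
    texts = [normalize(m) for m in user_messages]
    pairs = []
    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            if SequenceMatcher(None, texts[i], texts[j]).ratio() >= min_ratio:
                pairs.append((i, j))
    return pairs
```

Character-level similarity catches a user retyping the same instruction but misses paraphrases, so an empty result does not rule out repetition.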
Group findings by:
Also note what goes well — areas where the agent completes tasks smoothly. This helps calibrate: if the agent handles service A perfectly but always struggles with service B, that's a strong comparative signal.
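The comparative signal can be made concrete by computing, per area, the share of sessions that showed any failure signal. This sketch assumes each session has already been tagged with a hypothetical `area` label and a boolean `had_failure_signal`; how those are derived is left to the analysis above.

```python
from collections import defaultdict


def struggle_rate_by_area(sessions):
    """Map each area to the share of its sessions that showed a failure signal."""
    totals = defaultdict(int)
    struggled = defaultdict(int)
    for session in sessions:
        totals[session["area"]] += 1
        if session["had_failure_signal"]:
            struggled[session["area"]] += 1
    return {area: struggled[area] / totals[area] for area in totals}
```

A rate near 0 for one area alongside a rate near 1 for another is the kind of contrast worth reporting; with only a handful of sessions per area the rates are too noisy to compare.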
Agent logs are uniquely powerful for revealing:
Produce a JSON report conforming to the insight report schema.
If an output_file path is provided (e.g. by the orchestrator), save to that exact path. Do not compute your own filename — use the path as given.
If running standalone (no output_file provided), resolve the contributor filename (caller may pass username= to override):

    # Explicit input wins over the git/whoami fallback.
    USERNAME="${USERNAME_INPUT:-}"
    if [ -z "$USERNAME" ]; then
      USERNAME=$(git config user.name 2>/dev/null \
        | tr '[:upper:]' '[:lower:]' \
        | sed -E 's/[[:space:]]+/-/g; s/[^a-z0-9._-]//g; s/^[-.]+|[-.]+$//g')
    fi
    [ -z "$USERNAME" ] && USERNAME=$(whoami | tr '[:upper:]' '[:lower:]' | sed -E 's/[^a-z0-9._-]//g')
    [ -z "$USERNAME" ] && USERNAME="unknown-user"
    DATE=$(date +%Y%m%d)

Default standalone output path: .tessl-insights-poc/reports/agent-logs/${USERNAME}-${DATE}.json.
Set scope.metrics to include:
- sessions_parsed: total number of sessions analyzed (denominator for the sessions_with_frustration_signals hero stat)
- conversations_analyzed: total conversations (some sessions may have multiple)
- tool_calls_examined: approximate number of tool calls reviewed
- tools_represented: array of agent-harness slugs present in the analyzed sessions. Use lowercase, underscore-separated slugs so downstream tools can consume them directly: "claude_code", "cursor", "copilot", "gemini_cli", etc. Include every harness that contributed at least one session (e.g. ["claude_code", "cursor"] if both are mixed). This value is propagated through to the synthesized report as data_sources_used[].tools.
- sessions_with_frustration_signals: number of distinct parsed sessions exhibiting at least one frustration or correction signal. A session counts if it contains any of the following: a user message with clear frustration language (e.g. "wrong", "no", "stop guessing", "ugh", "that's not", "undo", "revert"), a manual takeover ("I'll do it myself"), the same instruction repeated 2+ times, or the user redirecting the agent more than once in the session. This is a session-level count, not a per-signal count. Use conservative judgement: borderline cases (a single polite "no, do X instead") do not count. The denominator is sessions_parsed; do not emit a ratio. Always emit this field, even if the count is 0.

Validation before saving:
- metadata fields
- scope.metrics.sessions_with_frustration_signals ≤ scope.metrics.sessions_parsed, and every session counted towards it is referenced by at least one RAF-category insight in this report

Mark data_source_exclusive: true for insights requiring actual behavioral observation (frustration signals, iteration patterns, tool misuse).
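The frustration-metric checks can be run mechanically before the report is written. The sketch below is illustrative only: the insight shape (`category` and `session_ids` keys) is hypothetical, because the insight report schema is not reproduced here, and the function name is invented for this example.

```python
def check_frustration_metrics(metrics, frustrated_session_ids, insights):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    parsed = metrics.get("sessions_parsed")
    frustrated = metrics.get("sessions_with_frustration_signals")
    if frustrated is None:
        problems.append("sessions_with_frustration_signals must always be emitted, even when 0")
    elif parsed is None or frustrated > parsed:
        problems.append("sessions_with_frustration_signals must be <= sessions_parsed")

    # HYPOTHETICAL insight shape: {"category": "RAF", "session_ids": [...]}.
    referenced = set()
    for insight in insights:
        if insight.get("category") == "RAF":
            referenced.update(insight.get("session_ids", []))
    for session_id in sorted(set(frustrated_session_ids) - referenced):
        problems.append(f"session {session_id} is counted but not referenced by any RAF insight")
    return problems
```

Returning a list of problems rather than raising on the first one lets the caller fix everything in a single pass before saving.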