Scan a repository to surface actionable findings about agent performance. Analyzes source code, git history, GitHub data, agent logs, and agent context, then synthesizes cross-referenced findings with targeted actions informed by Tessl product awareness. Supports incremental multi-developer contributions and produces a self-contained HTML report.
Examine pull requests, review comments, CI/CD history, and issues to surface patterns in how agent-authored (and human-authored) changes are reviewed, tested, and integrated.
Scope: Focus on the gh CLI and GitHub API for PR, review, CI, and issue data.
Read the shared reference files:
Resolving reference paths: The links above use relative paths (../../references/...) that work when this skill is read from its tile directory. If those paths do not resolve (e.g. when activated via a .claude/skills/ symlink), find the shared references at .tessl/tiles/*/agent-insight-experiment/references/ relative to the repository root.
Your report prefix is GH (e.g., GH-001).
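As a small illustration of the prefix convention, the sketch below formats sequential finding IDs. It assumes the numeric part is zero-padded to three digits, which is inferred only from the GH-001 example; finding_id is a hypothetical helper name.

```shell
# Hypothetical helper: format a finding ID with the GH prefix.
# Assumes three-digit zero padding, inferred from the "GH-001" example.
finding_id() {
  printf 'GH-%03d\n' "$1"
}
```

For instance, finding_id 7 would print GH-007. Pass plain decimal integers (7, not 07).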
Run the data collection script to gather all GitHub data in a single pass:
bash "$(dirname "$0")/scripts/github-data-collector.sh" --root "$(pwd)"
Resolving script path: The path above assumes this skill is read from its tile directory. If run via a .claude/skills/ symlink, locate the script at .tessl/tiles/*/agent-insight-experiment/skills/analyze-github-data/scripts/github-data-collector.sh relative to the repo root. Pass --out <path> to write to a file.
The script requires gh (authenticated) and jq. It outputs JSON containing: merged PRs (100), open PRs (30), agent-authored PR identification, review comments from 50 PRs (bulk-collected), CI run data with failure summaries, issues (100), PR iteration depth (top 20 by comment count), and PR size statistics. Read the output and proceed directly to insight generation — skip the manual collection steps below.
Checkpoint: If the script reports an error about gh authentication, check the error message:
- sandboxed keychain / "cannot read the token from the macOS keychain": gh is authenticated on your machine but the agent's sandbox can't reach the keychain. See Troubleshooting: sandboxed gh.
- 401 / unauthorized / bad credentials: the token is rejected by GitHub. Run gh auth login -h github.com.
- gh not installed, not logged in, or a connectivity/API failure. Try gh auth login, then verify access to api.github.com.
Troubleshooting: sandboxed gh
Symptom: gh works fine in your normal terminal, but when the agent runs it you see errors like token invalid in keyring, failed to read keyring, or gh auth status reports the host as unauthenticated.
Cause: Cursor (and other agent shells) wrap Shell tool calls in the macOS Seatbelt sandbox. The sandbox denies access to the macOS keychain, which is where gh stores its OAuth token by default. The token is valid — gh just can't read it from inside the sandbox. Subagents inherit the same sandbox.
Fix (one-time, in your shell profile ~/.zshrc or ~/.bashrc):
export GH_TOKEN="$(gh auth token)"
Then restart Cursor / the agent so the env var propagates into the sandboxed shell. gh checks GH_TOKEN / GITHUB_TOKEN before hitting the keyring, so every sandboxed gh call just works.
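To confirm the export actually reached the agent's shell, a quick check like the one below may help. It is a minimal sketch that only inspects the two environment variables named above; gh_token_source is a hypothetical helper name, and it does not verify that the token itself is valid.

```shell
# Hypothetical helper: report which credential source gh would consult first
# in the current shell, based on the lookup order described above.
gh_token_source() {
  if [ -n "${GH_TOKEN:-}" ]; then
    echo "GH_TOKEN"
  elif [ -n "${GITHUB_TOKEN:-}" ]; then
    echo "GITHUB_TOKEN"
  else
    # No env token set: gh falls back to its stored credentials (the keyring).
    echo "keyring"
  fi
}
```

If this prints keyring inside the sandboxed shell, the export has not propagated yet and the keychain errors are likely to persist.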
Notes:
- After gh auth refresh, re-run the export so the new token is picked up.
- Alternatively, set export GH_TOKEN=ghp_... directly; that skips the keychain entirely.
- Environments where gh already uses env-based tokens need no change.
If the script is unavailable, verify access and collect data manually. Use a real API probe (not gh auth status, which also fails inside the sandbox):
gh api user --jq .login 2>/tmp/gh-preflight.err || {
if grep -qiE 'keyring|token.*invalid|secret storage' /tmp/gh-preflight.err; then
echo "Sandboxed keychain — see Troubleshooting: sandboxed gh section above."
elif grep -qiE '401|unauthori[sz]ed|bad credentials' /tmp/gh-preflight.err; then
echo "Token rejected — run: gh auth login -h github.com"
else
cat /tmp/gh-preflight.err
fi
exit 1
}
REPO=$(gh repo view --json nameWithOwner -q '.nameWithOwner')
echo "Analyzing: $REPO"
Checkpoint: If the API probe fails, stop and report the specific failure mode above. GitHub data analysis requires a working gh API call, not just an authenticated keychain.
gh pr list --state merged --limit 100 --json number,title,author,createdAt,mergedAt,additions,deletions,changedFiles,reviewDecision,labels
gh pr list --state open --limit 30 --json number,title,author,createdAt,additions,deletions,changedFiles,labels
Note: average size, time-to-merge, common authors, labeling conventions.
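The size and time-to-merge figures mentioned above could be computed along these lines. This is a minimal sketch, assuming the JSON array from the merged-PR command is piped in on stdin and that timestamps use the YYYY-MM-DDTHH:MM:SSZ form; pr_stats is a hypothetical helper name, not part of the collector script.

```shell
# Hypothetical helper: summarize a `gh pr list --json ...` array read from stdin.
# Needs the additions, deletions, createdAt and mergedAt fields.
pr_stats() {
  jq '{
    prs: length,
    avg_lines_changed: (
      if length == 0 then null
      else (map(.additions + .deletions) | add / length) end
    ),
    avg_hours_to_merge: (
      # Only merged PRs carry a mergedAt timestamp.
      map(select(.mergedAt != null)
          | ((.mergedAt | fromdateiso8601) - (.createdAt | fromdateiso8601)) / 3600)
      | if length == 0 then null else (add / length) end
    )
  }'
}
```

One possible usage is to pipe the merged-PR listing into it: gh pr list --state merged --limit 100 --json additions,deletions,createdAt,mergedAt | pr_stats. Averages are sensitive to outliers, so a median may be a better fit for repositories with a few very large PRs.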
# Filter by author patterns and labels
# any() keeps each PR once even when several of its labels match
gh pr list --state all --limit 200 --json number,title,author,labels | \
  jq '[.[] | select(
    ((.author.login // "") | test("bot|ai|copilot|agent|automated"; "i")) or
    any(.labels[]?.name; test("ai|agent|copilot|generated|automated"; "i"))
  )]'
# Also search PR bodies for tool mentions
gh search prs --repo "$REPO" --limit 50 "co-authored" OR "generated by" OR "cursor" OR "claude" OR "copilot"
If agent PRs are identifiable, compare against human PRs on: review comments per PR, CI pass rate, iterations before merge.
Extract review comments from recent PRs — this is the highest-signal data:
# Get review comments from last 50 merged PRs
for pr in $(gh pr list --state merged --limit 50 --json number -q '.[].number'); do
gh api "repos/$REPO/pulls/$pr/comments" \
--jq '.[] | {pr_number: .pull_request_url | split("/") | last, body: .body, path: .path, created: .created_at}' \
2>/dev/null
done
Checkpoint: If the API returns errors or empty results, try fewer PRs. If the repo has no review comments, note this as a finding and move on.
Categorize comments by theme:
High-frequency themes map directly to insights.
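A rough first pass at theme frequency can be done with keyword matching before reading comments in detail. The sketch below is only an illustration: the three themes and their keyword patterns are hypothetical placeholders, not the skill's own theme list, and count_themes is an assumed helper name. It expects a file with one review-comment body per line.

```shell
# Hypothetical helper: count comment lines matching each theme's keywords.
# $1: file containing one review-comment body per line.
# The theme|pattern table below is a placeholder; adapt it to the repository.
count_themes() {
  local file="$1" theme pattern
  while IFS='|' read -r theme pattern; do
    # grep -c prints the number of matching lines (0 when none match).
    printf '%s\t%s\n' "$(grep -ciE -- "$pattern" "$file")" "$theme"
  done <<'EOF' | sort -rn
naming|rename|naming|variable name
tests|add (a )?test|missing test|coverage
style|lint|format|nit
EOF
}
```

The output is one line per theme, highest count first. Keyword matching will misclassify some comments, so treat the counts as a pointer to where to read, not as a finding in themselves.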
gh run list --limit 100 --json databaseId,name,conclusion,headBranch,createdAt
# Count failures by workflow
gh run list --limit 100 --json name,conclusion | \
jq 'group_by(.name) | map({name: .[0].name, total: length, failed: [.[] | select(.conclusion == "failure")] | length}) | sort_by(-.failed)'
# Drill into frequent failures
FAILED_RUN=$(gh run list --limit 20 --status failure --json databaseId -q '.[0].databaseId')
if [ -n "$FAILED_RUN" ]; then
gh run view "$FAILED_RUN" --log-failed 2>/dev/null | tail -50
fi
Look for: flaky tests, environment-specific failures, failures concentrated in specific areas.
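One possible heuristic for spotting flaky workflows is to flag any workflow that both failed and succeeded on the same branch within the sampled runs. This is a sketch under that assumption, not a definitive flakiness test, since a real fix between runs produces the same pattern; flaky_candidates is a hypothetical helper name. It reads the JSON array from the gh run list command above on stdin.

```shell
# Hypothetical helper: list workflow/branch pairs with mixed outcomes.
# Reads a `gh run list --json name,conclusion,headBranch` array from stdin.
flaky_candidates() {
  jq '[ group_by([.name, .headBranch])[]
        | { name: .[0].name,
            branch: .[0].headBranch,
            conclusions: (map(.conclusion) | unique) }
        # Keep only groups that contain both a success and a failure.
        | select(any(.conclusions[]; . == "success")
                 and any(.conclusions[]; . == "failure")) ]'
}
```

A possible invocation is gh run list --limit 100 --json name,conclusion,headBranch | flaky_candidates. Candidates still need a look at the logs to separate genuine flakiness from failures that were fixed by a later commit.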
gh issue list --state all --limit 100 --json number,title,labels,createdAt,closedAt | \
jq '[.[] | {number, title, labels: [.labels[].name], created: .createdAt}]'
Look for issues about code quality, consistency, documentation gaps, or recurring bugs.
Find PRs that required many review rounds — these reveal hard-to-get-right areas:
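for pr in $(gh pr list --state merged --limit 100 --json number -q '.[].number'); do
  # per_page=100 raises GitHub's default page size of 30; the count still caps at one page
  count=$(gh api "repos/$REPO/pulls/$pr/comments?per_page=100" --jq 'length' 2>/dev/null)
  # Default to 0 so a failed or empty API call does not break the numeric test
  [ "${count:-0}" -gt 0 ] && echo "$count $pr"
done | sort -rn | head -20
For the top 5 high-iteration PRs, read the full comment threads to understand what caused the back-and-forth.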
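Loops like the one above issue one API call per PR, so it can be worth checking the remaining quota before starting. The sketch below queries GitHub's rate_limit endpoint through gh; have_rate_budget is a hypothetical helper name and the default threshold of 200 is an arbitrary assumption.

```shell
# Hypothetical helper: succeed only if at least $1 core REST API requests
# remain (default 200, an arbitrary threshold). Returns 2 if the check fails.
have_rate_budget() {
  local min="${1:-200}" remaining
  remaining=$(gh api rate_limit --jq '.resources.core.remaining' 2>/dev/null) || return 2
  [ "${remaining:-0}" -ge "$min" ]
}
```

A caller might run have_rate_budget 150 before the loop and lower the --limit value, or stop, when it returns non-zero.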
- Use --limit flags to bound all API calls; respect rate limits.
Save to the orchestrator-provided path, or to .tessl-insights-poc/reports/github-data.json when running standalone.
Validation before saving:
- metadata fields
Set scope.metrics: prs_reviewed, review_comments_read, ci_runs_examined, issues_checked.
For evidence, use PR numbers and direct quotes from review comments — these are high-signal because they represent real human feedback. Mark data_source_exclusive: true for insights from review comments or CI patterns not visible from code alone.
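Part of the pre-save validation can be automated. The following sketch checks only that the report is valid JSON and that the four scope.metrics keys named above are present; it assumes scope is a top-level key of the report, which the text does not confirm, and validate_report is a hypothetical helper name.

```shell
# Hypothetical helper: check that a report file is valid JSON and carries
# the four scope.metrics keys. Assumes "scope" is a top-level key.
# $1: path to the report JSON file.
validate_report() {
  jq -e '
    (.scope.metrics // {}) as $m
    | ["prs_reviewed", "review_comments_read", "ci_runs_examined", "issues_checked"]
    | all(.[]; . as $k | $m | has($k))
  ' "$1" >/dev/null
}
```

It exits non-zero when a key is missing or the file does not parse, so it can gate the save step, for example validate_report .tessl-insights-poc/reports/github-data.json || echo "report incomplete" >&2.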