tessleng/agent-insight-experiment

Scan a repository to surface actionable findings about agent performance. Analyzes source code, git history, GitHub data, agent logs, and agent context, then synthesizes cross-referenced findings with targeted actions informed by Tessl product awareness. Supports incremental multi-developer contributions and produces a self-contained HTML report.

name: analyze-github-data
description: Analyze a repository's GitHub data (pull requests, review comments, CI/CD history, and issues) to identify patterns that affect coding agent performance. Surfaces recurring review feedback, CI failure patterns, agent-authored PR characteristics, and collaboration friction. Use when running an insight scan's GitHub data analysis phase, auditing PR quality for agent contributions, investigating CI failures related to agent changes, or understanding what code review history reveals about agent performance in a repository.

Analyze GitHub Data for Agent Performance Insights

Name: tessleng/agent-insight-experiment
Rating: 70.98 (1 reviews)
Author: tessleng

Examine pull requests, review comments, CI/CD history, and issues to surface patterns in how agent-authored (and human-authored) changes are reviewed, tested, and integrated.

Scope: Focus on the gh CLI and GitHub API for PR, review, CI, and issue data.

Before You Start

Read the shared reference files:

APEX taxonomy — insight categories
Insight report schema — report structure

Resolving reference paths: The links above use relative paths (../../references/...) that work when this skill is read from its tile directory. If those paths do not resolve (e.g. when activated via a .claude/skills/ symlink), find the shared references at .tessl/tiles/*/agent-insight-experiment/references/ relative to the repository root.

Your report prefix is GH (e.g., GH-001).
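
Report IDs can be generated and checked mechanically. The following is a minimal sketch, assuming the three-digit zero-padded form of the GH-001 example is the required format; the function names are hypothetical, and the insight report schema remains the authority on ID rules.

```shell
# Format a sequence number as a report ID, e.g. 1 -> GH-001.
# Assumption: IDs are zero-padded to three digits, as in the GH-001 example.
# Pass plain decimal numbers (8, not 08), since printf reads a leading 0 as octal.
gh_insight_id() {
  printf 'GH-%03d\n' "$1"
}

# Succeed only if the argument looks like a GH-prefixed report ID.
is_gh_insight_id() {
  printf '%s\n' "$1" | grep -Eq '^GH-[0-9]{3}$'
}
```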

Quick Start (Recommended)

Run the data collection script to gather all GitHub data in a single pass:

bash "$(dirname "$0")/scripts/github-data-collector.sh" --root "$(pwd)"

Resolving script path: The path above assumes this skill is read from its tile directory. If run via a .claude/skills/ symlink, locate the script at .tessl/tiles/*/agent-insight-experiment/skills/analyze-github-data/scripts/github-data-collector.sh relative to the repo root. Pass --out <path> to write to a file.
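
The two lookup locations described above can be tried in order. This is a sketch only: find_collector is a hypothetical helper, and the /tmp output path in the usage comment is an arbitrary example.

```shell
# Print the first existing collector script path, or fail with a message.
# $1 = repository root
# $2 = skill directory (optional; used when the skill is read from its tile directory)
find_collector() {
  repo_root=$1
  skill_dir=${2:-}
  rel="scripts/github-data-collector.sh"

  # Preferred: the script next to the skill itself.
  if [ -n "$skill_dir" ] && [ -f "$skill_dir/$rel" ]; then
    printf '%s\n' "$skill_dir/$rel"
    return 0
  fi

  # Fallback: glob under .tessl/tiles/ relative to the repository root.
  for candidate in "$repo_root"/.tessl/tiles/*/agent-insight-experiment/skills/analyze-github-data/"$rel"; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done

  echo "github-data-collector.sh not found" >&2
  return 1
}

# Usage (writes the JSON to a file via --out; the output path is an example):
#   script=$(find_collector "$(pwd)") && \
#     bash "$script" --root "$(pwd)" --out /tmp/github-data.json
```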

The script requires gh (authenticated) and jq. It outputs JSON containing: merged PRs (100), open PRs (30), agent-authored PR identification, review comments from 50 PRs (bulk-collected), CI run data with failure summaries, issues (100), PR iteration depth (top 20 by comment count), and PR size statistics. Read the output and proceed directly to insight generation — skip the manual collection steps below.

Checkpoint: If the script reports an error about gh authentication, check the error message:

sandboxed keychain / "cannot read the token from the macOS keychain" — gh is authenticated on your machine but the agent's sandbox can't reach the keychain. See Troubleshooting: sandboxed gh.
401 / unauthorized / bad credentials — the token is rejected by GitHub. Run gh auth login -h github.com.
Any other error — this can be gh not installed, not logged in, or a connectivity/API failure. Try gh auth login, then verify access to api.github.com.

Troubleshooting: sandboxed gh

Symptom: gh works fine in your normal terminal, but when the agent runs it you see errors like token invalid in keyring, failed to read keyring, or gh auth status reports the host as unauthenticated.

Cause: Cursor (and other agent shells) wrap Shell tool calls in the macOS Seatbelt sandbox. The sandbox denies access to the macOS keychain, which is where gh stores its OAuth token by default. The token is valid — gh just can't read it from inside the sandbox. Subagents inherit the same sandbox.

Fix (one-time, in your shell profile ~/.zshrc or ~/.bashrc):

export GH_TOKEN="$(gh auth token)"

Then restart Cursor / the agent so the env var propagates into the sandboxed shell. gh checks GH_TOKEN / GITHUB_TOKEN before hitting the keyring, so sandboxed gh calls authenticate without touching the keychain.

Notes:

If you rotate the token with gh auth refresh, re-run the export so the new token is picked up.
Alternatively, generate a fine-grained PAT on GitHub and export GH_TOKEN=github_pat_... directly; that skips the keychain entirely. (Classic PATs use the ghp_ prefix instead.)
This step is only needed once per machine; it is not needed on Linux or in CI where gh already uses env-based tokens.
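
To confirm from inside the sandboxed shell that the export took effect, without echoing the token, something like the following can help. It is a sketch: gh_token_source is a hypothetical helper that mirrors gh's preference for GH_TOKEN over GITHUB_TOKEN and checks only whether the variables are set, not whether GitHub accepts the token.

```shell
# Report which environment variable gh would take its token from,
# without printing the token itself. GH_TOKEN takes precedence over GITHUB_TOKEN.
gh_token_source() {
  if [ -n "${GH_TOKEN:-}" ]; then
    echo "GH_TOKEN"
  elif [ -n "${GITHUB_TOKEN:-}" ]; then
    echo "GITHUB_TOKEN"
  else
    # Neither variable is set: gh falls back to its stored credentials (the keyring).
    echo "none"
    return 1
  fi
}
```

If this prints none inside the agent shell, the export has not propagated and the agent likely needs a restart.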

Manual Collection (Fallback)

If the script is unavailable, verify access and collect data manually. Use a real API probe (not gh auth status, which also fails inside the sandbox):

gh api user --jq .login 2>/tmp/gh-preflight.err || {
  if grep -qiE 'keyring|token.*invalid|secret storage' /tmp/gh-preflight.err; then
    echo "Sandboxed keychain — see Troubleshooting: sandboxed gh section above."
  elif grep -qiE '401|unauthori[sz]ed|bad credentials' /tmp/gh-preflight.err; then
    echo "Token rejected — run: gh auth login -h github.com"
  else
    cat /tmp/gh-preflight.err
  fi
  exit 1
}
REPO=$(gh repo view --json nameWithOwner -q '.nameWithOwner')
echo "Analyzing: $REPO"

Checkpoint: If the API probe fails, stop and report the specific failure mode above — GitHub data analysis requires a working gh API call, not just an authenticated keychain.

Step 1: PR Overview

gh pr list --state merged --limit 100 --json number,title,author,createdAt,mergedAt,additions,deletions,changedFiles,reviewDecision,labels
gh pr list --state open --limit 30 --json number,title,author,createdAt,additions,deletions,changedFiles,labels

Note the average PR size, time-to-merge, most frequent authors, and labeling conventions.
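
Average size and time-to-merge can be computed from the same data. This is a rough sketch: summarize_prs is a hypothetical helper, and it assumes each input line is tab-separated as PR number, seconds from creation to merge, and lines changed (additions plus deletions).

```shell
# Summarize PR stats from TSV on stdin: number, seconds-to-merge, lines changed.
# Prints the PR count, mean hours to merge, and mean lines changed.
summarize_prs() {
  awk -F '\t' '
    { n++; secs += $2; lines += $3 }
    END {
      if (n == 0) { print "prs=0"; exit }
      printf "prs=%d mean_hours_to_merge=%.1f mean_lines_changed=%.1f\n", n, secs / n / 3600, lines / n
    }'
}

# Usage with the merged-PR query (gh evaluates the --jq expression itself):
#   gh pr list --state merged --limit 100 \
#     --json number,createdAt,mergedAt,additions,deletions \
#     --jq '.[] | [.number, ((.mergedAt | fromdateiso8601) - (.createdAt | fromdateiso8601)), (.additions + .deletions)] | @tsv' \
#     | summarize_prs
```

Means are sensitive to a few very large or long-lived PRs, so check the outliers before drawing conclusions.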

Step 2: Identify Agent-Authored PRs

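# Filter by author patterns and labels. These are heuristics: substrings such as
# "ai" or "bot" also match some human logins, so review the matches manually.
# any(...) keeps a PR with several matching labels from being listed more than once.
gh pr list --state all --limit 200 --json number,title,author,labels | \
  jq '[.[] | select(
    (.author.login | test("bot|ai|copilot|agent|automated"; "i")) or
    any(.labels[]?; .name | test("ai|agent|copilot|generated|automated"; "i"))
  )]'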

# Also search PR bodies for tool mentions
gh search prs --repo "$REPO" --limit 50 "co-authored" OR "generated by" OR "cursor" OR "claude" OR "copilot"

If agent PRs are identifiable, compare against human PRs on: review comments per PR, CI pass rate, iterations before merge.
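
One way to make the comparison concrete is to tag each PR with a group and average a metric per group. This is a sketch under stated assumptions: compare_groups is a hypothetical helper, and the input is tab-separated as group label (for example agent or human) and review comment count for one PR.

```shell
# Compare groups from TSV on stdin: group label, review comment count.
# Prints one line per group: label, number of PRs, mean comments per PR.
compare_groups() {
  awk -F '\t' '
    { count[$1]++; total[$1] += $2 }
    END {
      for (g in count) printf "%s\t%d\t%.2f\n", g, count[g], total[g] / count[g]
    }' | sort
}
```

The same shape works for CI pass rate or iterations before merge by swapping the second column. With only a handful of agent PRs the averages are anecdotal, so report the PR counts alongside them.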

Step 3: Review Comment Analysis

Extract review comments from recent PRs — this is the highest-signal data:

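# Get review comments from the last 50 merged PRs
# (per_page=100 raises the default page size of 30; longer threads need --paginate)
for pr in $(gh pr list --state merged --limit 50 --json number -q '.[].number'); do
  gh api "repos/$REPO/pulls/$pr/comments?per_page=100" \
    --jq '.[] | {pr_number: (.pull_request_url | split("/") | last), body: .body, path: .path, created: .created_at}' \
    2>/dev/null || echo "PR #$pr: could not fetch review comments" >&2
done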

Checkpoint: If the API returns errors or empty results, try fewer PRs. If the repo has no review comments, note this as a finding and move on.

Categorize comments by theme:

Pattern violations ("use X instead of Y", "we don't do it this way")
Missing co-changes ("this also needs to update X")
Naming/style ("rename this to match our convention")
Logic errors (bugs, edge cases)
Security/performance concerns

High-frequency themes map directly to insights.
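
A first-pass keyword classifier can pre-sort comments before a manual read. This is only a sketch: the function names, theme labels, and keyword lists are illustrative assumptions rather than a fixed taxonomy, and it treats each input line as one comment, which simplifies multi-line review comments.

```shell
# Map one review comment to a theme using keyword matching.
# Assumption: the keyword lists are illustrative starting points only.
classify_comment() {
  text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case $text in
    *"instead of"*|*"we don't"*|*"we do not"*) echo "pattern-violation" ;;
    *"also needs"*|*"also update"*|*"forgot to update"*) echo "missing-co-change" ;;
    *rename*|*naming*|*convention*) echo "naming-style" ;;
    *security*|*injection*|*performance*|*"n+1"*) echo "security-performance" ;;
    *bug*|*"edge case"*|*"off by one"*|*null*) echo "logic-error" ;;
    *) echo "other" ;;
  esac
}

# Tally themes for comments read one per line from stdin, most frequent first.
tally_themes() {
  while IFS= read -r line; do
    classify_comment "$line"
  done | sort | uniq -c | sort -rn
}
```

The first matching pattern wins, so the order of the case arms matters. Comments that land in other still need reading, since keyword matching misses anything phrased unusually.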

Step 4: CI Failure Analysis

gh run list --limit 100 --json databaseId,name,conclusion,headBranch,createdAt

# Count failures by workflow
gh run list --limit 100 --json name,conclusion | \
  jq 'group_by(.name) | map({name: .[0].name, total: length, failed: [.[] | select(.conclusion == "failure")] | length}) | sort_by(-.failed)'

# Drill into the most recent failed run ("// empty" leaves FAILED_RUN unset when there is none)
FAILED_RUN=$(gh run list --limit 20 --status failure --json databaseId -q '.[0].databaseId // empty')
if [ -n "$FAILED_RUN" ]; then
  gh run view "$FAILED_RUN" --log-failed 2>/dev/null | tail -50
fi

Look for: flaky tests, environment-specific failures, failures concentrated in specific areas.
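
A simple heuristic for spotting flaky candidates is to look for a workflow that both passed and failed on the same branch. This is a sketch: flaky_candidates is a hypothetical helper, and the input is assumed to be tab-separated as workflow name, head branch, conclusion.

```shell
# Flag workflow/branch pairs that have both passing and failing runs.
# Input: TSV on stdin with workflow name, head branch, conclusion.
# Output: workflow, branch, success count, failure count.
# Assumption: mixed outcomes on one branch hint at flakiness, but they can also
# reflect a genuine break and fix, so treat the output as candidates to inspect.
flaky_candidates() {
  awk -F '\t' '
    $3 == "success" { ok[$1 FS $2]++ }
    $3 == "failure" { bad[$1 FS $2]++ }
    END {
      for (k in bad) if (k in ok) printf "%s\t%d\t%d\n", k, ok[k], bad[k]
    }' | sort
}

# Usage:
#   gh run list --limit 100 --json name,headBranch,conclusion \
#     --jq '.[] | [.name, .headBranch, .conclusion] | @tsv' | flaky_candidates
```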

Step 5: Issue Analysis

gh issue list --state all --limit 100 --json number,title,labels,createdAt,closedAt | \
  jq '[.[] | {number, title, labels: [.labels[].name], created: .createdAt, closed: .closedAt}]'

Look for issues about code quality, consistency, documentation gaps, or recurring bugs.

Step 6: PR Iteration Depth

Find PRs that required many review rounds — these reveal hard-to-get-right areas:

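for pr in $(gh pr list --state merged --limit 100 --json number -q '.[].number'); do
  # per_page=100 avoids capping every count at the default page size of 30
  count=$(gh api "repos/$REPO/pulls/$pr/comments?per_page=100" --jq 'length' 2>/dev/null)
  # Default to 0 so a failed API call does not break the numeric test
  [ "${count:-0}" -gt 0 ] && echo "$count $pr"
done | sort -rn | head -20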

For the top 5 high-iteration PRs, read the full comment threads to understand what caused the back-and-forth.

Scope Limits

Examine up to 100 merged PRs and 30 open PRs
Read review comments from up to 50 PRs
Check 100 CI runs
Review up to 100 issues
Use --limit flags to bound all API calls; respect rate limits
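
Before starting the per-PR loops, it can help to check the remaining API budget. This is a sketch: has_api_budget and its default margin of 50 are assumptions, and the planned count in the usage comment reflects the two per-PR loops in this document (up to 50 and up to 100 calls).

```shell
# Succeed if the remaining API budget covers the planned calls plus a margin.
# $1 = remaining requests, $2 = planned requests, $3 = safety margin (default 50)
has_api_budget() {
  remaining=$1
  planned=$2
  margin=${3:-50}
  [ "$remaining" -ge $((planned + margin)) ]
}

# Usage:
#   remaining=$(gh api rate_limit --jq '.resources.core.remaining')
#   has_api_budget "$remaining" 150 || echo "Low rate-limit budget; reduce the --limit values" >&2
```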

Output

Save to the orchestrator-provided path or, when running standalone, to .tessl-insights-poc/reports/github-data.json.

Validation before saving:

Verify report has all required metadata fields
Confirm at least 8 insights with evidence
Check every insight has real PR numbers or URLs as evidence, not placeholders

Set scope.metrics: prs_reviewed, review_comments_read, ci_runs_examined, issues_checked.
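
The four counters can be emitted as a JSON fragment once collection finishes. This is a sketch: scope_metrics_json is a hypothetical helper, and how the object nests inside the full report is defined by the insight report schema, not by this snippet.

```shell
# Emit the scope.metrics object from four counters.
# $1 = PRs reviewed, $2 = review comments read, $3 = CI runs examined, $4 = issues checked
scope_metrics_json() {
  printf '{"prs_reviewed": %d, "review_comments_read": %d, "ci_runs_examined": %d, "issues_checked": %d}\n' \
    "$1" "$2" "$3" "$4"
}
```

Record the counts actually examined rather than the configured limits, since smaller repositories will fall short of them.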

For evidence, use PR numbers and direct quotes from review comments — these are high-signal because they represent real human feedback. Mark data_source_exclusive: true for insights from review comments or CI patterns not visible from code alone.