tessleng/agent-insight-experiment

Scan a repository to surface actionable findings about agent performance. Analyzes source code, git history, GitHub data, agent logs, and agent context, then synthesizes cross-referenced findings with targeted actions informed by Tessl product awareness. Supports incremental multi-developer contributions and produces a self-contained HTML report.

skills/analyze-git-history/SKILL.md

name: analyze-git-history
description: Analyze a repository's git history to identify patterns that affect coding agent performance. Examines file churn, co-change relationships, revert frequency, contributor concentration, and commit patterns to find areas where agents are likely to struggle. Use when running an insight scan's git history analysis phase, analyzing repository churn for agent readiness, auditing change patterns for AI agent risks, or understanding what a repo's change history reveals about agent performance risks.

Analyze Git History for Agent Performance Insights

Examine the repository's git history to surface patterns that indicate where coding agents are likely to struggle.

Scope: Focus on git commands (log, shortlog, diff, blame, show, rev-list, etc.) for temporal and change pattern analysis.

Before You Start

Read the shared reference files:

  • Read the APEX taxonomy for the insight categories
  • Read the insight report schema for the exact report structure

Resolving reference paths: The links above use relative paths (../../references/...) that work when this skill is read from its tile directory. If those paths do not resolve (e.g. when activated via a .claude/skills/ symlink), find the shared references at .tessl/tiles/*/agent-insight-experiment/references/ relative to the repository root.

Your report prefix is GIT (e.g., GIT-001, GIT-002).

Quick Start (Recommended)

Run the data collection script to gather all git metrics in a single pass:

bash "$(dirname "$0")/scripts/git-data-collector.sh" --root "$(pwd)"

Resolving script path: The path above assumes this skill is read from its tile directory. If run via a .claude/skills/ symlink, locate the script at .tessl/tiles/*/agent-insight-experiment/skills/analyze-git-history/scripts/git-data-collector.sh relative to the repo root.

Options: Pass --months <n> to adjust the analysis window (default: 6), or --out <path> to write to a file.

The script outputs JSON containing: repository vitals (commit count, authors, last commit, recent activity), file churn (top 40), co-change pairs (top 10 files with their co-changed files), reverts, fix-up commits, contributor concentration (top 20 directories with author breakdown), large commits (>20 files), commit message prefix conventions, and pattern shifts (recent vs older half). Read the output and proceed directly to insight generation — skip the manual collection steps below.

Manual Collection (Fallback)

If the script is unavailable, collect data manually using the steps below.

Step 1: Repository Vitals

git rev-list --count HEAD                        # total commits
git log --format='%ae' | sort -u | wc -l         # unique authors
git log --since="6 months ago" --oneline | wc -l  # recent activity
git log -1 --format='%ci'                         # last commit date

Checkpoint: If there are fewer than 50 total commits or no recent activity, adjust scope: the repo may be too small or inactive for meaningful churn analysis. Note this in the report and focus on contributor patterns and structural observations instead.
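
The checkpoint can be expressed as a small guard. This is a minimal sketch: the function name `assess_history_depth` and its `limited`/`full` labels are hypothetical; only the two thresholds come from the text above.

```shell
# Hypothetical helper: decide whether full churn analysis is worthwhile.
# $1 = total commit count, $2 = commit count inside the analysis window.
assess_history_depth() {
  if [ "$1" -lt 50 ] || [ "$2" -eq 0 ]; then
    echo "limited"   # small or inactive repo: focus on contributor patterns and structure
  else
    echo "full"      # enough history for churn analysis
  fi
}

# Usage inside a repository (not executed here):
#   assess_history_depth "$(git rev-list --count HEAD)" \
#     "$(git rev-list --count --since='6 months ago' HEAD)"
```

Using `git rev-list --count` for both numbers avoids the padded output that `wc -l` produces on some platforms.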

Step 2: File Churn Analysis

git log --since="6 months ago" --name-only --pretty=format: | sed '/^$/d' | sort | uniq -c | sort -rn | head -40   # sed drops the blank separator lines between commits

Step 3: Co-Change Analysis

Find files that always change together — this reveals hidden coupling where an agent might miss a required co-change:

# For each of the top 10 high-churn files, check what else changes in the same commits.
# The file itself appears first in the output; its count is the number of commits sampled.
git log --since="6 months ago" -n 20 --format='%H' -- <file> | while read -r sha; do
  git diff-tree --no-commit-id --name-only -r "$sha"
done | sort | uniq -c | sort -rn | head -10
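
Raw counts are easier to judge as a share of the target file's commits. The following is a sketch only: `cochange_share` is a hypothetical helper, and it assumes paths contain no spaces. It reads the `count path` lines produced by the pipeline above and reports how often each other file changed alongside the target.

```shell
# Hypothetical helper: convert "count path" lines (uniq -c output) into
# co-change percentages relative to the target file's own count.
# $1 = target file path (must appear in the input).
cochange_share() {
  awk -v target="$1" '
    { count[$2] = $1 }
    END {
      total = count[target]
      if (total == 0) exit 1            # target missing from input
      for (f in count)
        if (f != target)
          printf "%d%% %s\n", count[f] * 100 / total, f
    }' | sort -rn
}

# Usage (not executed here): pipe the co-change pipeline into
#   ... | sort | uniq -c | cochange_share <file>
```

A file that shows a share near 100% while living in a different module is a candidate for an SCX-4 finding.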

Step 4: Revert and Fix-up Analysis

git log --since="6 months ago" --oneline --grep="revert" -i
git log --since="6 months ago" --oneline --grep="fix" -i | head -30

Check whether reverts and fixes cluster around specific areas.
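
One way to check for clustering is to count the files touched by revert or fix commits per top-level directory. This is a sketch under assumptions: `count_by_dir` is a hypothetical helper name, and grouping by the first path component is an illustrative choice rather than something the skill prescribes.

```shell
# Hypothetical helper: read file paths on stdin and count them per
# top-level directory ("." stands for files in the repository root).
count_by_dir() {
  sed '/^$/d' |
    awk -F/ '{ print (NF > 1 ? $1 : ".") }' |
    sort | uniq -c | sort -rn
}

# Usage inside a repository (not executed here):
#   git log --since="6 months ago" --grep="revert" -i --name-only --format= | count_by_dir
```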

Step 5: Contributor Concentration

# For each of the top 20 most-changed directories
git log --since="6 months ago" --format='%ae' -- <directory> | sort | uniq -c | sort -rn

Single-contributor areas signal concentrated implicit knowledge — high KCG-3 risk.
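
To make concentration comparable across directories, the per-author counts can be reduced to the top author's share. A minimal sketch, assuming one author email per commit on stdin; `top_author_share` is a hypothetical name and no risk threshold is implied beyond the single-contributor case named above.

```shell
# Hypothetical helper: read one author email per line (one line per commit)
# and print: <top author's share in percent> <top author> <distinct authors>
top_author_share() {
  sort | uniq -c | sort -rn |
    awk '{ total += $1; authors++; if (NR == 1) { top = $1; who = $2 } }
         END { if (total > 0) printf "%d %s %d\n", top * 100 / total, who, authors }'
}

# Usage inside a repository (not executed here):
#   git log --since="6 months ago" --format='%ae' -- <directory> | top_author_share
```

A result whose last field is 1 is a single-contributor area.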

Step 6: Commit Pattern Analysis

# Large commits (many files) — suggest complex, coupled changes
git log --since="6 months ago" --pretty=format:"%H %s" --shortstat | head -100

# Commit message conventions: keep only the prefix before the first "(" or ":" (e.g. "feat", "fix")
git log --since="6 months ago" --pretty=format:"%s" | sed -E 's/[(:].*//' | sort | uniq -c | sort -rn | head -20
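
The `--shortstat` listing above shows every commit, so large commits still have to be picked out by eye. The sketch below filters that output; `large_commits` is a hypothetical helper, and the default of more than 20 files mirrors the threshold the collection script is described as using.

```shell
# Hypothetical helper: read `git log --pretty=format:"%H %s" --shortstat`
# output on stdin and print "<files changed> <sha and subject>" for commits
# that touch more than the threshold number of files.
# $1 = threshold (defaults to 20).
large_commits() {
  awk -v threshold="${1:-20}" '
    /^ [0-9]+ files? changed/ { if ($1 + 0 > threshold) print $1, header; next }
    NF > 0 { header = $0 }     # remember the most recent commit header line
  '
}

# Usage inside a repository (not executed here):
#   git log --since="6 months ago" --pretty=format:"%H %s" --shortstat | large_commits
```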

Step 7: Recent Pattern Shifts

Shifts in where activity happens between the recent and older halves of the window can point to convention changes:

# Compare directory activity: last 3 months vs 3-6 months ago (blank separator lines are dropped)
git log --since="3 months ago" --name-only --pretty=format: | sed -e '/^$/d' -e 's|/[^/]*$||' | sort | uniq -c | sort -rn | head -20
git log --since="6 months ago" --until="3 months ago" --name-only --pretty=format: | sed -e '/^$/d' -e 's|/[^/]*$||' | sort | uniq -c | sort -rn | head -20
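
Comparing the two listings by eye is error-prone, so the difference can be computed per directory. A sketch, assuming each listing has been saved to a file and that directory names contain no spaces; `activity_shift` is a hypothetical helper.

```shell
# Hypothetical helper: compare two "count directory" listings.
# $1 = file with recent counts, $2 = file with older counts.
# Prints "<recent minus older> <directory>", largest increase first.
activity_shift() {
  awk '
    FILENAME == ARGV[1] { recent[$2] = $1; seen[$2] = 1; next }
    { older[$2] = $1; seen[$2] = 1 }
    END { for (d in seen) printf "%d %s\n", recent[d] - older[d], d }
  ' "$1" "$2" | sort -rn
}

# Usage (not executed here): save the two pipelines above to recent.txt and
# older.txt, then run: activity_shift recent.txt older.txt
```

Directories with a large positive or negative value are where old and new code are most likely to follow different conventions.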

Scope Limits

  • Analyze the last 6 months of history (or 2000 commits, whichever is less)
  • For very active repos, focus on the most recent 3 months for detailed analysis
  • Sample at most 20 directories for contributor concentration analysis
  • Use --since flags consistently to bound queries
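
Keeping the bounds in one place helps apply them consistently. This is a sketch only: `history_bounds` and the `MONTHS`/`MAX_COMMITS` variables are hypothetical names, with defaults taken from the limits above. Combining `--since` with `--max-count` yields whichever bound selects fewer commits.

```shell
# Assumed defaults, taken from the scope limits above.
MONTHS="${MONTHS:-6}"
MAX_COMMITS="${MAX_COMMITS:-2000}"

# Hypothetical helper: print the flags that bound a `git log` query.
# The dotted date form (6.months.ago) contains no spaces, so the output
# can be used unquoted: git log $(history_bounds) --oneline
history_bounds() {
  echo "--since=${MONTHS}.months.ago --max-count=${MAX_COMMITS}"
}
```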

What to Look For

Git history is especially good at revealing:

  • KCG-3 (Tribal knowledge): Single-contributor areas where knowledge is concentrated in one person
  • CAS-2 (Inconsistent code patterns): Pattern shifts over time visible in how the same things are done differently in old vs new code
  • SCX-4 (High coupling): Files that always change together despite being in different modules
  • RAF-1, RAF-2, RAF-5: Areas with frequent reverts, fix-up commits, or repeated changes suggest agent (or human) difficulty
  • KCG-5 (Stale documentation): Documentation files that haven't been updated even as the code they describe changed significantly
  • TCG-6 (Unowned content): Context or documentation files whose last meaningful edit is very old and whose historical authors are no longer active

Output

Produce a JSON report conforming to the insight report schema. Save to the path provided by the orchestrator (or .tessl-insights-poc/reports/git-history.json standalone).

Set scope.metrics to include:

  • commits_analyzed: number of commits examined
  • authors_seen: number of unique authors (by email, %ae) in the analyzed window (denominator for the commit_authors_impacted hero stat)
  • branches_examined: number of branches checked
  • time_range_days: how many days of history covered
  • problem_commits: array of commit SHAs (short or full) that appear as evidence in this report's insights — i.e. the specific commits used to illustrate reverts, revert-reland cycles, oversized commits, drift commits, etc. Deduplicate across insights. Include every SHA you cite in evidence.
  • commit_authors_impacted: number of distinct authors (by email, %ae) who wrote the commits listed in problem_commits. Derive this by running git log -1 --format='%ae' <sha> for each SHA and counting unique values. This is the numerator for the commit_authors_impacted hero stat in the synthesized report.

The synthesizer uses problem_commits + commit_authors_impacted + authors_seen to populate summary.commit_authors_impacted in findings.json. Every insight that cites commits as evidence must use SHAs that also appear in scope.metrics.problem_commits — if you can't cite concrete SHAs for a "problem commits" style finding, the insight is too vague.
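
The derivation of commit_authors_impacted described above can be sketched as a loop over the cited SHAs. `count_impacted_authors` is a hypothetical helper name; the `git log -1 --format='%ae'` call is the one the text specifies.

```shell
# Hypothetical helper: count distinct author emails across the given commits.
# Arguments: the SHAs listed in scope.metrics.problem_commits.
count_impacted_authors() {
  for sha in "$@"; do
    git log -1 --format='%ae' "$sha"
  done | sort -u | wc -l | tr -d ' '   # tr strips the padding some wc versions add
}

# Usage inside a repository (not executed here):
#   count_impacted_authors <sha> <sha> <sha>
```

Because the input is the same deduplicated SHA list that goes into problem_commits, the two metrics stay consistent with each other.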

Validation before saving:

  • Verify all required metadata fields present
  • Confirm at least 8 insights with commit SHAs, file paths, or statistics as evidence
  • Check every insight has id, category, impact, effort, priority_score
  • Confirm scope.metrics.problem_commits contains every SHA cited in any insight's evidence and that scope.metrics.commit_authors_impacted matches the count of distinct authors for those SHAs

Mark data_source_exclusive: true for insights from temporal/change data (churn rates, co-change patterns, contributor concentration).
