
tessleng/agent-insight-experiment

Scan a repository to surface actionable findings about agent performance. Analyzes source code, git history, GitHub data, agent logs, and agent context, then synthesizes cross-referenced findings with targeted actions informed by Tessl product awareness. Supports incremental multi-developer contributions and produces a self-contained HTML report.



Insight Report Schema

Every analyzer produces a single JSON report file conforming to this schema. The report is the primary output — a companion Markdown summary is optional but encouraged.

File Naming

Save the report as <data_source>-report.json in the experiment workspace directory. For example: source-code-report.json, git-history-report.json.
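As a minimal sketch of the naming rule, the helper below builds the file name from a `metadata.data_source` value. The function name `report_filename` is hypothetical, and the underscore-to-hyphen mapping is an assumption inferred from the two examples above rather than a rule stated by the schema.

```python
def report_filename(data_source: str) -> str:
    """Build the report file name for a data source key such as 'git_history'."""
    # Assumption: the underscore-separated data_source key maps to the
    # hyphenated file name shown in the examples (source_code -> source-code).
    return f"{data_source.replace('_', '-')}-report.json"


print(report_filename("source_code"))  # source-code-report.json
print(report_filename("git_history"))  # git-history-report.json
```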

ID Prefixes

Each data source uses a unique prefix for insight IDs:

| Data Source | Prefix | Example |
| --- | --- | --- |
| Source Code | SRC | SRC-001 |
| Git History | GIT | GIT-001 |
| GitHub Data | GH | GH-001 |
| Agent Logs | LOG | LOG-001 |
| Agent Context | CTX | CTX-001 |
| Context Inventory | INV | INV-001 |
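One way to generate IDs consistently is a small lookup keyed by the `metadata.data_source` value. This is only an illustrative sketch: `ID_PREFIXES` and `insight_id` are hypothetical names, and the three-digit zero padding and 1-based numbering are assumptions drawn from the examples in the table.

```python
# Prefixes copied from the ID Prefixes table, keyed by metadata.data_source value.
ID_PREFIXES = {
    "source_code": "SRC",
    "git_history": "GIT",
    "github_data": "GH",
    "agent_logs": "LOG",
    "agent_context": "CTX",
    "context_inventory": "INV",
}


def insight_id(data_source: str, sequence: int) -> str:
    """Format an insight ID such as GIT-001 (hypothetical helper)."""
    if sequence < 1:
        # Assumption: numbering starts at 1, as in the table's examples.
        raise ValueError("sequence numbers start at 1")
    # Zero-pad to three digits to match the PREFIX-NNN pattern.
    return f"{ID_PREFIXES[data_source]}-{sequence:03d}"
```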

Schema

{
  "metadata": {
    "scan_id": "<string — shared across all reports in one scan run>",
    "data_source": "<source_code | git_history | github_data | agent_logs | agent_context | context_inventory>",
    "repository": "<string — org/repo-name or filesystem path>",
    "analysis_timestamp": "<ISO 8601 datetime>",
    "analyzer_model": "<string — model identifier>",
    "scope": {
      "description": "<string — what was analyzed, sampling strategy used, and any limitations>",
      "metrics": {},
      "time_range": "<optional string — date range for temporal data sources>"
    }
  },

  "context_inventory": "<optional object — only the `analyze-context-inventory` report includes this; see that skill's SKILL.md for shape>",

  "executive_summary": "<string — 2-3 paragraphs covering: what was analyzed and the most significant findings>",

  "summary_statistics": {
    "total_insights": "<number>",
    "by_category": {
      "KCG": 0, "CAS": 0, "SCX": 0, "RAF": 0, "TCG": 0
    },
    "by_impact": {
      "critical": 0, "high": 0, "medium": 0, "low": 0
    },
    "by_effort": {
      "trivial": 0, "low": 0, "medium": 0, "high": 0
    },
    "top_quick_wins": ["<array of insight IDs with highest priority_score>"]
  },

  "insights": ["<array of Insight objects — see below>"]
}
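The summary_statistics block can be derived mechanically from the insights array. The sketch below assumes each insight carries the category, impact.level, effort.level, priority_score, and id fields of the Insight object. The function name `summarize` and the default of three quick wins are assumptions, since the schema does not say how many IDs top_quick_wins should hold.

```python
from collections import Counter

CATEGORIES = ("KCG", "CAS", "SCX", "RAF", "TCG")
IMPACT_LEVELS = ("critical", "high", "medium", "low")
EFFORT_LEVELS = ("trivial", "low", "medium", "high")


def summarize(insights: list[dict], quick_win_count: int = 3) -> dict:
    """Build a summary_statistics object from a list of insight dicts."""
    by_category = Counter(i["category"] for i in insights)
    by_impact = Counter(i["impact"]["level"] for i in insights)
    by_effort = Counter(i["effort"]["level"] for i in insights)
    # Highest priority_score first; quick_win_count is an assumed cutoff.
    ranked = sorted(insights, key=lambda i: i["priority_score"], reverse=True)
    return {
        "total_insights": len(insights),
        # Emit every known key, including those with a zero count.
        "by_category": {c: by_category.get(c, 0) for c in CATEGORIES},
        "by_impact": {lvl: by_impact.get(lvl, 0) for lvl in IMPACT_LEVELS},
        "by_effort": {lvl: by_effort.get(lvl, 0) for lvl in EFFORT_LEVELS},
        "top_quick_wins": [i["id"] for i in ranked[:quick_win_count]],
    }
```

Listing every category and level explicitly keeps the output shape identical across reports, which should make merging reports from several data sources simpler.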

Insight Object

{
  "id": "<PREFIX-NNN>",
  "category": "<KCG | CAS | SCX | RAF | TCG>",
  "subcategory": "<e.g., KCG-1, CAS-2>",
  "title": "<string — short, specific, descriptive>",
  "description": "<string — detailed explanation: what the issue is, why it matters for agent performance, and how it manifests>",

  "evidence": [
    {
      "type": "<file_reference | code_snippet | git_log | pr_comment | ci_output | agent_conversation | config_file | review_comment | statistical>",
      "location": "<string — file path, PR URL/number, commit SHA, session ID, etc.>",
      "detail": "<string — what this evidence shows>",
      "snippet": "<optional string — relevant code or text excerpt>"
    }
  ],

  "impact": {
    "score": "<1-10 integer>",
    "level": "<critical | high | medium | low>",
    "reasoning": "<string — why this score: frequency, blast radius, severity>"
  },

  "effort": {
    "score": "<1-10 integer (1 = trivial, 10 = massive rewrite)>",
    "level": "<trivial | low | medium | high>",
    "reasoning": "<string — what the fix involves and roughly how long it would take>"
  },

  "priority_score": "<number — impact.score / effort.score, higher is better>",

  "affected_areas": ["<array of file paths or directory patterns>"],
  "confidence": "<high | medium | low>",
  "data_source_exclusive": "<boolean — could this insight ONLY have been discovered from this data source?>"
}
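The priority_score field is defined as impact.score divided by effort.score. A minimal sketch of that calculation follows; the helper name `priority_score` and the range check are illustrative additions, not part of the schema.

```python
def priority_score(impact_score: int, effort_score: int) -> float:
    """Return impact.score / effort.score; higher means a better quick win."""
    for name, value in (("impact", impact_score), ("effort", effort_score)):
        # Both scores are 1-10 integers, so division by zero cannot occur
        # once this check passes.
        if not 1 <= value <= 10:
            raise ValueError(f"{name} score must be between 1 and 10, got {value}")
    return impact_score / effort_score


print(priority_score(8, 2))  # 4.0
```

With both scores bounded by 1 and 10, the result ranges from 0.1 (low impact, very high effort) to 10.0 (critical impact, trivial effort).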

Scope Metrics by Data Source

The scope.metrics object uses keys appropriate to each data source:

| Data Source | Typical Metrics |
| --- | --- |
| Source Code | files_examined, directories_traversed, total_loc_sampled |
| Git History | commits_analyzed, authors_seen, branches_examined, problem_commits, commit_authors_impacted |
| GitHub Data | prs_reviewed, ci_runs_examined, issues_checked, review_comments_read |
| Agent Logs | sessions_parsed, conversations_analyzed, tool_calls_examined, tools_represented, sessions_with_frustration_signals |
| Agent Context | context_files_found, rules_analyzed, skills_inventoried |
| Context Inventory | files_inventoried, by_category (map of category → count), link_edges (count of follow-links) |

Hero-stat metrics

A few scope.metrics fields are used by the synthesizer to populate the hero stats shown at the top of the report. See the individual skill docs for exact definitions, but briefly:

  • git_history: problem_commits (array of SHAs cited as evidence) and commit_authors_impacted (distinct authors of those SHAs). Used to compute summary.commit_authors_impacted in findings.json; authors_seen supplies the denominator.
  • agent_logs: sessions_with_frustration_signals (distinct sessions showing frustration/correction signals). Summed across all contributors' reports to produce summary.sessions_with_frustration_signals; sessions_parsed supplies the denominator.

Analyzers are responsible for these numbers — synthesis never re-derives them from free-text evidence.
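The two hero-stat calculations might be sketched as follows. Both function names, the `author_by_sha` lookup, and the sample SHAs and author names are hypothetical. The sketch also assumes sessions_with_frustration_signals is stored as a count, and it returns the set of impacted authors because this document does not specify whether commit_authors_impacted holds the authors or only their number; the individual skill docs remain the authority on exact definitions.

```python
def commit_authors_impacted(problem_commits: list[str],
                            author_by_sha: dict[str, str]) -> set[str]:
    """Return the distinct authors of the SHAs cited as evidence."""
    return {author_by_sha[sha] for sha in problem_commits}


def total_frustration_sessions(reports: list[dict]) -> int:
    """Sum sessions_with_frustration_signals over all agent_logs reports."""
    return sum(
        # Assumption: the metric is a count; a missing key contributes zero.
        r["metadata"]["scope"]["metrics"].get("sessions_with_frustration_signals", 0)
        for r in reports
        if r["metadata"]["data_source"] == "agent_logs"
    )


# Hypothetical sample data for illustration only.
authors = commit_authors_impacted(
    ["a1", "b2", "c3"], {"a1": "ana", "b2": "ben", "c3": "ana"}
)
print(len(authors))  # 2
```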

Scoring Guide

Impact (1-10)

| Score | Level | Meaning |
| --- | --- | --- |
| 9-10 | Critical | Affects nearly every agent task; causes frequent, severe failures |
| 7-8 | High | Affects many tasks or causes significant failures in important areas |
| 4-6 | Medium | Affects some tasks or causes moderate confusion |
| 1-3 | Low | Affects few tasks or causes minor inefficiency |

Effort (1-10)

| Score | Level | Meaning |
| --- | --- | --- |
| 1-2 | Trivial | A few minutes: add a comment, create a short context file, flip a config |
| 3-4 | Low | Under an hour: write documentation, add a rule, create a simple skill |
| 5-7 | Medium | Hours to a day: refactor a module, write comprehensive docs, add tests |
| 8-10 | High | Days+: major refactoring, architectural changes, large-scale cleanup |
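To keep the score and level fields consistent, the bands from the Impact and Effort tables can be encoded directly. This is a sketch with hypothetical function names; the thresholds come from the two tables, and the lowercase return values match the level values listed in the Insight object.

```python
def _check(score: int) -> None:
    # Both scales are defined on integers from 1 to 10.
    if not 1 <= score <= 10:
        raise ValueError(f"score must be between 1 and 10, got {score}")


def impact_level(score: int) -> str:
    """Map an impact score to its level: 9-10, 7-8, 4-6, 1-3."""
    _check(score)
    if score >= 9:
        return "critical"
    if score >= 7:
        return "high"
    if score >= 4:
        return "medium"
    return "low"


def effort_level(score: int) -> str:
    """Map an effort score to its level: 1-2, 3-4, 5-7, 8-10."""
    _check(score)
    if score <= 2:
        return "trivial"
    if score <= 4:
        return "low"
    if score <= 7:
        return "medium"
    return "high"
```

Note that the band boundaries differ between the two scales: a score of 7 is high impact but only medium effort.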

Confidence

  • High: Multiple pieces of clear evidence; the issue is unambiguous
  • Medium: Evidence supports the finding but there could be context you're missing
  • Low: Inference based on limited data; flagged as worth investigating
