coding-agent-helpers/regression-scout

Use when the user wants regression hunting after a change. Identify nearby flows, shared code paths, error states, and configuration edges that may have broken even if the main fix works. Good triggers include "check for regressions", "what else might this have broken", and "test the surrounding area".

96

2.72x
Quality: 94% (Does it follow best practices?)

Impact: 98%, 2.72x (Average score across 8 eval scenarios)

Security by Snyk: Passed, no known issues


Evaluation results

100%

86%

Auth Service Codebase Overview

Output format structure

| Criteria | Without context | With context |
| --- | --- | --- |
| Has Change Surface section | 0% | 100% |
| Has Regression Checks section | 0% | 100% |
| Has Findings section | 0% | 100% |
| Has Risk Left Open section | 0% | 100% |
| Regression Checks lists individual checks with results | 0% | 100% |
| Findings section gives explicit verdict | 0% | 100% |
| Risk Left Open has concrete specific risk | 0% | 100% |
| Change Surface identifies auth middleware change | 0% | 100% |
| Regression Checks covers auth/session adjacent paths | 0% | 100% |
| Report does not primarily re-verify JWT auth works | 100% | 100% |
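
The structural criteria in these scenarios all check for the same four report sections. A minimal sketch of such a check follows. The section names come from the criteria; the function name `missing_sections` and the assumption that a report uses Markdown-style `#` headings are hypothetical and are not taken from the actual eval harness.

```python
import re

# Section names as they appear in the evaluation criteria.
REQUIRED_SECTIONS = ["Change Surface", "Regression Checks", "Findings", "Risk Left Open"]

# Assumption: report sections are introduced by Markdown-style headings ("# Name" to "###### Name").
HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_sections(report: str) -> list[str]:
    """Return the required section names that have no matching heading in the report."""
    found = {match.group(1).lower() for match in HEADING_RE.finditer(report)}
    return [name for name in REQUIRED_SECTIONS if name.lower() not in found]


# Hypothetical report text, used only to exercise the check.
sample_report = """\
## Change Surface
Auth middleware changed.

## Regression Checks
- Session refresh path: pass

## Findings
No regressions found.
"""

print(missing_sections(sample_report))
```

A report that passes the four "Has ... section" criteria would yield an empty list here; the content-level criteria (explicit verdict, concrete risk) need judgment that a heading check cannot provide.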

100%

79%

Payment Service Codebase Overview

Adjacent flow identification

| Criteria | Without context | With context |
| --- | --- | --- |
| Has Change Surface section | 0% | 100% |
| Has Regression Checks section | 0% | 100% |
| Has Findings section | 71% | 100% |
| Has Risk Left Open section | 0% | 100% |
| Change Surface identifies pool.js or maxConnections change | 0% | 100% |
| Change Surface mentions at least 2 affected services | 0% | 100% |
| Regression Checks covers reporting service batching logic | 0% | 100% |
| Regression Checks covers at least one other service path | 0% | 100% |
| Regression Checks lists at least 3 checks with results | 0% | 100% |
| Risk Left Open has concrete specific risk | 0% | 100% |
| Report does not primarily re-verify pool size increase | 100% | 100% |
| Findings includes explicit verdict | 77% | 100% |

100%

50%

Search Service Overview

Error and empty-state regression checks

| Criteria | Without context | With context |
| --- | --- | --- |
| Has Change Surface section | 0% | 100% |
| Has Regression Checks section | 0% | 100% |
| Has Findings section | 0% | 100% |
| Has Risk Left Open section | 0% | 100% |
| Change Surface identifies views.py or sort change | 87% | 100% |
| Regression Checks includes empty result set check | 50% | 100% |
| Regression Checks includes pagination interaction check | 80% | 100% |
| Regression Checks includes error or edge path check | 60% | 100% |
| Regression Checks lists at least 3 checks with results | 75% | 100% |
| Risk Left Open has concrete specific risk | 50% | 100% |
| Findings includes explicit verdict | 50% | 100% |
| Report does not primarily re-verify sorted results | 100% | 100% |

100%

90%

Product Catalog Service

Config and cache edge regression checks

| Criteria | Without context | With context |
| --- | --- | --- |
| Has Change Surface section | 0% | 100% |
| Has Regression Checks section | 0% | 100% |
| Has Findings section | 0% | 100% |
| Has Risk Left Open section | 0% | 100% |
| Change Surface identifies CACHE_TTL config change | 0% | 100% |
| Regression Checks includes stale price or inventory check | 0% | 100% |
| Regression Checks includes cache warmer or persistence check | 0% | 100% |
| Regression Checks includes auth-related path check | 0% | 100% |
| Regression Checks lists at least 3 checks with results | 0% | 100% |
| Risk Left Open has concrete specific risk | 0% | 100% |
| Findings includes explicit verdict | 0% | 100% |
| Report does not primarily re-verify CACHE_TTL was applied | 100% | 100% |

100%

37%

User Service Overview

Bias toward adjacent breakage

| Criteria | Without context | With context |
| --- | --- | --- |
| Has Change Surface section | 0% | 100% |
| Has Regression Checks section | 0% | 100% |
| Has Findings section | 0% | 100% |
| Has Risk Left Open section | 0% | 100% |
| Regression Checks covers at least one non-delete adjacent path | 100% | 100% |
| Regression Checks covers at least two different adjacent components | 100% | 100% |
| Regression Checks lists at least 3 checks with results | 100% | 100% |
| Report does not over-verify soft-delete feature | 100% | 100% |
| Risk Left Open has concrete specific risk | 50% | 100% |
| Findings includes explicit verdict | 37% | 100% |
| Change Surface identifies changed files or soft-delete addition | 100% | 100% |

88%

68%

CLI Tool Overview

Risk documentation when no regressions found

| Criteria | Without context | With context |
| --- | --- | --- |
| Has Change Surface section | 0% | 100% |
| Has Regression Checks section | 0% | 100% |
| Has Findings section | 0% | 100% |
| Has Risk Left Open section | 0% | 100% |
| Findings explicitly states none found or equivalent | 0% | 0% |
| Risk Left Open lists at least one concrete specific risk | 0% | 100% |
| Risk Left Open does not just say none or no risk | 0% | 100% |
| Regression Checks lists at least 3 checks with results | 0% | 100% |
| Change Surface identifies parser change or parser.ts | 0% | 100% |
| Report covers at least one documented parser behavior difference | 100% | 100% |
| Report does not primarily re-verify refactor works | 100% | 100% |

100%

57%

HTTP Service Migration Overview

Full workflow across all zone categories

| Criteria | Without context | With context |
| --- | --- | --- |
| Has Change Surface section | 0% | 100% |
| Has Regression Checks section | 0% | 100% |
| Has Findings section | 0% | 100% |
| Has Risk Left Open section | 0% | 100% |
| Change Surface identifies at least 2 changed files or the chi migration | 0% | 100% |
| Regression Checks includes auth or permission path check | 75% | 100% |
| Regression Checks includes error or 4xx response path check | 75% | 100% |
| Regression Checks includes middleware behavior check | 75% | 100% |
| Regression Checks lists at least 4 checks with results | 60% | 100% |
| Risk Left Open has concrete specific risk | 62% | 100% |
| Findings includes explicit verdict | 0% | 100% |
| Report does not primarily re-verify routing works | 100% | 100% |

100%

32%

Transaction Batch Job Overview

Persistence and performance edge regression checks

| Criteria | Without context | With context |
| --- | --- | --- |
| Has Change Surface section | 0% | 100% |
| Has Regression Checks section | 0% | 100% |
| Has Findings section | 57% | 100% |
| Has Risk Left Open section | 0% | 100% |
| Change Surface identifies process_transactions.py or chunked reading | 75% | 100% |
| Regression Checks includes data correctness check | 90% | 100% |
| Regression Checks includes persistence path check | 90% | 100% |
| Regression Checks includes performance or timing edge check | 100% | 100% |
| Regression Checks lists at least 3 checks with results | 87% | 100% |
| Risk Left Open has concrete specific risk | 75% | 100% |
| Findings includes explicit verdict | 87% | 100% |
| Report does not primarily re-verify chunked reading works | 100% | 100% |

Evaluated
Agent: Claude Code
Model: Claude Sonnet 4.6
