Use when the user wants regression hunting after a change. Identify nearby flows, shared code paths, error states, and configuration edges that may have broken even if the main fix works. Good triggers include "check for regressions", "what else might this have broken", and "test the surrounding area".
96
94%
Does it follow best practices?
Impact
98%
2.72x
Average score across 8 eval scenarios
Passed
No known issues
{
"context": "The agent was asked to produce a regression scout report (report.md) for a REST API that added a soft-delete endpoint for users. The criteria evaluate whether the report focuses on adjacent breakage (auth, orders, posts, email, search) rather than re-verifying that soft-delete itself works.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Has Change Surface section",
"description": "The report.md file contains a '### Change Surface' section heading",
"max_score": 7
},
{
"name": "Has Regression Checks section",
"description": "The report.md file contains a '### Regression Checks' section heading",
"max_score": 7
},
{
"name": "Has Findings section",
"description": "The report.md file contains a '### Findings' section heading",
"max_score": 7
},
{
"name": "Has Risk Left Open section",
"description": "The report.md file contains a '### Risk Left Open' section heading",
"max_score": 7
},
{
"name": "Regression Checks covers at least one non-delete adjacent path",
"description": "The Regression Checks section includes at least one check on a non-delete adjacent path: order routes joining with users, post routes filtering by author, email service sending to soft-deleted users, search returning soft-deleted users, or auth middleware authorizing soft-deleted users",
"max_score": 12
},
{
"name": "Regression Checks covers at least two different adjacent components",
"description": "The Regression Checks section covers checks across at least 2 different adjacent components (from: auth middleware, orders routes, posts routes, email service, search service)",
"max_score": 12
},
{
"name": "Regression Checks lists at least 3 checks with results",
"description": "The Regression Checks section lists at least 3 separate checks, each with an outcome or result stated",
"max_score": 8
},
{
"name": "Report does not over-verify soft-delete feature",
"description": "The report does NOT dedicate more than 1 check to verifying that the soft-delete endpoint itself sets deleted_at correctly — the primary focus is on adjacent breakage caused by soft-deleted users still existing in the table",
"max_score": 12
},
{
"name": "Risk Left Open has concrete specific risk",
"description": "The Risk Left Open section contains a concrete specific risk such as: auth middleware authorizing requests from soft-deleted users, email service sending to deleted users, search returning deleted users in results, or orders/posts still associated with deleted users",
"max_score": 8
},
{
"name": "Findings includes explicit verdict",
"description": "The Findings section includes an explicit verdict — either stating no regressions were found or naming specific regressions identified",
"max_score": 8
},
{
"name": "Change Surface identifies changed files or soft-delete addition",
"description": "The Change Surface section identifies the changed files (users.js routes, user.js model) or describes the soft-delete addition and its impact on the users table",
"max_score": 12
}
]
}
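The first four checklist items are purely structural, so they can be checked mechanically before anyone reads the report. The following is a minimal sketch, not the evaluator's actual implementation: the function names `missing_headings` and `weighted_score` are hypothetical, and the aggregation rule (awarded points divided by the sum of `max_score`) is an assumption, since the criteria file only declares the type `weighted_checklist`. The `max_score` values in the checklist above sum to 100.

```python
import re

# Section headings required by the first four checklist items.
REQUIRED_HEADINGS = [
    "Change Surface",
    "Regression Checks",
    "Findings",
    "Risk Left Open",
]


def missing_headings(report_text: str) -> list[str]:
    """Return the required '### ' headings that are absent from the report."""
    found = {
        match.group(1).strip()
        for match in re.finditer(
            r"^###[ \t]+(.+?)[ \t]*$", report_text, flags=re.MULTILINE
        )
    }
    return [heading for heading in REQUIRED_HEADINGS if heading not in found]


def weighted_score(awarded: dict[str, int], checklist: list[dict]) -> float:
    """Assumed aggregation: points awarded divided by total max_score.

    `awarded` maps a checklist item's name to the points it received;
    points are capped at that item's max_score.
    """
    total = sum(item["max_score"] for item in checklist)
    earned = sum(
        min(awarded.get(item["name"], 0), item["max_score"]) for item in checklist
    )
    return earned / total if total else 0.0
```

For example, running `missing_headings` on the text of a report.md that lacks a `### Risk Left Open` heading returns `["Risk Left Open"]`. The content-based items (adjacent-path coverage, not over-verifying soft-delete, a concrete open risk) still need a reader's judgment and are not covered by this sketch.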