
coding-agent-helpers/regression-scout

Use when the user wants regression hunting after a change. Identify nearby flows, shared code paths, error states, and configuration edges that may have broken even if the main fix works. Good triggers include "check for regressions", "what else might this have broken", and "test the surrounding area".

96

2.72x
Quality

94%

Does it follow best practices?

Impact

98%

2.72x

Average score across 8 eval scenarios

Security by Snyk

Passed

No known issues


evals/scenario-7/criteria.json

{
  "context": "The agent was asked to produce a regression scout report (report.md) for a Go HTTP service migrated from net/http to the chi router framework. No handler logic was changed. The criteria evaluate comprehensive coverage across all regression zone categories: auth/permission paths, error response paths, middleware behavior, and output format.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Has Change Surface section",
      "description": "The report.md file contains a '### Change Surface' section heading",
      "max_score": 7
    },
    {
      "name": "Has Regression Checks section",
      "description": "The report.md file contains a '### Regression Checks' section heading",
      "max_score": 7
    },
    {
      "name": "Has Findings section",
      "description": "The report.md file contains a '### Findings' section heading",
      "max_score": 7
    },
    {
      "name": "Has Risk Left Open section",
      "description": "The report.md file contains a '### Risk Left Open' section heading",
      "max_score": 7
    },
    {
      "name": "Change Surface identifies at least 2 changed files or the chi migration",
      "description": "The Change Surface section identifies at least 2 of the following: router/router.go, middleware/logging.go, main.go, or the chi framework migration itself",
      "max_score": 8
    },
    {
      "name": "Regression Checks includes auth or permission path check",
      "description": "The Regression Checks section includes a check specifically on authentication or permission enforcement — such as whether JWT auth middleware still correctly protects routes, or whether admin/write permissions are enforced correctly after the migration",
      "max_score": 8
    },
    {
      "name": "Regression Checks includes error or 4xx response path check",
      "description": "The Regression Checks section includes a check on error response behavior — specifically 404 Not Found or 405 Method Not Allowed responses, which changed format in chi compared to net/http defaults",
      "max_score": 8
    },
    {
      "name": "Regression Checks includes middleware behavior check",
      "description": "The Regression Checks section includes a check on at least one middleware behavior: CORS header correctness, logging middleware output, or rate limiting behavior after the middleware stack was reconfigured for chi",
      "max_score": 8
    },
    {
      "name": "Regression Checks lists at least 4 checks with results",
      "description": "The Regression Checks section lists at least 4 separate checks, each with an outcome or result stated",
      "max_score": 10
    },
    {
      "name": "Risk Left Open has concrete specific risk",
      "description": "The Risk Left Open section contains a concrete specific risk such as: middleware execution order differences causing auth to run before or after logging in unexpected ways, CORS header differences for preflight requests, path parameter extraction differences between regex-based path matching under ServeMux and chi {id}-style URL parameters, or 404/405 response format differences breaking API clients",
      "max_score": 8
    },
    {
      "name": "Findings includes explicit verdict",
      "description": "The Findings section includes an explicit verdict — either stating no regressions were found or naming specific regressions identified",
      "max_score": 8
    },
    {
      "name": "Report does not primarily re-verify routing works",
      "description": "The report does NOT dedicate more than 1 check to verifying that basic happy-path routing works correctly — the primary focus is on edge cases, error paths, middleware behavior, and auth/permission enforcement after the migration",
      "max_score": 14
    }
  ]
}
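
The checklist above is a plain list of weighted items, and the four "Has ... section" criteria are simple enough to check mechanically. The Go sketch below shows one way to load such a file, sum the weights, and look for the required headings. The type names, `TotalMaxScore`, and `MissingHeadings` are hypothetical helpers invented for illustration; how the real grader scores a report is not described on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Criterion mirrors one entry of the "checklist" array in criteria.json.
type Criterion struct {
	Name        string `json:"name"`
	Description string `json:"description"`
	MaxScore    int    `json:"max_score"`
}

// Criteria mirrors the top-level object of criteria.json.
type Criteria struct {
	Context   string      `json:"context"`
	Type      string      `json:"type"`
	Checklist []Criterion `json:"checklist"`
}

// TotalMaxScore sums max_score over every checklist entry.
func TotalMaxScore(c Criteria) int {
	total := 0
	for _, item := range c.Checklist {
		total += item.MaxScore
	}
	return total
}

// MissingHeadings returns the required headings that do not appear as a
// whole line in the report. Hypothetical helper, not the actual grader.
func MissingHeadings(report string, required []string) []string {
	present := map[string]bool{}
	for _, line := range strings.Split(report, "\n") {
		present[strings.TrimSpace(line)] = true
	}
	var missing []string
	for _, h := range required {
		if !present[h] {
			missing = append(missing, h)
		}
	}
	return missing
}

func main() {
	// A shortened stand-in for criteria.json (assumption: same field names).
	raw := `{"type":"weighted_checklist","checklist":[
		{"name":"Has Change Surface section","max_score":7},
		{"name":"Has Findings section","max_score":7}]}`
	var c Criteria
	if err := json.Unmarshal([]byte(raw), &c); err != nil {
		panic(err)
	}
	fmt.Println("total max score:", TotalMaxScore(c))

	report := "### Change Surface\n...\n### Findings\n..."
	required := []string{"### Change Surface", "### Regression Checks", "### Findings", "### Risk Left Open"}
	fmt.Println("missing headings:", MissingHeadings(report, required))
}
```

Run against the full checklist above, the weights sum to 100 (four items at 7, six at 8, one at 10, one at 14), so each `max_score` can be read directly as a percentage of the scenario score.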
