
uinaf/review

Review existing code, diffs, branches, or pull requests using concern-specific reviewer personas and evidence. Use when auditing someone else's work, triaging risk in a PR, or producing a ship-it / needs-review / blocked verdict. Do not use to verify your own completed change; use `verify` for that.

98

1.31x
Quality

100%

Does it follow best practices?

Impact

92%

1.31x

Average score across 3 eval scenarios

Security by Snyk

Passed

No known issues


evals/scenario-3/criteria.json

{
  "context": "Tests whether the agent correctly applies minimal persona selection for a tiny documentation-only change, avoids reviewer spam, correctly applies the shape-based shortcut for doc-heavy changes, produces a valid verdict, and does not over-inflate minor findings into defects.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Minimal personas selected",
      "description": "The report uses only 1-2 personas, NOT all 6; specifically, it avoids spawning the silent-failures, types, and cleanup personas for this documentation-only change",
      "max_score": 12
    },
    {
      "name": "General persona used",
      "description": "The report includes the general reviewer persona (or equivalent broad code review lens)",
      "max_score": 8
    },
    {
      "name": "Comments persona used",
      "description": "The report includes the comments reviewer persona (or an equivalent documentation/docstring lens), which is appropriate given the docstring change",
      "max_score": 10
    },
    {
      "name": "Tests persona omitted",
      "description": "The report does NOT apply a tests reviewer persona — no test coverage concern applies to a typo/docstring fix",
      "max_score": 8
    },
    {
      "name": "Verdict present",
      "description": "The report contains exactly one verdict label: 'ship it', 'needs review', or 'blocked'",
      "max_score": 10
    },
    {
      "name": "Scope stated",
      "description": "The report explicitly names the scope reviewed (e.g. the two files changed, README and format.ts, or the specific diff)",
      "max_score": 8
    },
    {
      "name": "Personas listed in output",
      "description": "The report explicitly names the reviewer personas used in a dedicated section or label",
      "max_score": 8
    },
    {
      "name": "No nit inflation",
      "description": "The report does NOT flag the parameter rename (amount → value) as a defect or high-severity finding — the JSDoc now correctly matches the actual parameter name",
      "max_score": 10
    },
    {
      "name": "Docstring accuracy noted",
      "description": "The report identifies that the @param name was corrected from 'amount' to 'value' to match the actual function signature — and treats this as an accuracy improvement, not a bug",
      "max_score": 8
    },
    {
      "name": "Unverified areas or residual risk",
      "description": "The report mentions any residual risk or unverified surfaces, OR explicitly states there are none (either outcome is acceptable — silence is not)",
      "max_score": 8
    },
    {
      "name": "Recommended follow-up",
      "description": "The report includes a recommended follow-up action from the allowed set: implementation, verify, agent-readiness, or docs",
      "max_score": 10
    }
  ]
}
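
The file above does not say how a `weighted_checklist` is turned into a single score. As a minimal sketch, assuming the result is simply the awarded points divided by the sum of every `max_score`, the aggregation could look like the following. The names `score_report` and `awarded` are hypothetical and not part of the skill; only the item names and weights are copied from the checklist.

```python
import json

# Item names and weights copied from the checklist above; descriptions omitted.
CRITERIA = json.loads("""
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "Minimal personas selected", "max_score": 12},
    {"name": "General persona used", "max_score": 8},
    {"name": "Comments persona used", "max_score": 10},
    {"name": "Tests persona omitted", "max_score": 8},
    {"name": "Verdict present", "max_score": 10},
    {"name": "Scope stated", "max_score": 8},
    {"name": "Personas listed in output", "max_score": 8},
    {"name": "No nit inflation", "max_score": 10},
    {"name": "Docstring accuracy noted", "max_score": 8},
    {"name": "Unverified areas or residual risk", "max_score": 8},
    {"name": "Recommended follow-up", "max_score": 10}
  ]
}
""")


def total_available(criteria):
    """Sum of max_score over every checklist item."""
    return sum(item["max_score"] for item in criteria["checklist"])


def score_report(criteria, awarded):
    """Return the fraction of available points earned.

    `awarded` is a hypothetical grader output mapping an item name to the
    points given; items missing from it count as zero.
    ASSUMPTION: plain points-over-total aggregation, not confirmed by the source.
    """
    earned = 0
    for item in criteria["checklist"]:
        points = awarded.get(item["name"], 0)
        if not 0 <= points <= item["max_score"]:
            raise ValueError(f"{item['name']}: {points} outside 0..{item['max_score']}")
        earned += points
    return earned / total_available(criteria)


# Example: full marks on everything except a missing follow-up recommendation.
full_marks = {item["name"]: item["max_score"] for item in CRITERIA["checklist"]}
no_follow_up = {**full_marks, "Recommended follow-up": 0}
```

The weights in this scenario sum to 100, so under this assumption each `max_score` reads directly as a percentage point of the scenario score.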
