
uinaf/review

Review existing code, diffs, branches, or pull requests using concern-specific reviewer personas and evidence. Use when auditing someone else's work, triaging risk in a PR, or producing a ship-it / needs-review / blocked verdict. Do not use to verify your own completed change; use `verify` for that.

98 · 1.20x
Quality: 100% (Does it follow best practices?)
Impact: 96%, 1.20x (average score across 4 eval scenarios)
Security by Snyk: Passed, no known issues


evals/scenario-1/criteria.json

{
  "context": "Tests whether the agent performs a thorough code-shape review on a TypeScript module, correctly identifying unsafe type escapes (any, as casts, non-null assertions), dead and duplicate code, catch-all error handling, and comments that only narrate the code — and whether the output follows the expected verdict and findings format.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "any type flagged",
      "description": "Flags the use of 'any' type (in decodeToken and/or parseJwt parameters/return types) as a safety failure or type-safety concern",
      "max_score": 10
    },
    {
      "name": "Unsafe cast flagged",
      "description": "Flags the unsafe 'as' cast in refreshToken (casting decoded payload to a specific shape without validation) as a safety concern",
      "max_score": 10
    },
    {
      "name": "Non-null assertion flagged",
      "description": "Flags the non-null assertion operator (payload.userId!) in getUserId as a safety failure",
      "max_score": 10
    },
    {
      "name": "Dead code identified",
      "description": "Identifies legacyValidate as dead/unused code that should be deleted",
      "max_score": 10
    },
    {
      "name": "Duplicate logic identified",
      "description": "Identifies that decodeToken and parseJwt are functionally identical and flags this as duplication",
      "max_score": 10
    },
    {
      "name": "Catch-all error flagged",
      "description": "Flags the generic catch(e) blocks in validateToken and/or hasPermission that swallow all errors without classification",
      "max_score": 10
    },
    {
      "name": "Error classification recommended",
      "description": "Recommends distinguishing between error types (e.g. malformed token, expired token, missing field) rather than collapsing all failures to a single false return",
      "max_score": 10
    },
    {
      "name": "Narrating comments flagged",
      "description": "Identifies at least two comments that merely restate what the code does (e.g. '// Check if the token exists', '// Decode the token', '// Return the user ID field') as commentary that should be removed",
      "max_score": 10
    },
    {
      "name": "Findings tied to impact",
      "description": "At least one finding includes an explanation of what could break or who is affected (e.g. non-null assertion could throw at runtime in production, catch-all silently hides malformed token attacks)",
      "max_score": 10
    },
    {
      "name": "Valid verdict",
      "description": "Report concludes with one of exactly three verdict values: 'ship it', 'needs review', or 'blocked' — no other phrasing",
      "max_score": 10
    },
    {
      "name": "Compact verdict block",
      "description": "Report does not end with a prose-heavy recap: detailed findings appear once, and verdict metadata is limited to no more than 4 labeled lines with no scope/persona noise unless needed",
      "max_score": 8
    }
  ]
}
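
The checklist above carries 108 possible points: ten items worth 10 each and one worth 8. How a grader combines them is not stated on this page, so the following is only a minimal sketch of one plausible `weighted_checklist` scoring rule (earned points divided by possible points). The `weightedScore` function, the `Awarded` type, and the clamping behavior are assumptions for illustration, not the evaluator's documented implementation.

```typescript
// Field names mirror criteria.json; the scoring rule itself is an assumption.
interface ChecklistItem {
  name: string;
  description: string;
  max_score: number;
}

// Hypothetical grader output: points awarded per item name.
// Items the grader did not score are treated as 0.
type Awarded = Partial<Record<string, number>>;

function weightedScore(checklist: ChecklistItem[], awarded: Awarded): number {
  let earned = 0;
  let possible = 0;
  for (const item of checklist) {
    const raw = awarded[item.name] ?? 0;
    // Clamp so a single item can never exceed its weight or go negative.
    earned += Math.min(Math.max(raw, 0), item.max_score);
    possible += item.max_score;
  }
  // An empty checklist scores 0 rather than dividing by zero.
  return possible === 0 ? 0 : earned / possible;
}
```

Under this rule, missing only the 8-point "Compact verdict block" item would yield 100 of 108 points, so the lower weight makes that item slightly cheaper to miss than any of the others.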

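
Several checklist items name patterns a reviewer should flag: `any`, unvalidated `as` casts, non-null assertions, and catch-all handlers that collapse every failure into one result. The module under review is not shown on this page, so the sketch below is a hypothetical illustration of the safer shape those findings point toward. `TokenPayload`, `DecodeResult`, `isTokenPayload`, and `decodePayload` are invented names, and the code only parses an already-decoded JSON payload; it does not verify a JWT signature.

```typescript
// Hypothetical payload shape; the real module's fields may differ.
interface TokenPayload {
  userId: string;
  exp: number; // expiry, in seconds since the epoch
}

// Classified failures instead of a single `false` return.
type DecodeResult =
  | { ok: true; payload: TokenPayload }
  | { ok: false; reason: "malformed" | "missing-field" | "expired" };

// Runtime check that narrows `unknown` without `any`, `as`, or `!`.
function isTokenPayload(value: unknown): value is TokenPayload {
  return (
    typeof value === "object" &&
    value !== null &&
    "userId" in value &&
    typeof value.userId === "string" &&
    "exp" in value &&
    typeof value.exp === "number"
  );
}

function decodePayload(json: string, nowSeconds: number): DecodeResult {
  let parsed: unknown;
  try {
    // The try block wraps only the call that can throw on bad input.
    parsed = JSON.parse(json);
  } catch {
    return { ok: false, reason: "malformed" };
  }
  if (!isTokenPayload(parsed)) return { ok: false, reason: "missing-field" };
  if (parsed.exp <= nowSeconds) return { ok: false, reason: "expired" };
  return { ok: true, payload: parsed };
}
```

Returning a tagged result lets callers distinguish a malformed token from an expired one, which is the distinction the "Error classification recommended" item asks a review to call for.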