
uinaf/verify

Verify your own completed code changes using the repo's existing infrastructure and an independent evaluator context. Use after implementing a change when you need to run unit or integration tests, check build or lint gates, prove the real surface works with evidence, and challenge the changed code for clarity, deduplication, and maintainability. If the repo is not verifiable yet, hand off to `agent-readiness`; if you are reviewing someone else's code, use `review`.

97

1.02x
Quality: 100% (Does it follow best practices?)

Impact: 89%, 1.02x (Average score across 3 eval scenarios)

Security (by Snyk): Passed, no known issues


evals/scenario-1/criteria.json

{
  "context": "Tests whether the agent performs a thorough code-shape review on a TypeScript module, correctly identifying unsafe type escapes (any, as casts, non-null assertions), dead and duplicate code, catch-all error handling, and comments that only narrate the code — and whether the output follows the expected verdict and findings format.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "any type flagged",
      "description": "Flags the use of 'any' type (in decodeToken and/or parseJwt parameters/return types) as a safety failure or type-safety concern",
      "max_score": 10
    },
    {
      "name": "Unsafe cast flagged",
      "description": "Flags the unsafe 'as' cast in refreshToken (casting decoded payload to a specific shape without validation) as a safety concern",
      "max_score": 10
    },
    {
      "name": "Non-null assertion flagged",
      "description": "Flags the non-null assertion operator (payload.userId!) in getUserId as a safety failure",
      "max_score": 10
    },
    {
      "name": "Dead code identified",
      "description": "Identifies legacyValidate as dead/unused code that should be deleted",
      "max_score": 10
    },
    {
      "name": "Duplicate logic identified",
      "description": "Identifies that decodeToken and parseJwt are functionally identical and flags this as duplication",
      "max_score": 10
    },
    {
      "name": "Catch-all error flagged",
      "description": "Flags the generic catch(e) blocks in validateToken and/or hasPermission that swallow all errors without classification",
      "max_score": 10
    },
    {
      "name": "Error classification recommended",
      "description": "Recommends distinguishing between error types (e.g. malformed token, expired token, missing field) rather than collapsing all failures to a single false return",
      "max_score": 10
    },
    {
      "name": "Narrating comments flagged",
      "description": "Identifies at least two comments that merely restate what the code does (e.g. '// Check if the token exists', '// Decode the token', '// Return the user ID field') as commentary that should be removed",
      "max_score": 10
    },
    {
      "name": "Findings tied to impact",
      "description": "At least one finding includes an explanation of what could break or who is affected (e.g. non-null assertion could throw at runtime in production, catch-all silently hides malformed token attacks)",
      "max_score": 10
    },
    {
      "name": "Valid verdict",
      "description": "Report concludes with one of exactly three verdict values: 'ship it', 'needs review', or 'blocked' — no other phrasing",
      "max_score": 10
    }
  ]
}
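To make the checklist concrete, here is a minimal TypeScript sketch of the kind of code that would avoid the first several findings: it accepts `unknown` instead of `any`, narrows with runtime checks instead of an `as` cast or a non-null assertion, and returns classified errors instead of collapsing every failure to `false`. The module under review is not part of this file, so `TokenPayload`, `TokenError`, `Result`, `isRecord`, and `validatePayload` are hypothetical names, and the `exp` field is an assumption about the payload shape.

```typescript
// Hypothetical payload shape; only `userId` is named in the rubric,
// `exp` is assumed here for illustration.
interface TokenPayload {
  userId: string;
  exp: number; // expiry, in seconds since the Unix epoch
}

// Classified failures, so callers can tell a malformed token
// from an expired one or one with a missing field.
type TokenError =
  | { kind: "malformed"; detail: string }
  | { kind: "missing_field"; field: string }
  | { kind: "expired"; expiredAt: number };

type Result<T> = { ok: true; value: T } | { ok: false; error: TokenError };

// Type guard: narrows `unknown` to a plain object without using `as`.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Validates an already-decoded payload. Takes `unknown` rather than `any`,
// so every field must be checked before it can be used.
function validatePayload(raw: unknown, nowSeconds: number): Result<TokenPayload> {
  if (!isRecord(raw)) {
    return { ok: false, error: { kind: "malformed", detail: "payload is not an object" } };
  }
  const { userId, exp } = raw;
  if (typeof userId !== "string") {
    return { ok: false, error: { kind: "missing_field", field: "userId" } };
  }
  if (typeof exp !== "number") {
    return { ok: false, error: { kind: "missing_field", field: "exp" } };
  }
  if (exp <= nowSeconds) {
    return { ok: false, error: { kind: "expired", expiredAt: exp } };
  }
  return { ok: true, value: { userId, exp } };
}
```

A caller can then branch on `result.error.kind` and decide, for example, to prompt a refresh on `expired` but log and reject on `malformed`, which is the distinction the "Error classification recommended" item asks the report to call for.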
