
uinaf/verify

Verify your own completed code changes using the repo's existing infrastructure and an independent evaluator context. Use after implementing a change when you need to run unit or integration tests, check build or lint gates, prove the real surface works with evidence, and challenge the changed code for clarity, deduplication, and maintainability. If the repo is not verifiable yet, hand off to `agent-readiness`; if you are reviewing someone else's code, use `review`.

Score: 97 (1.02x)
Quality: 98% (Does it follow best practices?)
Impact: 94%, 1.02x (average score across 3 eval scenarios)
Security (by Snyk): Passed, no known issues


evals/scenario-1/criteria.json

{
  "context": "Tests whether verify refuses to call a change ready when the repo has no reliable boot/test/runtime surface, reports exact attempted commands, and hands off to agent-readiness instead of pretending static inspection is proof.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Blocked verdict",
      "description": "The report verdict is exactly `blocked`, not `ready for review`, `ship it`, or an optimistic equivalent",
      "max_score": 16
    },
    {
      "name": "Commands attempted",
      "description": "The report lists exact commands attempted to find or run verification infrastructure, such as package scripts, tests, boot commands, or lockfile checks",
      "max_score": 12
    },
    {
      "name": "Missing infra evidence",
      "description": "The report includes concrete evidence that verification infrastructure is missing or unusable, not just a generic statement that there are no tests",
      "max_score": 12
    },
    {
      "name": "No static proof substitution",
      "description": "The report does not treat reading `src/worker.ts` or reasoning about the function as equivalent to runtime verification",
      "max_score": 14
    },
    {
      "name": "Readiness gaps listed",
      "description": "The report names the missing readiness pieces, such as lockfile/install path, test script, boot command, smoke check, or service entrypoint",
      "max_score": 12
    },
    {
      "name": "agent-readiness handoff",
      "description": "The recommended follow-up is `agent-readiness` or an equivalent readiness-building task before review",
      "max_score": 14
    },
    {
      "name": "Surfaces not exercised honestly",
      "description": "The report is honest about what was attempted and what could not be exercised, without requiring a verbose Surfaces Exercised section",
      "max_score": 10
    },
    {
      "name": "No invented tests",
      "description": "The solution does not add ad hoc tests or a custom harness solely to claim verification passed; it reports the repo readiness gap",
      "max_score": 10
    },
    {
      "name": "Compact blocked footer",
      "description": "The final blocked verification footer is no more than 5 labeled lines and does not repeat the missing-infra evidence after listing it once",
      "max_score": 8
    }
  ]
}
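The file declares a `weighted_checklist` but does not say how per-criterion scores are combined. The `max_score` values above sum to 108 rather than 100, so one plausible reading is that the final score is the points earned divided by the total available. The sketch below shows that reading. The `weighted_score` helper, the clamping rule, and the sample awards are all hypothetical and are not taken from the evaluator itself.

```python
# Hypothetical helper: the aggregation rule (earned points / total max points)
# is an assumption, not something the criteria file specifies.
def weighted_score(criteria: dict, awarded: dict) -> float:
    """Return the fraction of available points earned, in [0, 1]."""
    if criteria.get("type") != "weighted_checklist":
        raise ValueError("expected a weighted_checklist")
    total = 0.0
    earned = 0.0
    for item in criteria["checklist"]:
        cap = item["max_score"]
        # Clamp each award to [0, max_score]; criteria with no award earn 0.
        points = min(max(awarded.get(item["name"], 0), 0), cap)
        total += cap
        earned += points
    return earned / total if total else 0.0


# Two criteria copied from the checklist above, paired with made-up awards.
sample = {
    "type": "weighted_checklist",
    "checklist": [
        {"name": "Blocked verdict", "max_score": 16},
        {"name": "Commands attempted", "max_score": 12},
    ],
}
print(weighted_score(sample, {"Blocked verdict": 16, "Commands attempted": 6}))
```

Normalizing by the summed `max_score` keeps the result comparable across scenarios whose checklists carry different total weights.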
