uinaf/verify

Verify your own completed code changes using the repo's existing infrastructure and an independent evaluator context. Use after implementing a change when you need to run unit or integration tests, check build or lint gates, prove the real surface works with evidence, and challenge the changed code for clarity, deduplication, and maintainability. If the repo is not verifiable yet, hand off to `agent-readiness`; if you are reviewing someone else's code, use `review`.

name: verify
description: Self-check your own completed change before handing off to `review` — the pre-review sanity pass. Use when you want to check your work, run checks, validate changes, make sure a change is ready, test it end-to-end, run repo guardrails (lint, typecheck, tests, build), exercise the real surface with evidence, and catch obvious self-correctable issues. Produces a `ready for review` / `needs more work` / `blocked` verdict — never a ship decision. If the repo cannot be booted or exercised reliably, hand off to `agent-readiness`. If auditing someone else's diff, branch, or PR, use `review` instead.

Verify

Self-check your own completed change before handing it off to review. Verify proves the change boots, passes guardrails, and survives the real surface — it is not a substitute for independent review.

Principles

  • Verify is the builder's gate before review; it does not replace review
  • The builder does not grade their own work in the same context — switch into a fresh evaluator context or separate subagent first
  • Run repo guardrails first, then hit the real surface
  • Prefer smoke, integration, contract, or e2e proof over unit tests that mock most of the behavior under test
  • Self-correct obvious issues you spot while exercising the change; leave the rigorous code-shape pass to review
  • Load shared doctrine from the repo's guidance files such as AGENTS.md, CLAUDE.md, or repo rules before judging the result
  • If the infrastructure is too weak to verify reliably, stop and hand off to agent-readiness

Handoffs

  • Verification passed → hand off to review for the independent ship decision
  • No stable boot / smoke / interact path, or infrastructure too weak to trust → use agent-readiness
  • Auditing existing code, a diff, branch, or PR you did not author → use review
  • Main problem is stale AGENTS.md, README, specs, or repo docs → use docs

Before You Start

  1. Define the exact change being verified and the expected user-visible behavior
  2. Switch into an independent evaluator context before judging your own work
  3. Load the target repo's guidance files such as AGENTS.md, CLAUDE.md, or repo rules, when present
  4. Confirm you can boot and interact with the real surface
  5. Pick the smallest check set that can disprove the change honestly

Workflow

1. Run deterministic guardrails first

  • Prefer the repo's built-in entrypoint: `make verify`, `just verify`, `pnpm test`, `cargo test`, or the nearest targeted equivalent
  • When choosing tests, prefer the strongest cheap proof available: smoke, integration, contract, or e2e checks beat mock-heavy unit suites that mainly replay implementation details
  • Swallow boring success output and surface only failures, anomalies, and exact commands
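
The gate-running loop above can be sketched in portable shell. The gate commands at the bottom are placeholders (assumptions), stand-ins for the repo's real entrypoints:

```shell
# Sketch of a guardrail runner: swallow passing output, surface only
# failures together with the exact command that failed. The gate
# commands below are placeholders (assumptions) -- substitute the
# repo's real entrypoints such as `make verify` or `pnpm test`.
run_gate() {
  # $1 is the full command line to run, as one string.
  if out=$(eval "$1" 2>&1); then
    echo "PASS: $1"           # boring success: one line, no output dump
  else
    echo "FAIL: $1"           # failure: surface the exact command...
    printf '%s\n' "$out"      # ...plus its output for diagnosis
    return 1
  fi
}

run_gate "true" || status=failed                 # stands in for a typecheck
run_gate "echo 'boom'; false" || status=failed   # stands in for a test suite
echo "guardrails: ${status:-green}"
```

The runner prints one line per green gate, so a long passing run stays quiet while any failure arrives with the command needed to reproduce it.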

2. Exercise the real surface

  • UI → run the browser automation, navigate the changed flow, and capture screenshots
  • API → hit the local endpoint with a real request such as `curl http://127.0.0.1:3000/health`
  • CLI → run the shipped command such as `node dist/cli.js --help` or the repo's packaged entrypoint
  • state/config → verify round trips, restart behavior, and config boot paths

Follow `references/evidence-rules.md` when collecting proof.
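
For a CLI surface, this kind of smoke check might look like the sketch below; the `echo` command and the expected `usage:` string are assumptions standing in for a real packaged entrypoint:

```shell
# Hedged sketch of a surface smoke check: run the real shipped command
# and assert on its actual output. The echo below is a stand-in
# (assumption) for the repo's entrypoint, e.g. `node dist/cli.js --help`;
# for an API surface the same shape works with a real request, e.g.
#   smoke "curl -fsS http://127.0.0.1:3000/health" "ok"
smoke() {
  # $1 = command line, $2 = string the real output must contain
  out=$(eval "$1" 2>&1) || { echo "SMOKE FAIL (exit): $1"; return 1; }
  case "$out" in
    *"$2"*) echo "SMOKE PASS: $1" ;;
    *)      echo "SMOKE FAIL (output): $1"
            printf '%s\n' "$out"; return 1 ;;
  esac
}

smoke "echo 'usage: demo-cli [options]'" "usage:"
```

Because the helper runs the shipped artifact rather than a mocked unit, a pass here is evidence the real surface works, not just that the code compiles.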

3. Self-correct obvious issues

While exercising the change, fix anything cheap and obvious that you spot:

  • A typo in a log line, a stale comment, an unused import, a duplicated helper inside the diff
  • An `any`, unsafe `as`, or non-null assertion you can replace with a real type in seconds
  • A failure path that swallows errors silently when a one-line throw makes the diagnostic useful

Do not turn this into a full review pass. Substantive code-shape concerns (architecture mismatches, broader duplication, error-classification redesigns) belong to review. Use `references/simplification.md` only as a short self-check, not as a refactoring backlog.
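
A cheap mechanical pass over the diff can catch some of the smells above; the grep patterns here are illustrative assumptions, not a real linter:

```shell
# Self-check sketch: grep added diff lines for obvious, self-correctable
# smells (`any` types, unsafe `as` casts, non-null assertions, stale
# TODOs). The patterns are illustrative assumptions, not a complete
# linter; in a real repo, pipe `git diff` into it.
scan_diff() {
  # Keep only added lines (a single leading '+'), then flag smells.
  grep -E '^\+[^+]' | grep -nE ': any| as any|!\.|TODO' || true
}

printf '%s\n' \
  '+const user = data as any;  // TODO tighten' \
  '-const old = 1;' \
  '+const safe: string = name;' | scan_diff
```

Anything this flags falls squarely in the "cheap and obvious" bucket; anything it cannot express as a one-line pattern is probably review's job.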

4. Probe adjacent risk

  • Check the main happy path
  • Check at least one failure path or edge case
  • Check that at least one exercised failure path returns or logs a useful, actionable error instead of a vague or swallowed failure
  • Re-test any config, persistence, or restart-sensitive behavior touched by the change
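
The failure-path bullets above can be made concrete with a small probe; `demo_fail` is a stand-in (assumption) for the real failing call:

```shell
# Probe sketch for the failure path: force one failure and require the
# surfaced error to be actionable, not vague or swallowed. `demo_fail`
# is a stand-in (assumption) for a real failing call, e.g.
# `curl -fsS http://127.0.0.1:3000/missing`.
check_failure_path() {
  # $1 = command expected to fail, $2 = substring a useful error carries
  if out=$(eval "$1" 2>&1); then
    echo "PROBE FAIL: unexpectedly succeeded: $1"; return 1
  fi
  case "$out" in
    *"$2"*) echo "PROBE PASS: failure surfaced a useful error" ;;
    *)      echo "PROBE FAIL: vague or swallowed error: '$out'"; return 1 ;;
  esac
}

demo_fail() { echo "error: config file ./app.toml not found" >&2; return 1; }
check_failure_path demo_fail "not found"
```

The probe fails in both bad directions: when the "failure path" silently succeeds, and when it fails without saying anything a user could act on.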

5. Synthesize the verdict

Produce one clear outcome:

  • `ready for review` — guardrails green, real surface confirmed, no obvious self-correctable issues left
  • `needs more work` — the change is not ready to be reviewed; specific issues to address are listed
  • `blocked` — verification cannot proceed, usually because infrastructure is too weak (hand off to agent-readiness)

Verify never issues a `ship it` verdict. The independent ship decision is review's job.
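
One way to sketch the three outcomes as a pure decision over what the earlier steps established (the three inputs are assumptions about what the verifier tracked):

```shell
# Minimal verdict sketch: map what steps 1-4 established onto exactly
# one of the three outcomes. The inputs are assumptions about what was
# tracked during verification; note there is no "ship it" branch.
verdict() {
  # $1 = guardrails (green|red), $2 = surface (green|red),
  # $3 = infrastructure (ok|weak)
  if [ "$3" = "weak" ]; then
    echo "blocked"                  # hand off to agent-readiness
  elif [ "$1" = "green" ] && [ "$2" = "green" ]; then
    echo "ready for review"
  else
    echo "needs more work"
  fi
}

verdict green green ok    # -> ready for review
verdict red   green ok    # -> needs more work
verdict green green weak  # -> blocked
```

Weak infrastructure wins over everything else: if the repo cannot be verified reliably, no amount of green output upgrades the verdict.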

Output

After verification, report in this compact bullet shape:

  • verdict: exactly one of `ready for review`, `needs more work`, or `blocked`
  • evidence: concise explanations of what checks proved, not full commands
  • fixed during verify: only if self-corrections happened
  • unverified or gaps: readiness gaps, doc drift, or none
  • next: review, agent-readiness, docs, or more implementation

Keep the final answer short:

  • Put detailed failures, screenshots, traces, and file references in native findings or the work log, not in the footer
  • Do not repeat command output that already appeared in the terminal
  • Keep the footer to 5 labeled lines or fewer
  • Omit fixed during verify when nothing was corrected
  • Summarize passing checks by intent and result, for example "typecheck passed for tv-vite" or "API smoke check returned 200"; include full commands only when they failed, are needed for reproduction, or the user asks for them
  • For failures or blocked checks, include the relevant error/status line or response snippet; do not hide the only evidence in terminal output

Example:

- verdict: ready for review
- evidence: retry tests covered success and failure paths; API retry smoke returned 200
- unverified or gaps: none
- next: review

References

  • `references/evidence-rules.md`
  • `references/simplification.md`