uinaf/verify

Verify your own completed code changes using the repo's existing infrastructure and an independent evaluator context. Use after implementing a change when you need to run unit or integration tests, check build or lint gates, prove the real surface works with evidence, and challenge the changed code for clarity, deduplication, and maintainability. If the repo is not verifiable yet, hand off to `agent-readiness`; if you are reviewing someone else's code, use `review`.

1.02x

Quality

100%

Does it follow best practices?

Impact

89%

1.02x

Average score across 3 eval scenarios

Securityby

Passed

No known issues

Calculator Library Verification

Name: uinaf/verify
Rating: 97.8 (1 reviews)
Author: uinaf

Problem/Feature Description

A developer on the data team has updated a shared Python calculator utility that is used across several internal reporting scripts. The update added a calculate dispatcher function that routes operations by name, and the developer says all the tests pass. Before the change is merged and the library is published to the internal package registry, another engineer needs to independently verify it.

The team uses pytest and a Makefile for their standard verification workflow. Your job is to verify the changes, exercise the real behavior (not just read the code), and produce a complete verification report. The previous version of this library had a silent failure bug where division errors were swallowed — so error path coverage is important.

Output Specification

Produce a file named verification-report.md containing a complete report with all of the following sections:

Verdict: one of ship it, needs review, or blocked
Change Verified: what was tested and confirmed
Surfaces Exercised: the exact commands or calls used
Code-Shape Findings: observations on code clarity, duplication, dead code, or error handling quality
Top Findings: any issues found, with severity
Exact Evidence: commands run and output received
Readiness Gaps: any gaps in test coverage or infrastructure
Recommended Follow-up: suggested next steps

Input Files

The following files are provided as inputs. Extract them before beginning.

=============== FILE: inputs/calculator.py =============== def add(a, b): return a + b

def subtract(a, b): return a - b

def multiply(a, b): return a * b

def divide(a, b): if b == 0: raise ValueError("Cannot divide by zero") return a / b

def calculate(operation, a, b): ops = { 'add': add, 'subtract': subtract, 'multiply': multiply, 'divide': divide, } if operation not in ops: raise ValueError(f"Unknown operation: {operation}") return ops[operation](a, b)

=============== FILE: inputs/test_calculator.py =============== import pytest import sys import os sys.path.insert(0, os.path.dirname(file)) from calculator import add, subtract, multiply, divide, calculate

def test_add(): assert add(2, 3) == 5

def test_subtract(): assert subtract(5, 3) == 2

def test_multiply(): assert multiply(3, 4) == 12

def test_divide(): assert divide(10, 2) == 5.0

def test_divide_by_zero(): with pytest.raises(ValueError, match="Cannot divide by zero"): divide(5, 0)

def test_calculate_add(): assert calculate('add', 10, 5) == 15

def test_calculate_unknown_op(): with pytest.raises(ValueError, match="Unknown operation"): calculate('modulo', 5, 3)

=============== FILE: inputs/Makefile =============== .PHONY: test verify lint

verify: test

test: cd inputs && python -m pytest test_calculator.py -v

lint: cd inputs && python -m flake8 calculator.py

evals

scenario-1

scenario-2

criteria.json

task.md

scenario-3

references

SKILL.md

tile.json

uinaf/verify

task.md.css-3qkkll{font-size:var(--chakra-font-sizes-sm);font-weight:var(--chakra-font-weights-normal);color:var(--chakra-colors-gray-300);}evals/scenario-2/

Calculator Library Verification

Problem/Feature Description

Output Specification

Input Files

task.mdevals/scenario-2/