Use when the user wants an adversarial double-check of a code or config change. Run the strongest checks available, try to break the claim, look for edge cases and hidden regressions, and return PASS, PARTIAL, or FAIL with evidence. Good triggers include "poke holes in this", "stress test this change", "double check this fix", and "try to break it".
84
94%
Does it follow best practices?
Impact
81%
1.30x
Average score across 8 eval scenarios
Passed
No known issues
A backend engineer on the payments team submitted a PR claiming to fix a long-standing bug in the pagination utility. Users had been reporting that requesting the "last page" of transaction records sometimes returned duplicate items or skipped the final record depending on the total record count. The engineer's fix modifies how the page boundary is calculated when the total count is an exact multiple of the page size.
The code has been reviewed by one other engineer who approved it, but the team lead wants an adversarial second opinion before merging, because the last time a pagination fix shipped it broke exports for enterprise customers. You have been asked to stress test the claim that this fix correctly handles all boundary conditions.
Produce a single markdown file named verification_report.md containing your full adversarial verification of the fix. Include the steps you took to test it and your conclusion.
The following files are provided as inputs. Extract them before beginning.
=============== FILE: src/paginate.py ===============
def paginate(items, page_size, page_number):
    """
    Returns a slice of items for the given 1-based page number.
    page_size: number of items per page
    page_number: 1-based index of the desired page
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    start = (page_number - 1) * page_size
    end = start + page_size
    return items[start:end]


def total_pages(total_items, page_size):
    """
    Returns the total number of pages needed to display all items.
    FIXED: previously used integer division which gave wrong result
    when total_items is an exact multiple of page_size.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Old (buggy): return total_items // page_size
    return (total_items + page_size - 1) // page_size


def get_last_page_items(items, page_size):
    """Returns items on the last page."""
    n_pages = total_pages(len(items), page_size)
    return paginate(items, page_size, n_pages)
=============== FILE: tests/test_paginate.py ===============
import sys
sys.path.insert(0, 'src')
from paginate import paginate, total_pages, get_last_page_items


def test_normal_case():
    items = list(range(10))
    assert paginate(items, 3, 1) == [0, 1, 2]
    assert paginate(items, 3, 3) == [6, 7, 8]


def test_total_pages():
    assert total_pages(10, 3) == 4
    assert total_pages(9, 3) == 3  # exact multiple
    assert total_pages(0, 3) == 0


def test_last_page():
    items = list(range(9))
    result = get_last_page_items(items, 3)
    assert result == [6, 7, 8]


if __name__ == "__main__":
    passed = 0
    failed = 0
    for name, fn in [("test_normal_case", test_normal_case),
                     ("test_total_pages", test_total_pages),
                     ("test_last_page", test_last_page)]:
        try:
            fn()
            print(f"PASS: {name}")
            passed += 1
        except AssertionError as e:
            print(f"FAIL: {name} - {e}")
            failed += 1
    print(f"\n{passed} passed, {failed} failed")
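The provided tests cover only three cases, so a boundary sweep is one way to start probing the claim. The following is a minimal sketch, not part of the provided inputs: `paginate` and `total_pages` are inlined copies of the functions in src/paginate.py so the block runs on its own, and `check_boundaries` with its `max_items` and `max_page_size` parameters is a hypothetical helper introduced here for illustration.

```python
import math


# Inlined copies of the functions from src/paginate.py (docstrings omitted).
def paginate(items, page_size, page_number):
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    start = (page_number - 1) * page_size
    end = start + page_size
    return items[start:end]


def total_pages(total_items, page_size):
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return (total_items + page_size - 1) // page_size


def check_boundaries(max_items=30, max_page_size=7):
    """Hypothetical helper: sweep small sizes and collect any violations.

    Returns a list of (description, item_count, page_size) tuples.
    """
    findings = []
    for page_size in range(1, max_page_size + 1):
        for n in range(max_items + 1):
            items = list(range(n))
            n_pages = total_pages(n, page_size)

            # The page count should match the mathematical ceiling of n / page_size.
            if n_pages != math.ceil(n / page_size):
                findings.append(("page count mismatch", n, page_size))

            # Concatenating every page should rebuild the input exactly,
            # which rules out duplicated or skipped records.
            rebuilt = []
            for page in range(1, n_pages + 1):
                rebuilt.extend(paginate(items, page_size, page))
            if rebuilt != items:
                findings.append(("pages do not rebuild input", n, page_size))

            # When there is at least one item, the last page must not be empty.
            if n > 0 and not paginate(items, page_size, n_pages):
                findings.append(("empty last page", n, page_size))
    return findings


if __name__ == "__main__":
    print(check_boundaries())
    # Out-of-range page numbers are a separate probe: paginate does not
    # validate page_number, so Python's negative slicing applies.
    print(paginate(list(range(10)), 3, 0))
    print(paginate(list(range(10)), 3, -1))
```

An empty result from the sweep only shows that in-range pages behave for the sizes tried; it says nothing about inputs outside that range. The last two probes are worth recording in a report regardless, because a page number of 0 or below is silently accepted rather than rejected.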