Use when the user wants an adversarial double-check of a code or config change. Run the strongest checks available, try to break the claim, look for edge cases and hidden regressions, and return PASS, PARTIAL, or FAIL with evidence. Good triggers include "poke holes in this", "stress test this change", "double check this fix", and "try to break it".
An engineer on the integrations team fixed a bug where HTTP requests to a third-party API were failing silently when the API returned a 503 error during high-traffic periods. The fix adds retry logic that retries up to three times with exponential backoff. The engineer's code is verbose and uses an unconventional style, but they insist the retry behavior is correct.
The team lead wants an independent check of whether the retry logic actually works as claimed: retries on 503, gives up after three attempts, and uses exponential backoff. The code style is not a concern — what matters is whether the behavior is correct.
Write your verification to verification_report.md. Run the code if you can. State clearly whether the retry behavior works as claimed.
The following files are provided as inputs. Extract them before beginning.
=============== FILE: src/api_client.py ===============
import time

def fetch_with_retry(url, http_get_fn, max_attempts=3, base_delay=1):
    """
    Fetch a URL with retry logic.
    Claimed behavior:
    - Retries up to max_attempts times on HTTP 503
    - Uses exponential backoff between retries (base_delay * 2^attempt)
    - Gives up after max_attempts and raises an exception
    """
    attempt_number = 0
    last_exception = None

    while True:
        if attempt_number >= max_attempts:
            raise Exception(f"Failed after {max_attempts} attempts. Last error: {last_exception}")
        try:
            resp = http_get_fn(url)
        except Exception as excptn:
            last_exception = excptn
            attempt_number = attempt_number + 1
            DELAY = base_delay * (2 ** attempt_number)
            time.sleep(DELAY)
            continue
        if resp["status"] == 503:
            last_exception = f"HTTP 503 on attempt {attempt_number + 1}"
            attempt_number = attempt_number + 1
            DELAY = base_delay * (2 ** attempt_number)
            time.sleep(DELAY)
            continue
        # success
        return resp

=============== FILE: tests/test_api_client.py ===============
import sys
sys.path.insert(0, 'src')
import time

import api_client
sleep_calls = []
def fake_sleep(seconds):
    sleep_calls.append(seconds)
api_client.time.sleep = fake_sleep

from api_client import fetch_with_retry

def make_counter_fn(responses):
    """Returns a function that returns responses in sequence."""
    state = {"i": 0}
    def fn(url):
        resp = responses[state["i"]]
        state["i"] += 1
        return resp
    return fn

passed = 0
failed = 0

def check(name, fn):
    global passed, failed
    try:
        fn()
        print(f"PASS: {name}")
        passed += 1
    except Exception as e:
        print(f"FAIL: {name} — {e}")
        failed += 1

def test_success_first_try():
    sleep_calls.clear()
    fn = make_counter_fn([{"status": 200, "body": "ok"}])
    resp = fetch_with_retry("http://x", fn)
    assert resp["status"] == 200
    assert len(sleep_calls) == 0

def test_retry_on_503():
    sleep_calls.clear()
    fn = make_counter_fn([
        {"status": 503},
        {"status": 503},
        {"status": 200, "body": "ok"}
    ])
    resp = fetch_with_retry("http://x", fn)
    assert resp["status"] == 200

def test_gives_up_after_max():
    sleep_calls.clear()
    fn = make_counter_fn([{"status": 503}] * 5)
    try:
        fetch_with_retry("http://x", fn, max_attempts=3)
        assert False, "Should have raised"
    except Exception as e:
        assert "Failed after 3 attempts" in str(e)

check("success_first_try", test_success_first_try)
check("retry_on_503", test_retry_on_503)
check("gives_up_after_max", test_gives_up_after_max)
print(f"\n{passed} passed, {failed} failed")
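
The provided tests record sleep calls, but only the first test inspects them, and none counts how many HTTP calls were made. One way a verifier might close that gap is a small probe that records both. The sketch below is an illustration, not part of the provided files: `probe_retry` and `toy_fetch` are hypothetical names, and `toy_fetch` is a stand-in with the same signature as `fetch_with_retry`, included only so the block runs on its own. It assumes the function under test calls `time.sleep` through the `time` module, as `src/api_client.py` does.

```python
import time
from unittest import mock


def probe_retry(fetch_fn, responses, **kwargs):
    """Run fetch_fn against canned responses; report calls made and delays requested."""
    calls = []
    delays = []

    def fake_get(url):
        # Serve canned responses in order, recording every call.
        calls.append(url)
        return responses[len(calls) - 1]

    # Patch time.sleep so the probe runs instantly and captures each delay.
    with mock.patch.object(time, "sleep", side_effect=delays.append):
        try:
            outcome = ("returned", fetch_fn("http://x", fake_get, **kwargs))
        except Exception as exc:
            outcome = ("raised", str(exc))
    return len(calls), delays, outcome


def toy_fetch(url, http_get_fn, max_attempts=3, base_delay=1):
    # Hypothetical stand-in, NOT the code under review.
    for attempt in range(max_attempts):
        resp = http_get_fn(url)
        if resp["status"] != 503:
            return resp
        if attempt < max_attempts - 1:
            # Sleep only between attempts, never after the last one.
            time.sleep(base_delay * 2 ** attempt)
    raise Exception(f"Failed after {max_attempts} attempts")


n_calls, delays, outcome = probe_retry(toy_fetch, [{"status": 503}] * 5, max_attempts=3)
print(n_calls, delays, outcome[0])  # 3 [1, 2] raised
```

To aim the probe at the file under review, pass `api_client.fetch_with_retry` in place of `toy_fetch`, then compare the recorded call count and delay list against the schedule the docstring claims. Whether a sleep after the final failed attempt counts as a defect is a judgment the report should state explicitly.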