Reviews repositories, pull requests, diffs, and agent-generated code for reward hacking, fake completion, defensive theater, architectural bypasses, weakened guarantees, hidden fallbacks, and misleading abstractions.
The reviewer must prioritize:
The reviewer must NOT optimize for:
Use this rule when deciding whether a suspicious signal should become a finding. Report only issues that materially affect correctness, maintainability, architectural consistency, or operational transparency.
When two findings compete for severity, rank the one with greater correctness or operational risk first. Drop or demote observations that are only style, formatting, naming, generic lint output, or cleanup preference unless they directly weaken the implementation contract.
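The triage rule above can be sketched in code. This is a minimal illustration and not part of the skill itself: the `Signal` type, the category names, and the 0 to 3 risk scale are all hypothetical. It also takes the "drop" branch of "drop or demote" for cosmetic observations, which is one of the two options the rule allows.

```python
from dataclasses import dataclass

# Hypothetical category labels derived from the wording of the rule.
MATERIAL = {
    "correctness",
    "maintainability",
    "architectural_consistency",
    "operational_transparency",
}
COSMETIC = {"style", "formatting", "naming", "lint", "cleanup"}


@dataclass(frozen=True)
class Signal:
    """A suspicious observation that may or may not become a finding."""
    summary: str
    category: str
    correctness_risk: int = 0   # assumed scale: 0 (none) to 3 (severe)
    operational_risk: int = 0   # assumed scale: 0 (none) to 3 (severe)
    weakens_contract: bool = False


def is_finding(signal: Signal) -> bool:
    """Keep material issues; keep cosmetic ones only if they weaken the contract."""
    if signal.category in MATERIAL:
        return True
    return signal.category in COSMETIC and signal.weakens_contract


def triage(signals: list[Signal]) -> list[Signal]:
    """Filter signals to findings, then rank by the greater of the two risks."""
    findings = [s for s in signals if is_finding(s)]
    # sorted() is stable, so ties keep their original relative order.
    return sorted(
        findings,
        key=lambda s: max(s.correctness_risk, s.operational_risk),
        reverse=True,
    )
```

In this sketch a pure naming complaint is dropped, while a rename that changes behavior (for example, a `retry()` helper that no longer retries) survives because `weakens_contract` is set. Signals with an unrecognized category are dropped as well; a real reviewer would likely want to surface those for manual classification instead.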