Reviews repositories, pull requests, diffs, and agent-generated code for reward hacking, fake completion, defensive theater, architectural bypasses, weakened guarantees, hidden fallbacks, and misleading abstractions.
98
97%
Does it follow best practices?
Impact
100%
1.09x Average score across 6 eval scenarios
Passed
No known issues
Scanned