General-purpose coding policy for Baruch's AI agents
96
Does it follow best practices? 90%
Impact: 97% (average score across 14 eval scenarios)
Passed: no known issues