General-purpose coding policy for Baruch's AI agents
91
Does it follow best practices? 92%
Impact: 91% (average score across 9 eval scenarios)
Advisory: suggest reviewing before use
test_*.py
*.test.ts
*_test.go
evals
scenario-1
scenario-2
scenario-3
scenario-4
scenario-5
scenario-6
scenario-7
scenario-8
scenario-9
rules
skills
eval-authoring
release