General-purpose coding policy for Baruch's AI agents
95
91%
Does it follow best practices?
Impact
96%
1.31x
Average score across 10 eval scenarios
Advisory
Suggest reviewing before use
tessl scenario generate skews toward happy-path scenarios; write negative cases by hand using existing scenarios as a structural template
fix/*" is a task with the answer smuggled in.
tessl/tiles/... paths, tile-only identifiers
gh pr create, REST endpoints, conventional-commits format, semver
Fixed in <sha>), chosen flags (--ff-only), specific sequences, invented format literals. A competent engineer without the tile would not produce those specific choices; that is precisely why they measure tile value. Checking for them is measuring application, not leaking
gh pr merge" is public. "Uses createJwtToken internal action" is tile-internal
with-context score (tile loaded) and the baseline score (tile not loaded). A scenario with near-zero lift on a positive case is telling you one of three things:
fixture-2025-04-17.json)
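The comparison between the with-context score and the baseline score can be illustrated with a small sketch. It assumes that lift is the plain difference between the two scores on a 0.0 to 1.0 scale; the source does not give the formula. The `ScenarioResult` type, the scenario names, the scores, and the 0.05 threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    """Hypothetical record of one eval scenario (not a tessl data type)."""
    name: str
    with_context: float  # score with the tile loaded, 0.0 to 1.0
    baseline: float      # score with the tile not loaded, 0.0 to 1.0
    positive: bool = True  # False for hand-written negative cases


def lift(result: ScenarioResult) -> float:
    # Assumption: lift is the absolute difference between the two scores.
    return result.with_context - result.baseline


def near_zero_lift(results, threshold=0.05):
    # Flag positive scenarios whose lift is too small to show tile value.
    # The threshold is an illustrative value, not one taken from the source.
    return [r.name for r in results if r.positive and abs(lift(r)) < threshold]


# Made-up example data.
results = [
    ScenarioResult("merge-flow", with_context=0.95, baseline=0.40),
    ScenarioResult("commit-format", with_context=0.90, baseline=0.88),
    ScenarioResult("negative-case", with_context=0.50, baseline=0.50, positive=False),
]

print(near_zero_lift(results))  # ['commit-format']
```

Only positive scenarios are flagged here, since the text singles out near-zero lift on a positive case as the signal worth investigating.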