General-purpose coding policy for Baruch's AI agents
90
91%
Does it follow best practices?
Impact
90%
1.76xAverage score across 18 eval scenarios
Advisory
Suggest reviewing before use
{
"context": "Tests the migrate-to-plugin skill's judgment core: after the deterministic manifest migration, the agent must keep live contract surfaces untouched rather than blindly replacing every `tile` with `plugin`. A baseline agent told to 'migrate to plugin form' tends to sweep every 'tile', which breaks the v1/tiles REST route and code identifiers; the skill teaches the keep boundary. Weight concentrates on those keep criteria plus the distinguish-prose-from-contract judgment — that is where the skill beats baseline.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Runs the migration mechanics",
"description": "The plan converts the manifest with `tessl plugin migrate` (or the skill's migrate script), renames `.tileignore` to `.tesslignore`, and removes the now-obsolete root `tile.json`. Fails if it skips the conversion, hand-writes `.tessl-plugin/plugin.json`, or leaves tile.json in place as the live manifest",
"max_score": 14
},
{
"name": "Renames the tessl tile CLI alias",
"description": "`tessl tile info` in check-publish.sh becomes `tessl plugin info` (the canonical command alias). Fails if the CLI invocation is left as `tessl tile`",
"max_score": 12
},
{
"name": "Keeps the v1/tiles API route",
"description": "The `tessl api v1/tiles/<workspace>/<name>/versions/<version>` route is left unchanged — it is a live REST endpoint, not prose. Fails if the plan rewrites it to `v1/plugins/...` or otherwise alters the `v1/tiles` path",
"max_score": 24
},
{
"name": "Keeps code identifiers",
"description": "The `tileRegistry` variable and `fetchTile` function in registry.ts stay as-is — renaming identifiers is out of scope and risks breaking call sites. Fails if the plan renames either identifier to a 'plugin' form",
"max_score": 20
},
{
"name": "Keeps the legacy tile.json reference",
"description": "The CHANGELOG line 'Migrated the build from a hand-written tile.json' keeps the literal `tile.json` — it is a factual historical reference to the old manifest filename. Fails if the plan rewrites it to plugin.json, which would make the entry false",
"max_score": 14
},
{
"name": "Distinguishes prose from contract rather than blanket-replacing",
"description": "The plan treats the edits per-occurrence (renaming package-sense prose, preserving contracts) rather than proposing a global s/tile/plugin/ across the repo. Fails if it describes or implies an undifferentiated find-and-replace",
"max_score": 16
}
]
}.tessl-plugin
evals
scenario-1
scenario-2
scenario-3
scenario-4
scenario-5
scenario-6
scenario-7
scenario-8
scenario-9
scenario-10
scenario-11
scenario-12
scenario-13
scenario-14
scenario-15
scenario-16
scenario-17
scenario-18
rules
skills
adopt-fork-pr
eval-curation
install-reviewer
migrate-to-plugin