General-purpose coding policy for Baruch's AI agents
When the agent's interactive Step-2 (PR create) and Step-7 (merge + cleanup) instructions get encoded into a reusable script that other devs run unattended (release.sh, merge-and-cleanup.sh, etc.), the script has to carry the SAME gates the interactive agent does — otherwise the script bypasses the conventions this skill exists to hold. Pulled out of SKILL.md to keep the main workflow scannable.
The script MUST enforce all of:
- PR title in the form <type>(<scope>): <imperative summary>. Taking the title as a raw argument and passing it straight to gh pr create defeats the convention; either build the title from <type>, <scope>, <summary> inputs, or regex-validate the supplied title before push.
- **Author-Model:** line, preferred bold form: the script should preserve or emit the bold marker (**Author-Model:**) per rules/author-model-declaration.md's "Explicit (preferred)" form. The reviewer prompts also accept bare Author-Model: as a fallback, so the script won't break by emitting the unstyled form, but a wrapper that drops the marker line entirely turns every reviewer run into an early REQUEST_CHANGES (the rule's "Neither present is a policy violation" clause).

The script MUST enforce all of:
- Merge gates: CI status is success (or none), every bot review is APPROVED or non-blocking COMMENTED, and no review thread is unresolved. Fail loudly if any gate is red instead of proceeding.
- Safe branch deletion: git branch -d <branch> (refuses to drop unmerged work), never git branch -D. The whole point of the cleanup is "ship and tidy"; clobbering an unmerged branch with -D defeats the safety the merge gate just established.
- Worktree-aware cleanup: if the script runs from an additional worktree (see rules/agent-worktree-isolation.md), it must detect that case and run the worktree variant of cleanup: cd to the base checkout, fast-forward main, git worktree remove <worktree-root>, then git branch -d <branch>. Three derivations the wrapper needs:
  - Worktree detection: [ "$(git rev-parse --git-dir)" != "$(git rev-parse --git-common-dir)" ]; true means you're in an additional worktree.
  - Worktree root: git rev-parse --show-toplevel (required by git worktree remove, which refuses subdirectory paths with "not a working tree").
  - Base checkout: git worktree list --porcelain | awk '/^worktree / {print $2; exit}' (the first worktree entry is always the main checkout).
  Skipping this case makes git branch -d fail with a confusing "checked out at <path>" error and leaves the worktree directory plus .git/worktrees/ metadata stranded; the script must surface a coherent teardown, not that error.
- Post-merge verification, in three parts:
  1. main actually advanced to the merge commit.
  2. The publish/release workflow reached a terminal state. Derive the merge commit SHA, resolve the publish run whose headSha equals that SHA, then gh run watch <id> on that exact run. This is a timing precondition for the registry check in (3), not the gate: the exit code is approximate, not authoritative (see rules/ci-safety.md for why). Never select by "latest on main" or any branch-only/limit-1 heuristic: main may advance again before the script starts watching, producing a false positive against the wrong run. Also never reduce this to "check it triggered"; fire-and-forget would let the script proceed to the registry check while the publish was still in flight.
  3. The registry advanced. Capture the registry's Latest Version BEFORE the merge as a baseline (Tessl: tessl tile info <workspace>/<tile>; other registries: npm view, pypi JSON, etc.) and confirm post-publish that it has increased past that baseline. Registry-advanced is the authoritative signal of "publish landed"; do NOT use the publish workflow's exit code as the gate (workflows can succeed without running the publish step or fail in post-publish cleanup; the registry's actual state is the only ground truth). Do not derive an expected version from the merge SHA's manifest and compare against it: the bump runs inside the publish workflow after that SHA is read (the merge SHA still has the pre-bump value), and another merge interleaving in a busy repo would advance the registry past your specific version anyway (not a failure).
  The script must NOT encode "moderation hold", "moderation queue", or "moderation rejection" as expected intermediate states or retry/wait conditions: per rules/ci-safety.md, the Tessl registry never rejects a published version (moderationPassed: false means won't-surface-in-search; a security finding means install-flag-required; neither is rejection). This follows rules/ci-safety.md's extended Always Watch CI duty.

A scripted run is unattended. Anything the interactive agent enforces by reading SKILL.md only protects the sessions where the SKILL.md is in the agent's context. A script that doesn't carry the same gates internally hands every consumer a sharper version of the bypass-by-automation problem.
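
Two of the gates above reduce to plain string checks, so a wrapper can carry them without any network access. The following is a minimal sketch, not the skill's own implementation: the function names validate_pr_title and registry_advanced are hypothetical, the allowed character sets for <type> and <scope> are an assumption (the rules do not spell them out here), and the version comparison assumes dotted numeric versions and a sort that supports -V (GNU coreutils).

```shell
#!/bin/sh
# Hypothetical gate helpers for a release wrapper. Names and character
# classes are illustrative assumptions, not taken from the rules files.

# Title gate: succeed only if the title looks like
# "<type>(<scope>): <summary>".
# Assumption: type is lowercase letters; scope is lowercase letters,
# digits, and the characters . _ / -
validate_pr_title() {
  if printf '%s\n' "$1" |
    grep -Eq '^[a-z]+\([a-z0-9._/-]+\): [^[:space:]].*$'; then
    return 0
  fi
  printf 'error: title is not <type>(<scope>): <summary>: %s\n' "$1" >&2
  return 1
}

# Registry gate: succeed only if the current version ($2) is strictly
# greater than the baseline ($1) captured BEFORE the merge.
# Assumption: dotted numeric versions; pre-release tags are not handled.
registry_advanced() {
  # Equal versions mean the registry has not moved.
  [ "$1" != "$2" ] || return 1
  # sort -V orders version strings; the larger one ends up last.
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$2" ]
}
```

A wrapper would call validate_pr_title "$title" || exit 1 before gh pr create, and would read the baseline from the registry before merging, then poll the registry after the publish run finishes and pass both values to registry_advanced. Comparing against the baseline rather than an expected version keeps the check valid when another merge publishes in between.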