CtrlK
BlogDocsLog inGet started
Tessl Logo

jbaruch/coding-policy

General-purpose coding policy for Baruch's AI agents

91

1.24x
Quality

92%

Does it follow best practices?

Impact

91%

1.24x

Average score across 9 eval scenarios

SecuritybySnyk

Advisory

Suggest reviewing before use

Overview
Quality
Evals
Security
Files

criteria.jsonevals/scenario-1/

{
  "context": "Tests whether the agent follows the correct merge command with the right flags, uses safe git pull semantics post-merge, properly cleans up local and remote branch refs, and performs the required post-merge verification steps including confirming the publish workflow was triggered.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Merge flags: --merge",
      "description": "Uses `gh pr merge` with the `--merge` flag (not `--squash` or `--rebase`)",
      "max_score": 10
    },
    {
      "name": "Merge flags: --delete-branch",
      "description": "Uses `gh pr merge` with the `--delete-branch` flag to delete the remote branch on merge",
      "max_score": 10
    },
    {
      "name": "Fast-forward only pull",
      "description": "After switching to main, uses `git pull --ff-only` (not plain `git pull` or `git pull --rebase`)",
      "max_score": 12
    },
    {
      "name": "Checkout main first",
      "description": "Switches to the main branch (`git checkout main`) before running `git pull`",
      "max_score": 8
    },
    {
      "name": "Local branch deletion",
      "description": "Deletes the local feature branch with `git branch -d` (not force-delete `-D`)",
      "max_score": 10
    },
    {
      "name": "Remote ref pruning",
      "description": "Runs `git remote prune origin` to clean up stale remote-tracking refs",
      "max_score": 10
    },
    {
      "name": "Verify merge on main",
      "description": "Includes a step to verify the merged commit is present on the main branch (e.g. `git log`, `gh pr view`, or checking PR status)",
      "max_score": 10
    },
    {
      "name": "Publish CI check",
      "description": "Includes a step to check that the publish/release CI workflow was triggered after the merge (e.g. checking workflow runs)",
      "max_score": 12
    },
    {
      "name": "Report merged PR URL",
      "description": "Script or documentation includes reporting the merged PR URL as part of the outcome summary",
      "max_score": 8
    },
    {
      "name": "Pre-merge: CI green gate",
      "description": "Script or checklist requires confirming CI is green before merging — does not proceed with merge if CI is failing",
      "max_score": 10
    }
  ]
}

evals

scenario-1

criteria.json

task.md

README.md

tile.json