{
  "context": "Tests whether the agent converts a universal rule to a conditional one entirely in the rule file frontmatter: flip alwaysApply from true to false AND add an applyTo field that follows the glob-plus-prose em-dash pattern. In the plugin manifest form, scope lives only in the rule file frontmatter: .tessl-plugin/plugin.json lists rule paths and carries no per-rule config, so the conversion does not touch the manifest. Baseline agents typically fail in one of three ways: they add applyTo but forget to flip the existing alwaysApply: true (leaving the rule universal); they omit the natural-language clause; or, carrying over the legacy tile.json mental model, they try to flip a non-existent manifest scope field or inject a steering map into plugin.json. The plugin prescribes the frontmatter-only conversion with the manifest left untouched.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Rule file frontmatter flipped to alwaysApply: false",
      "description": "The rules/pr-template-checklist.md frontmatter's alwaysApply value is changed from true to false. Scores zero if the value remains true (the rule stays universal — the most common baseline failure) or if the field is removed entirely",
      "max_score": 22
    },
    {
      "name": "Rule file frontmatter gains applyTo with glob patterns",
      "description": "The rules/pr-template-checklist.md frontmatter contains a new applyTo field (or accepted alias: globs, paths) whose value includes glob patterns matching PR-related artifacts. At least one pattern must match `.github/PULL_REQUEST_TEMPLATE` (with or without the `.md` extension, and in either the single-file or the directory form). Scores zero if the field is missing or contains no glob patterns at all; partial credit (12 of 22) if globs are present but none matches PR templates specifically",
      "max_score": 22
    },
    {
      "name": "applyTo value combines globs with a natural-language clause",
      "description": "The applyTo value includes both a glob list and a natural-language clause separated by a literal em dash (—, U+2014), e.g., '.github/PULL_REQUEST_TEMPLATE*, CONTRIBUTING.md — when authoring or editing PR artifacts'. The clause expresses the action-level scope in prose. Scores zero if the value is glob-only, prose-only, or uses a different separator (hyphen, double hyphen, en dash) where the rule prescribes an em dash",
      "max_score": 18
    },
    {
      "name": "plugin.json carries no per-rule config and its rules array is intact",
      "description": "The .tessl-plugin/plugin.json rules array still lists rules/pr-template-checklist.md (and rules/commit-conventions.md), and the agent does NOT add a steering map or any per-rule alwaysApply/applyTo field to the manifest. Scope is declared only in the rule file frontmatter. Scores zero if the agent injects manifest-level scope (the legacy tile.json model) or drops the rule path from the array",
      "max_score": 14
    },
    {
      "name": "Rule body content is preserved unchanged",
      "description": "The body of rules/pr-template-checklist.md (everything after the frontmatter block) is byte-identical to the input. The scenario is a frontmatter-only edit; modifying the body bullets is out of scope and counts against the agent",
      "max_score": 12
    },
    {
      "name": "Existing rule (commit-conventions) is preserved unchanged",
      "description": "The rules/commit-conventions.md path remains in the plugin.json rules array. Scores zero if it is dropped or renamed",
      "max_score": 12
    }
  ]
}