Auto-generated tile from GitHub (10 skills)
92
94%
Does it follow best practices?
Impact
92%
1.16x
Average score across 44 eval scenarios
Advisory
Suggest reviewing before use
{
"context": "Tests whether the agent uses proper gh CLI flags for PR creation, avoids common pitfalls like --body flag newline escaping, includes CI monitoring via gh pr checks, and avoids AI attribution. The script should use gh CLI exclusively for all GitHub operations.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Uses --body-file flag",
"description": "The script uses --body-file (not --body) when passing PR description to gh pr create",
"max_score": 15
},
{
"name": "Explicit --base flag",
"description": "The gh pr create call includes an explicit --base flag (e.g. --base main)",
"max_score": 10
},
{
"name": "Explicit --head flag",
"description": "The gh pr create call includes an explicit --head flag referencing the feature branch",
"max_score": 10
},
{
"name": "CI watch command",
"description": "The script includes 'gh pr checks' with the '--watch' flag to monitor CI after PR creation",
"max_score": 15
},
{
"name": "Stderr redirect in CI watch",
"description": "The gh pr checks --watch call redirects stderr to stdout (2>&1) to capture full output",
"max_score": 10
},
{
"name": "Temp file for PR body",
"description": "The PR body is written to a temporary file (e.g. /tmp/ or mktemp) before being passed to --body-file",
"max_score": 10
},
{
"name": "gh CLI only",
"description": "No references to the GitHub web interface (no URLs like github.com/*/pull/new or browser-open instructions for creating PRs)",
"max_score": 10
},
{
"name": "No AI attribution",
"description": "The script and README do NOT contain 'Co-Authored-By: Claude' or any AI co-authorship marker",
"max_score": 10
},
{
"name": "PR number used for checks",
"description": "The gh pr checks command uses a PR number or identifier extracted from the gh pr create output, not a hardcoded value",
"max_score": 10
}
]
}
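The checklist items above can be sketched as a small shell script. This is a minimal illustration rather than the reference solution the eval expects: the function names (`create_pr`, `pr_number_from_url`, `watch_checks`) and all argument values are hypothetical, and it assumes `gh pr create` prints the new PR's URL on stdout.

```shell
#!/usr/bin/env bash
# Sketch only: function names and arguments are hypothetical placeholders.

# Create a PR whose body is read from a temporary file; prints the PR URL.
create_pr() {
  local base="$1" head="$2" title="$3" body="$4"
  local body_file url status=0
  body_file="$(mktemp)"
  # Writing the body to a file keeps newlines intact (avoids --body escaping issues).
  printf '%s\n' "$body" > "$body_file"
  url="$(gh pr create --base "$base" --head "$head" \
    --title "$title" --body-file "$body_file")" || status=$?
  rm -f "$body_file"
  [ "$status" -eq 0 ] || return "$status"
  printf '%s\n' "$url"
}

# Take the trailing path segment of a PR URL (assumed to end in /pull/<number>).
pr_number_from_url() {
  printf '%s\n' "${1##*/}"
}

# Watch CI for the given PR, merging stderr into stdout so all output is captured.
watch_checks() {
  gh pr checks "$1" --watch 2>&1
}
```

A call might look like `url="$(create_pr main my-feature "Add feature" "$body")"` followed by `watch_checks "$(pr_number_from_url "$url")"`, so the PR identifier comes from the `gh pr create` output instead of a hardcoded value.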
skills
documentation
fastify
init
linting-neostandard-eslint9
node
nodejs-core
rules
oauth
octocat
snipgrapher