Set up or align a GitHub Actions deploy pipeline for an app or service. Use when standardizing repos around the verify-then-deploy shape: push to main → detect affected lanes → verify and build artifacts → e2e → deploy each lane to its host (Cloudflare Pages, AWS Amplify, GHCR + VPS).
99
100%
Does it follow best practices?
Impact
97%
1.21x
Average score across 4 eval scenarios
Passed
No known issues
{
"context": "Tests whether the agent follows the verify → e2e → deploy pipeline topology without collapsing stages, passes the same artifact through all stages without rebuilding, and adds a post-deploy smoke check. Also checks artifact upload options and step summary.",
"type": "weighted_checklist",
"checklist": [
{
"name": "No rebuild in deploy",
"description": "The deploy job downloads the artifact from the verify/build job rather than running a build command itself (no `npm run build`, `pnpm run build`, etc. inside the deploy job)",
"max_score": 12
},
{
"name": "Artifact downloaded in e2e",
"description": "The e2e job uses actions/download-artifact to obtain the built artifact, rather than running a fresh build",
"max_score": 10
},
{
"name": "Upload version v7",
"description": "Uses actions/upload-artifact@v7 (not v3, v4, or other versions) in the verify/build job",
"max_score": 8
},
{
"name": "if-no-files-found error",
"description": "The artifact upload step includes `if-no-files-found: error` to fail if the build produced no output",
"max_score": 10
},
{
"name": "include-hidden-files",
"description": "The artifact upload step includes `include-hidden-files: true` to capture framework output directories like .next/ or .output/",
"max_score": 8
},
{
"name": "Unique artifact name",
"description": "The artifact is uploaded with a lane-specific name (e.g. `web-dist`) rather than a generic name like `dist` or `build`",
"max_score": 7
},
{
"name": "Separate stages",
"description": "Verify, e2e, and deploy are separate jobs (not collapsed into a single job), each listed under `jobs:` independently",
"max_score": 10
},
{
"name": "Smoke step present",
"description": "A smoke or health-check step exists in the deploy job that makes an HTTP request (curl, wget, or similar) to the deployed URL after deployment completes",
"max_score": 12
},
{
"name": "Smoke step fails on non-200",
"description": "The smoke step uses flags that make it fail on HTTP error responses (e.g. `curl -fsS` or `curl -f`, which exit non-zero on status 400 and above)",
"max_score": 8
},
{
"name": "GITHUB_STEP_SUMMARY",
"description": "A step in the deploy job writes to `$GITHUB_STEP_SUMMARY` including what was deployed and where (URL or environment)",
"max_score": 8
},
{
"name": "Deploy needs both verify and e2e",
"description": "The deploy job declares `needs: [verify-<lane>, e2e-<lane>]` (or equivalent) to depend on both upstream jobs",
"max_score": 7
}
]
}
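
The checklist above can be read as a description of one workflow file. The following is a minimal sketch of a single lane named `web` that would meet each item, not a definitive implementation. Assumptions not taken from the source: the npm scripts (`test`, `build`, `e2e`), the `dist/` output path, the Node version, the `./scripts/deploy.sh` helper, the `DEPLOY_URL` value, and the `actions/checkout`, `actions/setup-node`, and `actions/download-artifact` major versions. Only `actions/upload-artifact@v7` and the two upload options are named by the checklist. Lane detection (the "detect affected lanes" stage) is left out for brevity.

```yaml
name: deploy

on:
  push:
    branches: [main]

jobs:
  # Stage 1: verify and build once; the artifact produced here is the only build.
  verify-web:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20          # assumption: any supported Node version works
      - run: npm ci
      - run: npm test               # assumption: project defines a "test" script
      - run: npm run build          # assumption: build output lands in dist/
      - uses: actions/upload-artifact@v7
        with:
          name: web-dist            # lane-specific name, not a generic "dist"
          path: dist/
          if-no-files-found: error  # fail if the build produced nothing
          include-hidden-files: true

  # Stage 2: e2e runs against the downloaded artifact, with no rebuild.
  e2e-web:
    needs: [verify-web]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/download-artifact@v4   # assumption: pin the major that pairs with your upload version
        with:
          name: web-dist
          path: dist/
      - run: npm ci
      - run: npm run e2e            # hypothetical script that serves dist/ and runs the e2e suite

  # Stage 3: deploy the same artifact, then smoke-check and summarize.
  deploy-web:
    needs: [verify-web, e2e-web]
    runs-on: ubuntu-latest
    env:
      DEPLOY_URL: https://example.com   # hypothetical deployed URL
    steps:
      - uses: actions/checkout@v4
      - uses: actions/download-artifact@v4
        with:
          name: web-dist
          path: dist/
      - name: Deploy
        run: ./scripts/deploy.sh dist/  # hypothetical host-specific deploy helper
      - name: Smoke check
        # -f makes curl exit non-zero on HTTP status 400 and above
        run: curl -fsS --retry 5 --retry-delay 5 "$DEPLOY_URL/" > /dev/null
      - name: Summary
        run: |
          {
            echo "### Deployed web"
            echo "- Artifact: web-dist"
            echo "- Commit: $GITHUB_SHA"
            echo "- URL: $DEPLOY_URL"
          } >> "$GITHUB_STEP_SUMMARY"
```

Keeping the build in `verify-web` only means the bytes exercised by e2e are the bytes that ship. Listing both upstream jobs in `needs` keeps the dependency on the artifact producer explicit even though `e2e-web` already depends on it.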