Audit and build the infrastructure a repo needs so agents can work autonomously — boot scripts, smoke tests, CI/CD gates, dev environment setup, observability, and isolation. Use when a repo can't boot, tests are broken or missing, there's no dev environment, agents can't verify their work, or agents need human help to get anything done. Do not use for reviewing an existing diff or for documentation-only cleanup.
97
100%
Does it follow best practices?
Impact
87%
1.03x
Average score across 3 eval scenarios
Passed
No known issues
{
"context": "Tests whether the agent adds a correctly structured JSON health endpoint and structured JSON request logging (observability), and implements worktree-safe port isolation using directory-path-derived ports with a proper boot poll loop — following the patterns described in the agent-readiness skill.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Health endpoint exists",
"description": "app.py defines a GET /health route",
"max_score": 7
},
{
"name": "Health response has status field",
"description": "The /health endpoint returns a JSON object containing a 'status' key (e.g., {\"status\": \"ok\", ...})",
"max_score": 8
},
{
"name": "Health response has version or uptime",
"description": "The /health JSON response includes at least one of: 'version' or 'uptime' fields",
"max_score": 7
},
{
"name": "Structured JSON request logging",
"description": "app.py emits one structured JSON log line per request (not plain print statements), containing at minimum the method, path, and status fields",
"max_score": 9
},
{
"name": "Port derived from directory path",
"description": "worktree-start.sh computes the PORT value using the current working directory ($PWD or equivalent) — not a hardcoded port number",
"max_score": 10
},
{
"name": "No hardcoded port 5000",
"description": "worktree-start.sh does NOT use a hardcoded port 5000 (or any other fixed port number)",
"max_score": 8
},
{
"name": "Background app start in worktree-start.sh",
"description": "worktree-start.sh starts the Flask application in the background (using & or equivalent)",
"max_score": 8
},
{
"name": "Poll loop in worktree-start.sh",
"description": "worktree-start.sh polls the health endpoint in a loop (at least 10 iterations) before declaring success",
"max_score": 9
},
{
"name": "Non-zero exit on boot failure",
"description": "worktree-start.sh exits with a non-zero status if the app fails to start within the polling period",
"max_score": 8
},
{
"name": "Strict mode in worktree-start.sh",
"description": "worktree-start.sh includes set -euo pipefail",
"max_score": 7
},
{
"name": "teardown.sh stops service",
"description": "teardown.sh stops the Flask service (kills the process by PID or uses a stop command)",
"max_score": 7
},
{
"name": "Port collision rationale documented",
"description": "observability-notes.md explains how the port derivation prevents collisions between different worktrees on the same machine",
"max_score": 7
},
{
"name": "Dev environment observability",
"description": "Observability is added to the development/local environment itself, either implemented in app.py or described in observability-notes.md, rather than being treated only as a production concern",
"max_score": 5
}
]
}
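
The checklist above names several behaviours without showing what they might look like. The following is a minimal Python sketch of four of them, under stated assumptions: the helper names (derive_port, health_payload, request_log_line, wait_for_health), the port window (PORT_BASE, PORT_SPAN), and the SHA-256 scheme are all hypothetical and do not come from the skill itself. The rubric expects the port logic in worktree-start.sh and the endpoint in a Flask app.py; this sketch only illustrates the underlying logic using the standard library.

```python
import hashlib
import json
import time
import urllib.request

# Assumed port window. The rubric only requires that the port is derived
# from the directory path rather than hardcoded; these bounds are illustrative.
PORT_BASE = 20000
PORT_SPAN = 10000


def derive_port(directory: str) -> int:
    """Map a directory path to a stable port in [PORT_BASE, PORT_BASE + PORT_SPAN)."""
    digest = hashlib.sha256(directory.encode("utf-8")).digest()
    return PORT_BASE + int.from_bytes(digest[:4], "big") % PORT_SPAN


def health_payload(started_at: float, version: str) -> dict:
    """Build a /health body with 'status' plus both 'version' and 'uptime'."""
    return {
        "status": "ok",
        "version": version,
        "uptime": round(time.time() - started_at, 3),
    }


def request_log_line(method: str, path: str, status: int) -> str:
    """Serialize one request as a single JSON log line."""
    return json.dumps({"method": method, "path": path, "status": status})


def wait_for_health(url: str, attempts: int = 10, delay: float = 0.5) -> bool:
    """Poll url up to `attempts` times; return True once it reports status 'ok'."""
    for _ in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=1) as response:
                body = json.loads(response.read().decode("utf-8"))
            if isinstance(body, dict) and body.get("status") == "ok":
                return True
        except (OSError, ValueError):
            # Connection refused, timeout, HTTP error, or non-JSON body:
            # the service is not ready yet, so keep polling.
            pass
        time.sleep(delay)
    return False
```

A start script could call derive_port on its working directory, launch the app on that port in the background, then exit non-zero if wait_for_health returns False. Because the port is a hash of the path, two worktrees in different directories usually get different ports, but a collision remains possible within a 10000-port window, so a real script may still want to check that the port is free.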