Audit and build the infrastructure a repo needs so agents can work autonomously — boot scripts, smoke tests, CI/CD gates, dev environment setup, observability, and isolation. Use when a repo can't boot, tests are broken or missing, there's no dev environment, agents can't verify their work, or agents need human help to get anything done. Do not use for reviewing an existing diff or for documentation-only cleanup.
99
100%
Does it follow best practices?
Impact
99%
1.17x (average score across 3 eval scenarios)
Passed
No known issues
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly articulates specific capabilities (boot scripts, smoke tests, CI/CD gates, dev environment setup, observability, isolation), provides explicit trigger conditions covering multiple failure scenarios, and includes negative boundaries to prevent misuse. The description is concise yet comprehensive, using third-person voice throughout.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: boot scripts, smoke tests, CI/CD gates, dev environment setup, observability, and isolation. These are clearly defined infrastructure components. | 3 / 3 |
| Completeness | Clearly answers 'what' (audit and build infrastructure: boot scripts, smoke tests, CI/CD gates, etc.) and 'when' (repo can't boot, tests broken/missing, no dev environment, agents can't verify work). Also includes explicit 'Do not use' exclusions, which further clarify scope. | 3 / 3 |
| Trigger Term Quality | Includes natural terms users and agents would encounter: 'repo can't boot', 'tests are broken or missing', 'no dev environment', 'agents can't verify their work', 'CI/CD gates', 'smoke tests'. These map well to real scenarios a user would describe. | 3 / 3 |
| Distinctiveness Conflict Risk | Occupies a clear niche around repo infrastructure for agent autonomy. The explicit exclusions ('not for reviewing an existing diff or documentation-only cleanup') actively reduce conflict risk with code review or documentation skills. | 3 / 3 |
| Total | | 12 / 12 (Passed) |
Implementation
100%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is an excellent skill that efficiently communicates a complex multi-layered readiness framework. It balances conciseness with actionability by providing executable examples for the most critical steps while deferring detailed patterns to reference files. The workflow is well-structured with clear sequencing, validation checkpoints, and handoff criteria.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is lean and well-structured. It assumes Claude's competence, avoids explaining basic concepts, and every section serves a clear purpose. The 7-layer stack is presented as a concise list with minimal but sufficient examples. | 3 / 3 |
| Actionability | Provides executable bash scripts for boot and enforce steps, concrete curl commands for smoke checks, specific grading criteria with example output, and a clear compact output format. The workflow steps are concrete and copy-paste ready. | 3 / 3 |
| Workflow Clarity | The 4-step workflow (Audit → Setup → Improve → Hand Off) is clearly sequenced with explicit ordering within Setup (Boot → Smoke → Interact → E2e → Enforce → Observe → Isolate). Validation is built into the process via grading, the boot script includes a health check loop with failure handling, and there are clear handoff criteria (C+ grade). | 3 / 3 |
| Progressive Disclosure | The skill provides a clear overview with well-signaled one-level-deep references to grading.md, setup-patterns.md, and industry-examples.md. Advanced patterns (e2e, observability, isolation, containerized stacks) are appropriately deferred to reference files rather than inlined. | 3 / 3 |
| Total | | 12 / 12 (Passed) |
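The Workflow Clarity row credits the boot script with a health check loop that handles failure. The skill's own script is not reproduced on this page, so the following is only a minimal Python sketch of that general pattern, not the skill's implementation. The function names `http_ok` and `wait_until_healthy`, the retry defaults, and any URL passed in are assumptions made for illustration.

```python
import http.client
import time
from typing import Callable
from urllib.request import urlopen


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status.

    Hypothetical helper: the endpoint to probe is an assumption and
    depends on the repo being booted.
    """
    try:
        with urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and HTTP error statuses all
        # count as "not healthy yet".
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` up to `attempts` times, pausing `delay` seconds between tries.

    Returns True as soon as the probe succeeds, or False once every
    attempt has failed, so the caller can treat boot as failed.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return True
        if attempt < attempts:
            sleep(delay)
    return False
```

A boot step could call `wait_until_healthy(lambda: http_ok("http://localhost:8000/health"))`, where the URL is a hypothetical local endpoint, and exit with a non-zero status when the result is `False`. Passing the probe and the sleep function as parameters keeps the loop testable without a running service.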
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.