Full autonomous execution from idea to working code
23%
Does it follow best practices?
Impact
Not scored: no eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl:
npx tessl skill review --optimize ./skills/autopilot/SKILL.md
Quality
Discovery
0%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is extremely vague and provides almost no useful information for skill selection. It lacks concrete actions, natural trigger terms, explicit 'when to use' guidance, and any distinguishing characteristics that would differentiate it from other coding-related skills.
Suggestions
Replace the abstract phrase with specific concrete actions (e.g., 'Scaffolds projects, generates boilerplate code, implements features, writes tests, and iterates until code runs successfully').
Add an explicit 'Use when...' clause with natural trigger terms (e.g., 'Use when the user asks to build something from scratch, create a new project, implement a feature end-to-end, or wants fully working code without manual intervention').
Clarify the scope and niche to reduce conflict risk (e.g., specify what languages, frameworks, or project types this covers, or what distinguishes it from simpler code generation skills).
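The suggestions above can be turned into a quick self-check before re-running the review. The sketch below is a rough heuristic, not Tessl's actual scoring: the function name `describe_gaps` and the verb list are assumptions made for illustration, and the revised description is assembled from the examples given in the suggestions.

```python
import re

# Hypothetical verb list: an assumption for illustration, not Tessl's rubric.
ACTION_VERBS = {"scaffolds", "generates", "implements", "writes", "iterates", "runs"}


def describe_gaps(description: str) -> list[str]:
    """Return the review suggestions that a description still fails to address."""
    lowered = description.lower()
    words = set(re.findall(r"[a-z]+", lowered))
    gaps = []
    if not words & ACTION_VERBS:
        gaps.append("no concrete actions")
    if "use when" not in lowered:
        gaps.append("no 'Use when...' clause")
    return gaps


original = "Full autonomous execution from idea to working code"
revised = (
    "Scaffolds projects, generates boilerplate code, implements features, "
    "writes tests, and iterates until code runs successfully. "
    "Use when the user asks to build something from scratch."
)

print(describe_gaps(original))
print(describe_gaps(revised))
```

A check like this only catches the missing pieces the review names; it says nothing about whether the description is distinctive enough to avoid conflicts with other skills.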
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague, abstract language ('full autonomous execution', 'idea to working code') without listing any concrete actions like generating files, running tests, or setting up projects. | 1 / 3 |
| Completeness | The 'what' is extremely vague (no specific capabilities listed) and there is no 'when' clause or explicit trigger guidance whatsoever. | 1 / 3 |
| Trigger Term Quality | No natural keywords a user would actually say. Terms like 'autonomous execution' and 'idea to working code' are abstract concepts, not trigger terms users would use in requests. | 1 / 3 |
| Distinctiveness Conflict Risk | Extremely generic: 'idea to working code' could apply to virtually any coding skill, making it highly likely to conflict with other code-related skills. | 1 / 3 |
| Total | 4 / 12 Passed | |
Implementation
47%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill has excellent workflow clarity: well-defined phases, validation checkpoints, and escalation conditions make the multi-step orchestration easy to follow. However, it is significantly verbose: the same concepts (especially the 3-stage pipeline and deep-interview integration) are repeated multiple times, and sections like 'Why_This_Exists' add little value for Claude. Actionability is moderate: file paths and tool patterns are specified, but there are no truly executable code examples.
Suggestions
Eliminate redundant explanations of the 3-stage pipeline — it's described in Phase 0 skip logic, Phase 1 skip logic, and again in the Advanced section. Consolidate to one location.
Remove the 'Why_This_Exists' section entirely — Claude doesn't need motivation for following instructions, and the purpose is already clear from the steps.
Trim 'Use_When' and 'Do_Not_Use_When' to just the trigger phrases and routing rules without explanatory comments after each bullet.
Move the Advanced section content (configuration, troubleshooting, deep-interview integration details) into referenced files to reduce the main skill's token footprint.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose with significant redundancy. The 3-stage pipeline explanation is repeated multiple times (in Phase 0, Phase 1, and the Advanced section). Sections like 'Why_This_Exists' and 'Use_When'/'Do_Not_Use_When' explain routing logic Claude could infer. The deep-interview integration is explained three separate times. Many XML-style tags add structural overhead without value. | 1 / 3 |
| Actionability | The skill provides specific phase sequences, file paths (.omc/autopilot/spec.md), tool usage patterns (Task(subagent_type=...)), and configuration examples. However, there is no executable code; the 'code' shown is just config JSONC and a text flow diagram. The actual implementation of each phase relies on references to other tools (Ralph, Ultrawork, UltraQA) without concrete commands or executable examples. | 2 / 3 |
| Workflow Clarity | The multi-step workflow is clearly sequenced across 6 phases with explicit validation checkpoints (QA cycles with retry limits, multi-perspective validation requiring all approvals, error persistence detection at 3 cycles). Feedback loops are well-defined: QA repeats up to 5 times, validation rejections trigger fix-and-revalidate, and escalation/stop conditions are clearly enumerated. The final checklist provides verification. | 3 / 3 |
| Progressive Disclosure | The skill references external docs (docs/company-context-interface.md, docs/REFERENCE.md) and other skills (deep-interview, ralplan, ralph), which is good. However, no bundle files are provided, so these references are unverifiable. The Advanced section contains substantial inline content (configuration, resume, troubleshooting, deep interview integration) that could be split into separate files. The main body is quite long for an overview. | 2 / 3 |
| Total | 8 / 12 Passed | |
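The Actionability row notes that the skill has no executable examples, while the Workflow Clarity row describes a QA loop with a retry limit of 5 and error persistence detection at 3 cycles. As a minimal sketch of what an executable version of that loop might look like (the skill's real implementation is not shown in this review, so `run_qa_loop`, the `run_qa` callback, and the return strings are all hypothetical):

```python
from typing import Callable, Optional

MAX_QA_CYCLES = 5      # from the review: QA repeats up to 5 times
PERSISTENCE_LIMIT = 3  # from the review: error persistence detected at 3 cycles


def run_qa_loop(run_qa: Callable[[], Optional[str]]) -> str:
    """Run QA until it passes, the same error persists, or cycles run out.

    run_qa is a hypothetical callback that returns None on success
    or an error message on failure.
    """
    last_error = None
    repeats = 0
    for _ in range(MAX_QA_CYCLES):
        error = run_qa()
        if error is None:
            return "passed"
        # Count consecutive cycles that fail with the identical error.
        repeats = repeats + 1 if error == last_error else 1
        last_error = error
        if repeats >= PERSISTENCE_LIMIT:
            return "escalate: persistent error"
    return "escalate: cycle limit reached"


# Two different failures, then a pass.
results = iter(["lint failed", "test failed", None])
print(run_qa_loop(lambda: next(results)))

# The same failure every cycle stops early.
print(run_qa_loop(lambda: "build failed"))
```

Keeping the two limits as named constants makes the stop conditions visible in one place, which is the kind of concreteness the Actionability row asks for.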
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation: 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
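The report does not say which frontmatter keys triggered the warning. One way to find them locally is sketched below, assuming a simple allow-list: `ALLOWED_KEYS` is a guess for illustration (check the skill spec for the real set), and the `level` key in the sample is hypothetical.

```python
# Assumed allow-list for illustration; consult the skill spec for the real set.
ALLOWED_KEYS = {"name", "description", "metadata"}


def unknown_frontmatter_keys(skill_md: str) -> list[str]:
    """List top-level frontmatter keys that are not in ALLOWED_KEYS."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no frontmatter block at the top of the file
    unknown = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        # Only unindented, non-comment "key: value" lines are top-level keys.
        if line and not line[0].isspace() and not line.startswith("#") and ":" in line:
            key = line.split(":", 1)[0].strip()
            if key not in ALLOWED_KEYS:
                unknown.append(key)
    return unknown


# Hypothetical SKILL.md contents; "level" stands in for an unknown key.
sample = """---
name: autopilot
description: Full autonomous execution from idea to working code
level: 4
---
# Body
"""
print(unknown_frontmatter_keys(sample))
```

Any key this reports can either be removed or nested under `metadata`, as the warning suggests.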
3e94567