Self-referential loop until task completion with configurable verification reviewer
35
31%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/ralph/SKILL.md
Quality
Discovery
0%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is extremely vague and uses abstract, jargon-heavy language that fails to communicate what the skill actually does, what domain it applies to, or when it should be used. It lacks concrete actions, natural trigger terms, and any explicit 'use when' guidance, making it nearly impossible for Claude to correctly select this skill from a pool of available options.
Suggestions
Replace abstract language with concrete actions describing what the skill does (e.g., 'Iteratively executes and refines code/output until acceptance criteria are met').
Add a 'Use when...' clause with natural trigger terms users would say, such as 'Use when the user asks to iterate on a task, retry until correct, or loop until done'.
Specify the domain or type of task this applies to (e.g., code generation, data processing, writing) to reduce conflict risk with other skills.
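Taken together, the three suggestions above could produce a description along the following lines. This is only a sketch: the `name` value is inferred from the skill's path, and the wording and the assumption that the skill targets coding tasks are illustrative, not taken from the skill itself.

```yaml
---
# Hypothetical frontmatter; the values below are illustrative assumptions.
name: ralph
description: >-
  Iteratively executes and refines code until acceptance criteria are met,
  then has a configurable reviewer verify the result. Use when the user asks
  to iterate on a task, retry until correct, or loop until done.
---
```

The first sentence states concrete actions and a domain, and the second supplies the "Use when" clause with phrases a user might plausibly type.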
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague, abstract language. 'Self-referential loop' and 'configurable verification reviewer' are not concrete actions; they describe an abstract mechanism without specifying what task is being completed or what domain it applies to. | 1 / 3 |
| Completeness | The description vaguely hints at 'what' (some kind of iterative loop with verification) but is extremely unclear, and there is no 'when' clause or any explicit trigger guidance for when Claude should select this skill. | 1 / 3 |
| Trigger Term Quality | The terms used ('self-referential loop', 'configurable verification reviewer') are technical jargon that no user would naturally say when requesting help. There are no natural keywords that would help Claude match this skill to a user request. | 1 / 3 |
| Distinctiveness Conflict Risk | The description is so vague and abstract that it could apply to virtually any iterative or verification-based workflow. It provides no clear niche or distinct triggers to differentiate it from other skills. | 1 / 3 |
| Total | 4 / 12 Passed | |
Implementation
62%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a highly actionable and well-structured workflow skill with excellent step sequencing, validation checkpoints, and concrete examples. However, it is far too verbose: the content could likely be cut by 40-50% without losing any actionable information. Sections like 'Why_This_Exists', extensive anti-pattern documentation, and repeated warnings about the same failure modes (the polite-stop anti-pattern is mentioned 3 times) inflate the token cost substantially.
Suggestions
Cut the 'Why_This_Exists' section entirely — Claude doesn't need motivation for following instructions, and the workflow steps already encode the rationale implicitly.
Consolidate the repeated polite-stop anti-pattern warnings (appears in Steps, Escalation, and implicitly in Examples) into a single prominent callout in the Steps section.
Move the 'Parallel session caveats', 'Advanced' background execution rules, and detailed Examples into separate bundle files, keeping only a one-line reference in the main SKILL.md.
Remove explanatory 'Why good/Why bad' annotations from examples — the good/bad labels and the examples themselves are self-explanatory to Claude.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~250+ lines. It over-explains concepts like why tasks fail silently, includes lengthy rationale sections ('Why_This_Exists'), explains what PRD mode is, and contains extensive anti-pattern documentation. Much of this (parallel execution, agent tiers, background vs foreground) is operational knowledge that could be drastically condensed. The 'Do_Not_Use_When' and 'Use_When' sections, while useful, are padded with unnecessary detail. | 1 / 3 |
| Actionability | The skill provides highly concrete, actionable guidance: specific tool invocation syntax (Task, Skill), exact file paths (.omc/state/sessions/{sessionId}/prd.json), explicit CLI flags (--critic=architect, --no-deslop), concrete examples of good/bad PRD criteria, and precise agent tier selection rules. The examples section shows executable delegation patterns and verification workflows. | 3 / 3 |
| Workflow Clarity | The workflow is exceptionally well-sequenced with numbered steps (1-9 including 7.5 and 7.6), explicit validation checkpoints at multiple stages (story verification, reviewer verification, regression re-verification), clear feedback loops (rejection → fix → re-verify), and a comprehensive final checklist. The anti-pattern warnings about 'polite-stop' after Step 7 show careful attention to failure modes. | 3 / 3 |
| Progressive Disclosure | The skill references external files (docs/shared/agent-tiers.md, docs/company-context-interface.md, docs/REFERENCE.md), which is good progressive disclosure, but the main body itself is monolithic: the Advanced section, Examples, Escalation conditions, and Parallel session caveats could all be split into separate reference files. No bundle files are provided to verify that the referenced paths exist. | 2 / 3 |
| Total | 9 / 12 Passed | |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
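The report does not name the offending key, so the following is only a sketch of what "moving to metadata" might look like. The key `reviewer-default` and its value are hypothetical, the description is a placeholder, and nesting under a `metadata` mapping is an assumption based on the warning's wording.

```yaml
---
name: ralph
description: Placeholder; the skill's real description goes here.
# Before (hypothetical): an unrecognized top-level key triggers the warning.
# reviewer-default: architect
# After (hypothetical): the same key nested under metadata.
metadata:
  reviewer-default: architect
---
```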
3e94567