Subjects every non-trivial decision to a fresh-context adversarial review before it stands. Use when correctness matters more than speed, when working in unfamiliar code, when stakes are high (production, security-sensitive logic, irreversible operations), or any time a confident output would be cheaper to verify now than to debug later.
54
60%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/doubt-driven-development/SKILL.mdQuality
Discovery
57%Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description has a strong 'Use when' clause with multiple explicit trigger scenarios, which is its main strength. However, it suffers from vague, abstract language about what the skill actually does—'adversarial review' is not a concrete action. The description reads more like a philosophy or methodology than a skill with specific capabilities, making it hard to distinguish from general code review or verification skills.
Suggestions
Replace 'fresh-context adversarial review' with specific concrete actions the skill performs, e.g., 'Creates a structured checklist of potential failure modes, re-examines assumptions, and stress-tests edge cases in code changes'.
Add natural trigger terms users would actually say, such as 'double-check', 'verify', 'sanity check', 'review my approach', 'catch mistakes', or 'second opinion'.
Clarify what distinguishes this from a general code review skill—specify the unique mechanism (e.g., does it re-derive solutions from scratch, use a specific framework, produce a specific artifact?).
| Dimension | Reasoning | Score |
|---|---|---|
Specificity | The description uses abstract language like 'non-trivial decision' and 'fresh-context adversarial review' without listing concrete actions. There are no specific capabilities described—no mention of what format, tool, or output is produced. 'Adversarial review' is vague and could mean many things. | 1 / 3 |
Completeness | The description explicitly answers both 'what' (subjects decisions to adversarial review) and 'when' (correctness matters more than speed, unfamiliar code, high stakes, production, security-sensitive logic, irreversible operations). The 'Use when' clause is detailed and covers multiple trigger scenarios. | 3 / 3 |
Trigger Term Quality | Some relevant trigger terms exist like 'production', 'security-sensitive logic', 'irreversible operations', 'unfamiliar code', and 'correctness'. However, these are more contextual descriptors than natural keywords a user would say. A user is unlikely to say 'subject my decision to adversarial review'—they'd more likely say 'review my code', 'double-check this', or 'verify this logic'. | 2 / 3 |
Distinctiveness Conflict Risk | The concept of 'adversarial review' is somewhat distinctive, but the triggers are broad enough ('correctness matters', 'unfamiliar code', 'high stakes') that this could easily conflict with code review skills, testing skills, or general verification/validation skills. The description doesn't carve out a clear enough niche. | 2 / 3 |
Total | 8 / 12 Passed |
Implementation
62%Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill excels at actionability and workflow clarity — the 5-step doubt cycle is well-structured, concrete, and includes proper validation checkpoints and feedback loops. However, it is significantly over-length, with substantial redundancy across sections (rationalizations, red flags, and verification all restate the same points) and excessive detail on edge cases like cross-model CLI invocation that could be extracted to supporting files. The content would benefit greatly from aggressive trimming and splitting into a concise core + reference files.
Suggestions
Extract the cross-model escalation subsection (Steps 1-4 under Step 3) into a separate reference file like CROSS_MODEL.md, keeping only a 2-line summary and link in the main skill.
Remove or drastically compress the 'Common Rationalizations' table — most entries restate constraints already established in the process steps and red flags. Claude doesn't need motivational rebuttals.
Consolidate the Red Flags list and Verification checklist, which overlap significantly (e.g., 'don't pass the CLAIM' appears in both, plus in Steps 2 and 3). Keep one authoritative list.
Move 'Interaction with Other Skills' to a brief inline note or separate file — Claude can infer complementarity from skill descriptions without 5 bullet points explaining it.
| Dimension | Reasoning | Score |
|---|---|---|
Conciseness | The skill is extremely verbose at ~300+ lines. While the topic is complex, there is significant redundancy: the 'Common Rationalizations' table repeats points already made in the body, the cross-model escalation section is disproportionately long with excessive edge-case handling, and many points are restated multiple times (e.g., 'don't pass the CLAIM' appears in Steps 2, 3, Red Flags, and Verification). Claude doesn't need explanations like why confidence correlates poorly with correctness. | 1 / 3 |
Actionability | The skill provides highly concrete, executable guidance: a copy-paste checklist, exact adversarial prompt text, specific bash invocation examples for cross-model tools, precise classification categories with precedence ordering, and clear stop conditions. The CLAIM format, EXTRACT criteria, and RECONCILE classification scheme are all immediately actionable. | 3 / 3 |
Workflow Clarity | The 5-step process is clearly sequenced with explicit validation checkpoints. Step 5 provides bounded stop conditions (trivial findings, 3 cycles, or user override), Step 4 has a clear classification precedence order, and there are explicit feedback loops (fix and re-loop for actionable findings, fix contract and re-classify for contract misreads). The checklist format makes the workflow trackable. | 3 / 3 |
Progressive Disclosure | The skill references external files (agents/, references/orchestration-patterns.md, other skills) appropriately, but the main body itself is monolithic — the cross-model escalation subsection alone could be a separate reference file. The Common Rationalizations table, Red Flags list, and Interaction with Other Skills section add significant bulk that could be split out, keeping the core SKILL.md focused on the 5-step process. | 2 / 3 |
Total | 9 / 12 Passed |
Validation
100%Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
f17c6e8
Table of Contents
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.