Use when challenging ideas, plans, decisions, or proposals. Invoke to play devil's advocate, run a pre-mortem, red team, stress test assumptions, audit evidence quality, or find blind spots before committing. Do NOT use for building plans, making decisions, or generating solutions — this skill only challenges and critiques.
94
92%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly defines its purpose, provides rich trigger terms, and explicitly delineates its boundaries. The inclusion of both positive triggers ('Use when...') and negative boundaries ('Do NOT use for...') makes it exceptionally easy for Claude to select correctly. The only minor note is that the 'what' is somewhat embedded within the trigger clause rather than stated separately upfront, but the overall clarity is strong.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'play devil's advocate, run a pre-mortem, red team, stress test assumptions, audit evidence quality, find blind spots.' Also explicitly states what it does NOT do, adding further specificity. | 3 / 3 |
| Completeness | Clearly answers both 'what' (challenges ideas, critiques, stress tests assumptions, audits evidence) and 'when' (when challenging ideas, plans, decisions, or proposals). The explicit 'Use when...' clause and 'Do NOT use for...' anti-triggers make this highly complete. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'devil's advocate', 'pre-mortem', 'red team', 'stress test', 'blind spots', 'challenge', 'critique'. These are terms users naturally use when seeking critical feedback on their ideas. | 3 / 3 |
| Distinctiveness / Conflict Risk | The explicit 'Do NOT use for building plans, making decisions, or generating solutions' boundary clearly separates this from planning, decision-making, or solution-generation skills. The niche of pure critique/challenge is well-defined and unlikely to conflict. | 3 / 3 |
| Total | | 12 / 12 (Passed) |
Implementation
85%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, well-structured skill with excellent workflow clarity and progressive disclosure. The 5-step process is concrete, the example is detailed and illustrative, and reference files are appropriately used for mode-specific details. Minor conciseness improvements could be made by trimming the framework name-dropping in the preamble and reducing some redundancy between the workflow steps and the constraints section.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably efficient but includes some unnecessary framing (e.g., the 'court jester' metaphor, listing all the frameworks Claude has expertise in). The constraints section has some redundancy with earlier workflow steps. However, most content earns its place. | 2 / 3 |
| Actionability | The workflow is highly concrete with specific steps, a detailed example showing exact outputs for each step, clear mode selection tables, and explicit reference file paths. The example with the microservices migration is fully fleshed out and demonstrates exactly what each step produces. | 3 / 3 |
| Workflow Clarity | The 5-step workflow is clearly sequenced with explicit validation checkpoints: confirm steelman with user (Step 1), structured mode selection via AskUserQuestion (Step 2), ask user to respond before synthesizing (Step 4), and offer second pass after synthesis (Step 5). The feedback loop of challenge → engage → synthesize is well-defined. | 3 / 3 |
| Progressive Disclosure | The skill provides a clear overview workflow in the main file while appropriately delegating detailed mode-specific methods to one-level-deep reference files (e.g., references/socratic-questioning.md, references/pre-mortem-analysis.md). Navigation is well-signaled through the reference table and inline file paths. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation: 11 / 11 Passed
Validation for skill structure
No warnings or errors.
906a57d