Challenge AI output with structured devil's-advocate protocols: anchor, verify, framing, and deep sub-commands.
Apply structured provocation patterns to force reconsideration of current work.
Target: $ARGUMENTS
CRITICAL: After EVERY AskUserQuestion call, check whether the answers are empty or blank. Known Claude Code bug: outside Plan Mode, AskUserQuestion can silently return empty answers without showing any UI.

If the answers are empty, DO NOT proceed on assumptions. Instead, stop and get the user's input another way (for example, ask the question as plain text and wait for a reply).
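The empty-answer guard above can be sketched as follows. This is a hedged illustration only: AskUserQuestion is a Claude Code tool, not a Python API, and `answers_are_empty` is a hypothetical helper name.

```python
def answers_are_empty(answers) -> bool:
    """True if AskUserQuestion silently returned no usable answers
    (the known Claude Code bug outside Plan Mode).

    Treats a missing list, an empty list, or a list of blank/whitespace
    strings as "empty" -- any of these means we must re-ask the user
    rather than proceed on assumptions.
    """
    if not answers:
        return True
    return all(not str(answer).strip() for answer in answers)
```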
Parse first word of $ARGUMENTS as subcommand:
| Subcommand | Error Type | Protocol |
|---|---|---|
| `anchor` | Premature commitment / anchoring bias | Read references/protocols/anchor.md → execute |
| `verify` | Factual errors / hallucination | Read references/protocols/verify.md → execute |
| `framing` | Wrong problem / framing errors | Read references/protocols/framing.md → execute |
| `deep` | High stakes: all 9 patterns in fresh context | Spawn devil's-advocate sub-agent via the Agent tool |
If no subcommand is detected:
1. AskUserQuestion: "What are you worried about with the current AI response?"
2. Dispatch to the matching subcommand based on the answer.
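The dispatch logic above can be sketched as follows. This is an illustrative sketch, not the skill's actual implementation: `parse_subcommand` and the `PROTOCOLS` mapping are hypothetical names, and the real dispatch happens in the model's instructions, not in Python.

```python
# Map each subcommand to its protocol file, per the dispatch table above.
PROTOCOLS = {
    "anchor": "references/protocols/anchor.md",
    "verify": "references/protocols/verify.md",
    "framing": "references/protocols/framing.md",
}

def parse_subcommand(arguments: str):
    """Return (subcommand, remaining_args) if the first word of
    $ARGUMENTS is a known subcommand, else (None, arguments)."""
    words = arguments.strip().split(maxsplit=1)
    if not words:
        return None, ""
    head = words[0].lower()
    rest = words[1] if len(words) > 1 else ""
    if head in PROTOCOLS or head == "deep":
        return head, rest
    # No subcommand detected: fall back to AskUserQuestion.
    return None, arguments
```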
Spawn via the Agent tool a devil's-advocate sub-agent with:
- the references/reference.md pattern catalog

For every finding, make the reasoning explicit.

All subcommands produce a Challenge Report (structured, not prose). See references/reference.md for the report format and pattern catalog.
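As a rough illustration of what "structured, not prose" means, a Challenge Report might be modeled as below. The field names here are assumptions for illustration only; the authoritative format lives in references/reference.md.

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    pattern: str     # e.g. "anchor", "verify", "framing"
    claim: str       # what the challenged output asserts
    reasoning: str   # explicit reasoning behind the challenge
    confidence: str  # "high" / "medium" / "low"

@dataclass
class ChallengeReport:
    target: str
    subcommand: str
    findings: list = field(default_factory=list)

    def ranked(self):
        """Findings ordered highest-confidence first."""
        order = {"high": 0, "medium": 1, "low": 2}
        return sorted(self.findings,
                      key=lambda f: order.get(f.confidence, 3))
```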
When to use each subcommand:
- `anchor` to surface premature commitment.
- `verify` to stress-test accuracy.
- `framing` to question whether the right problem is being solved.
- `deep` to run all nine patterns in a clean context.

Avoid `deep` on a trivial or low-stakes target: all nine patterns applied to a minor decision waste cognitive bandwidth. Why: `deep` is calibrated for high-stakes, irreversible choices; applying it broadly devalues the signal and desensitizes the team.

Example: challenging a proposed architecture decision:
```
# Proposal: "Use a monorepo for all 40 services"
# Skill applies anchor: "What problem does this solve? What is the current pain?"
# Skill applies framing: "What alternatives were considered? What are the failure modes?"
# Skill applies verify: "Can we test this with 3 services first?"
```

Example: challenging an implementation shortcut:

```
# Proposal: "Just hardcode the config values for the demo"
# Skill challenges: "What is the rollback path? When does this become permanent?"
# Output: Structured challenge with anchor + alternative approach
```

Example: running a full devil's-advocate review before a design doc is signed off:

```
# Target: architecture decision record for the event-sourcing migration
# Invoke: /challenge deep path/to/adr.md
# Skill spawns a sub-agent with all 9 patterns (anchor x4, verify x3, framing x2)
# Output: Challenge Report with confidence ratings and ranked findings
```