Name: emerge/challenge-assumptions
Rating: 97.6 (1 review)
Author: emerge

emerge/challenge-assumptions

Adversarial reviewer personality for architecture discussions. Use when a user requests a design review, architecture review, system design critique, tech stack decision, RFC review, or devil's advocate perspective on trade-offs. Makes Claude challenge assumptions instead of agreeing — questioning scalability assumptions, identifying single points of failure, challenging technology choices, and probing for edge cases rather than validating decisions.

Quality

100%

Does it follow best practices?

Impact

94%

1.25x

Average score across 5 eval scenarios

Security

Passed

No known issues

{
  "context": "Tests whether the agent refuses to accept vague requirements like 'highly available' without quantification, asks for specific numbers, and specifically probes the 99.9% vs 99.99% distinction for availability — rather than filling in assumptions and proceeding.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Does not assume availability target",
      "description": "review.md does NOT assume or define what 'highly available' means and proceed — it asks the question instead",
      "max_score": 20
    },
    {
      "name": "Asks for specific availability number",
      "description": "review.md asks for a specific availability percentage, SLA, or uptime target rather than accepting 'highly available' as sufficient",
      "max_score": 20
    },
    {
      "name": "99.9% vs 99.99% distinction",
      "description": "review.md references the difference between 99.9% and 99.99% uptime (or the equivalent downtime hours: 8.7 hours vs 52 minutes per year)",
      "max_score": 25
    },
    {
      "name": "Does not assume performance target",
      "description": "review.md does NOT assume what 'fast enough' or similar vague performance terms mean — it asks for specific numbers (e.g. latency in ms, throughput in RPS)",
      "max_score": 15
    },
    {
      "name": "No architecture proposed yet",
      "description": "review.md does NOT propose or begin describing an architecture before the vague requirements are resolved",
      "max_score": 20
    }
  ]
}
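
The downtime figures in the "99.9% vs 99.99% distinction" item follow from simple arithmetic on the availability percentage. A minimal sketch of that calculation is below; the function name `annual_downtime_hours` and the 365-day year are assumptions made for illustration and are not part of the skill or its eval files.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (an assumption)


def annual_downtime_hours(availability_pct: float) -> float:
    """Return the yearly downtime budget, in hours, for an availability target."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    # Compare the two targets named in the checklist.
    for target in (99.9, 99.99):
        hours = annual_downtime_hours(target)
        print(f"{target}% -> {hours:.2f} hours/year ({hours * 60:.1f} minutes/year)")
```

Under these assumptions, 99.9% allows about 8.76 hours of downtime per year and 99.99% about 52.6 minutes, which matches the rounded "8.7 hours vs 52 minutes" in the checklist. The tenfold gap is why a reviewer would ask which target is meant before discussing architecture.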

evals/
  scenario-1/
  scenario-2/
    criteria.json
    task.md
  scenario-3/
  scenario-4/
  scenario-5/
SKILL.md
tile.json

evals/scenario-2/criteria.json