Write or audit AI agent system prompts component-by-component across identity, instruction architecture, behavioral constraints, tools, examples, context strategy, output format, and error handling. Use when the user wants to design a new agent prompt, write a system prompt, review an existing agent prompt, fix tool-use instructions, audit prompt structure, improve context strategy, tune output formats, or define error handling for single-agent or multi-agent systems.
Score: 100
Does it follow best practices? 100%
Impact: 100% (1.33x average score across 3 eval scenarios)
Passed: no known issues
{
  "context": "Tests whether the skill handles multi-agent prompt design with role boundaries, typed handoffs, context strategy, verifier rubrics, output contracts, and error handling.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "First Actions and mode",
      "description": "The response includes a filled First Actions section or equivalent that states write mode and identifies this as a multi-agent prompt architecture task.",
      "max_score": 8
    },
    {
      "name": "Agent-specific prompts",
      "description": "The response defines separate prompt guidance or prompt blocks for the orchestrator, web research worker, CRM worker, and verifier.",
      "max_score": 10
    },
    {
      "name": "Role boundaries",
      "description": "Each agent has explicit scope boundaries, including what it does and what it must hand off rather than do itself.",
      "max_score": 10
    },
    {
      "name": "Orchestrator constraints",
      "description": "The orchestrator prompt focuses on decomposition, routing, synthesis, and conflict resolution, and prevents it from doing domain research directly.",
      "max_score": 8
    },
    {
      "name": "Worker contracts",
      "description": "The worker prompts include task scope, source expectations, output fields, uncertainty handling, and tool-result handling.",
      "max_score": 8
    },
    {
      "name": "Verifier rubric",
      "description": "The verifier prompt includes concrete pass/fail or readiness criteria for source coverage, unsupported claims, role compliance, and final brief quality.",
      "max_score": 10
    },
    {
      "name": "Typed handoff schema",
      "description": "The response defines typed handoff schemas or structured fields at boundaries rather than relying only on free-form natural language summaries.",
      "max_score": 10
    },
    {
      "name": "Context isolation",
      "description": "The response explains how to keep each agent's context scoped, avoid forwarding full history unnecessarily, and pass focused summaries or artifacts.",
      "max_score": 8
    },
    {
      "name": "Unsupported claim handling",
      "description": "The prompts or verifier rules require unsupported claims to be removed, marked as uncertain, or sent back for more evidence.",
      "max_score": 8
    },
    {
      "name": "Error handling",
      "description": "The design classifies recoverable and unrecoverable failures or gives deterministic recovery paths for missing CRM data, empty web results, tool errors, and conflicting evidence.",
      "max_score": 8
    },
    {
      "name": "Implementation-ready output",
      "description": "The response includes enough concrete prompt text, schemas, and checkpoints for engineering implementation, not only high-level architecture advice.",
      "max_score": 8
    },
    {
      "name": "How to Iterate",
      "description": "The response includes evaluation or iteration guidance for testing the multi-agent workflow with golden cases, verifier checks, or one-variable prompt changes.",
      "max_score": 4
    }
  ]
}
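The `max_score` values in the checklist above sum to 100. The listing does not say how a `weighted_checklist` is aggregated into a percentage, so the following is only a minimal sketch under one plausible assumption: the score is the sum of awarded points divided by the sum of `max_score`. The names `Criterion`, `load_criteria`, and `score_percent` are hypothetical, and treating an ungraded criterion as 0 points is also an assumption.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One checklist entry; the description is not needed for scoring."""
    name: str
    max_score: int


def load_criteria(rubric_json: str) -> list[Criterion]:
    """Parse a rubric shaped like the JSON above into Criterion objects."""
    rubric = json.loads(rubric_json)
    if rubric.get("type") != "weighted_checklist":
        raise ValueError(f"unsupported rubric type: {rubric.get('type')!r}")
    return [Criterion(item["name"], item["max_score"]) for item in rubric["checklist"]]


def score_percent(criteria: list[Criterion], awarded: dict[str, int]) -> float:
    """Assumed aggregation: awarded points over total max_score, as a percentage."""
    total_max = sum(c.max_score for c in criteria)
    if total_max == 0:
        raise ValueError("rubric has no scorable criteria")
    total = 0
    for c in criteria:
        points = awarded.get(c.name, 0)  # assumption: ungraded criteria score 0
        if not 0 <= points <= c.max_score:
            raise ValueError(f"{c.name}: {points} is outside 0..{c.max_score}")
        total += points
    return 100.0 * total / total_max


# Two-criterion excerpt of the rubric, with descriptions abbreviated.
EXAMPLE_RUBRIC = """
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "Typed handoff schema", "description": "(abbreviated)", "max_score": 10},
    {"name": "How to Iterate", "description": "(abbreviated)", "max_score": 4}
  ]
}
"""

if __name__ == "__main__":
    criteria = load_criteria(EXAMPLE_RUBRIC)
    # Full marks on the first criterion, half marks on the second: 12 of 14 points.
    print(round(score_percent(criteria, {"Typed handoff schema": 10, "How to Iterate": 2}), 1))
```

Validating each awarded value against its `max_score` keeps a grader from silently exceeding a criterion's weight, which matters when a single 10-point item such as "Verifier rubric" can move the total noticeably.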