Name: sharaf/agent-prompt-engineer
Rating: 100 (1 review)
Author: sharaf

Write or audit AI agent system prompts component-by-component across identity, instruction architecture, behavioral constraints, tools, examples, context strategy, output format, and error handling. Use when the user wants to design a new agent prompt, write a system prompt, review an existing agent prompt, fix tool-use instructions, audit prompt structure, improve context strategy, tune output formats, or define error handling for single-agent or multi-agent systems.

Score: 100 (1.33x)
Quality: 100% (Does it follow best practices?)
Impact: 100%, 1.33x (average score across 3 eval scenarios)
Security: Passed (no known issues)

{
  "context": "Tests whether the skill handles multi-agent prompt design with role boundaries, typed handoffs, context strategy, verifier rubrics, output contracts, and error handling.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "First Actions and mode",
      "description": "The response includes a filled First Actions section or equivalent that states write mode and identifies this as a multi-agent prompt architecture task.",
      "max_score": 8
    },
    {
      "name": "Agent-specific prompts",
      "description": "The response defines separate prompt guidance or prompt blocks for the orchestrator, web research worker, CRM worker, and verifier.",
      "max_score": 10
    },
    {
      "name": "Role boundaries",
      "description": "Each agent has explicit scope boundaries, including what it does and what it must hand off rather than do itself.",
      "max_score": 10
    },
    {
      "name": "Orchestrator constraints",
      "description": "The orchestrator prompt focuses on decomposition, routing, synthesis, and conflict resolution, and prevents it from doing domain research directly.",
      "max_score": 8
    },
    {
      "name": "Worker contracts",
      "description": "The worker prompts include task scope, source expectations, output fields, uncertainty handling, and tool-result handling.",
      "max_score": 8
    },
    {
      "name": "Verifier rubric",
      "description": "The verifier prompt includes concrete pass/fail or readiness criteria for source coverage, unsupported claims, role compliance, and final brief quality.",
      "max_score": 10
    },
    {
      "name": "Typed handoff schema",
      "description": "The response defines typed handoff schemas or structured fields at boundaries rather than relying only on free-form natural language summaries.",
      "max_score": 10
    },
    {
      "name": "Context isolation",
      "description": "The response explains how to keep each agent's context scoped, avoid forwarding full history unnecessarily, and pass focused summaries or artifacts.",
      "max_score": 8
    },
    {
      "name": "Unsupported claim handling",
      "description": "The prompts or verifier rules require unsupported claims to be removed, marked as uncertain, or sent back for more evidence.",
      "max_score": 8
    },
    {
      "name": "Error handling",
      "description": "The design classifies recoverable and unrecoverable failures or gives deterministic recovery paths for missing CRM data, empty web results, tool errors, and conflicting evidence.",
      "max_score": 8
    },
    {
      "name": "Implementation-ready output",
      "description": "The response includes enough concrete prompt text, schemas, and checkpoints for engineering implementation, not only high-level architecture advice.",
      "max_score": 8
    },
    {
      "name": "How to Iterate",
      "description": "The response includes evaluation or iteration guidance for testing the multi-agent workflow with golden cases, verifier checks, or one-variable prompt changes.",
      "max_score": 4
    }
  ]
}
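
The page does not say how a `weighted_checklist` is aggregated into a final score. The sketch below assumes the simplest reading: a grader awards each item between 0 and its `max_score`, and the result is the awarded total as a percentage of the maximum total. The names `score_response` and `awarded` are hypothetical and only illustrate that assumption. The checklist is abbreviated to the two fields the calculation uses.

```python
import json

# Abbreviated copy of the checklist above: only "name" and "max_score" are kept.
CRITERIA_JSON = """
{
  "type": "weighted_checklist",
  "checklist": [
    {"name": "First Actions and mode", "max_score": 8},
    {"name": "Agent-specific prompts", "max_score": 10},
    {"name": "Role boundaries", "max_score": 10},
    {"name": "Orchestrator constraints", "max_score": 8},
    {"name": "Worker contracts", "max_score": 8},
    {"name": "Verifier rubric", "max_score": 10},
    {"name": "Typed handoff schema", "max_score": 10},
    {"name": "Context isolation", "max_score": 8},
    {"name": "Unsupported claim handling", "max_score": 8},
    {"name": "Error handling", "max_score": 8},
    {"name": "Implementation-ready output", "max_score": 8},
    {"name": "How to Iterate", "max_score": 4}
  ]
}
"""


def score_response(criteria: dict, awarded: dict) -> float:
    """Return a percentage score under the assumed sum-of-points aggregation.

    `awarded` maps checklist item names to points given by a grader.
    Missing items count as 0; points are clamped to [0, max_score].
    """
    total_max = sum(item["max_score"] for item in criteria["checklist"])
    total = 0.0
    for item in criteria["checklist"]:
        points = awarded.get(item["name"], 0)
        total += min(max(points, 0), item["max_score"])
    return 100.0 * total / total_max


criteria = json.loads(CRITERIA_JSON)

# Hypothetical grading: full marks on every item except "How to Iterate".
awarded = {
    item["name"]: item["max_score"]
    for item in criteria["checklist"]
    if item["name"] != "How to Iterate"
}
print(score_response(criteria, awarded))
```

The `max_score` values in this file sum to 100, so under this assumption the awarded points read directly as a percentage.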

evals

scenario-1

scenario-2

scenario-3

evals/scenario-3/criteria.json