Use when work needs to be handed off to another agent or another human. Produce a continuation-ready brief with the objective, completed work, assumptions, unresolved issues, and next action instead of a generic summary. Good triggers include "prepare a handoff", "make this resumable", and "summarize this for another agent".
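As a minimal sketch of what a continuation-ready brief might look like in code, the snippet below renders the fields named above, plus the Critical References section that the eval checklist expects, as labelled sections. The `Handoff` class, the `render_handoff` function, and the `## ` heading style are illustrative assumptions, not part of the skill itself.

```python
from dataclasses import dataclass


@dataclass
class Handoff:
    """Hypothetical container for the six sections of a handoff brief."""
    objective: str
    completed: list[str]
    assumptions: list[str]
    unresolved: list[str]
    next_action: str
    critical_references: list[str]


def render_handoff(h: Handoff) -> str:
    """Render the brief as labelled sections; list fields become bullets."""
    sections = [
        ("Objective", h.objective),
        ("Completed", h.completed),
        ("Assumptions", h.assumptions),
        ("Unresolved", h.unresolved),
        ("Next Action", h.next_action),
        ("Critical References", h.critical_references),
    ]
    parts: list[str] = []
    for title, body in sections:
        parts.append(f"## {title}")  # heading style is an assumption
        if isinstance(body, str):
            parts.append(body)
        else:
            parts.extend(f"- {item}" for item in body)
        parts.append("")  # blank line between sections
    return "\n".join(parts).rstrip() + "\n"
```

Keeping list fields as plain bullets, with no explanation of how the work was done, matches the rubric's preference for stating what was done over how it works.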
92
100%
Does it follow best practices?
Impact
89%
1.41x
Average score across 8 eval scenarios
Passed
No known issues
{
"context": "The agent must produce a handoff for a partial test suite build. This scenario tests whether the agent clearly states the overall objective and what has been completed — and whether these sections are distinct, factual, and appropriately brief.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Objective section present",
"description": "The document has a section explicitly labelled 'Objective' (or equivalent heading).",
"max_score": 8
},
{
"name": "Objective states the goal",
"description": "The Objective section states the overarching goal (e.g. achieving a specific coverage target across the three payment flows), not just 'write more tests'.",
"max_score": 10
},
{
"name": "Completed section present",
"description": "The document has a section explicitly labelled 'Completed' (or equivalent heading).",
"max_score": 8
},
{
"name": "Completed reflects actual done work",
"description": "The Completed section lists only work that is genuinely done (checkout tests, refund tests, CI integration) and does not include in-progress or future items.",
"max_score": 10
},
{
"name": "Assumptions section present",
"description": "The document has an Assumptions section with at least one stated assumption.",
"max_score": 7
},
{
"name": "Assumptions section covers mock requirement",
"description": "The Assumptions section or another appropriate section explicitly notes that BillingProvider must be mocked in tests.",
"max_score": 12
},
{
"name": "Unresolved section present",
"description": "The document has an Unresolved section listing open items or gaps.",
"max_score": 7
},
{
"name": "Next Action section present",
"description": "The document has a Next Action section.",
"max_score": 7
},
{
"name": "Next action is concrete",
"description": "The Next Action identifies a specific starting point (a file, flow, or test to write first) rather than saying 'continue testing'.",
"max_score": 10
},
{
"name": "Critical References section present",
"description": "The document has a Critical References section with at least one entry.",
"max_score": 7
},
{
"name": "All six sections present",
"description": "The document contains all six prescribed sections: Objective, Completed, Assumptions, Unresolved, Next Action, Critical References.",
"max_score": 7
},
{
"name": "No implementation explanation",
"description": "The Completed section does NOT explain how the tests work; it states what was done (bullet points at most), keeping detail to the minimum needed for continuity.",
"max_score": 7
}
]
}
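One way a `weighted_checklist` rubric like the one above could be turned into a percentage is sketched below. How the registry actually aggregates scores is not stated here, so the `score_checklist` function, the `awarded` mapping, and the clamping behaviour are all assumptions made for illustration.

```python
def score_checklist(rubric: dict, awarded: dict[str, float]) -> float:
    """Return the share of available points earned, as a percentage.

    `rubric` follows the shape shown above: a "checklist" list whose items
    carry "name" and "max_score". `awarded` maps item names to the points
    a grader gave; missing items count as zero (an assumption).
    """
    items = rubric["checklist"]
    total = sum(item["max_score"] for item in items)
    if total == 0:
        return 0.0
    earned = 0.0
    for item in items:
        points = awarded.get(item["name"], 0)
        # Clamp each award into [0, max_score] so one item cannot overflow.
        earned += min(max(points, 0), item["max_score"])
    return 100.0 * earned / total


# Usage with two items copied from the rubric; the awarded points are made up.
rubric = {
    "type": "weighted_checklist",
    "checklist": [
        {"name": "Objective section present", "max_score": 8},
        {"name": "Objective states the goal", "max_score": 10},
    ],
}
awarded = {"Objective section present": 8, "Objective states the goal": 5}
print(f"{score_checklist(rubric, awarded):.1f}%")
```

The `max_score` values in the full rubric sum to 100, so under this scheme the earned points would read directly as a percentage.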