Validate agent handoff packets and resume readiness using schema, freshness, and replay checks. Use when tasks pause/resume across sessions, agents, or humans — including when a user wants to continue where they left off, hand off to another agent, resume a previous task, or pick up an interrupted workflow. Includes explicit untrusted-content/prompt-injection guardrails for third-party inputs.
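The schema and freshness checks named above can be sketched roughly as follows. This is only an illustration under stated assumptions: the field names in `REQUIRED_FIELDS`, the 24-hour `MAX_AGE` window, and the mapping of failures to clean/operational/critical are all hypothetical, since the skill's actual schema and thresholds are not shown here.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical required field set; the skill's real schema is not shown here.
REQUIRED_FIELDS = ("task_id", "summary", "next_step", "created_at")
# Assumed freshness window; pick a value that suits the workflow.
MAX_AGE = timedelta(hours=24)


def check_packet(packet: dict, now: datetime) -> tuple[str, list[str]]:
    """Return (classification, problems) for a handoff packet.

    `now` must be timezone-aware. The classification mapping is an
    assumption: missing fields -> critical, stale packet -> operational.
    """
    problems: list[str] = []

    # Schema check: every required field must be present and non-empty.
    missing = [name for name in REQUIRED_FIELDS if not packet.get(name)]
    if missing:
        problems.append("missing fields: " + ", ".join(missing))
        return "critical", problems

    # Freshness check: compare the packet timestamp against the window.
    created = datetime.fromisoformat(packet["created_at"])
    if created.tzinfo is None:
        created = created.replace(tzinfo=timezone.utc)  # assume UTC if naive
    if now - created > MAX_AGE:
        problems.append("packet is stale")
        return "operational", problems

    return "clean", problems
```

A caller would pass the parsed packet and the current time, for example `check_packet(packet, datetime.now(timezone.utc))`, and refuse to resume on anything other than `"clean"` until the listed problems are resolved. Replay checks and untrusted-content guardrails are not covered by this sketch.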
96
Quality: 100% (Does it follow best practices?)
Impact: 96% (1.50x)
Average score across 9 eval scenarios
{
  "context": "Evaluates whether the agent produced outputs aligned with the skill's reliability-check workflow.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Required fields",
      "description": "Specifies a concrete required field set for handoff packets",
      "max_score": 20
    },
    {
      "name": "Freshness and token checks",
      "description": "Includes explicit freshness and token validation checks",
      "max_score": 20
    },
    {
      "name": "Replay test",
      "description": "Defines replay questions or equivalent resumability test",
      "max_score": 20
    },
    {
      "name": "Classification mapping",
      "description": "Maps failure modes to clean/operational/critical",
      "max_score": 20
    },
    {
      "name": "Example outputs",
      "description": "Includes both pass and fail example summaries",
      "max_score": 20
    }
  ]
}
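One plausible way to turn a `weighted_checklist` like the one above into a single percentage is sketched below. The aggregation rule (clamp each awarded score to its `max_score`, sum, divide by the total of all `max_score` values) is an assumption, as are the function name `weighted_score` and the sample awarded values; the evaluator's actual scoring method is not documented here.

```python
# Checklist items copied from the criteria above (names and max_score only).
CRITERIA = {
    "type": "weighted_checklist",
    "checklist": [
        {"name": "Required fields", "max_score": 20},
        {"name": "Freshness and token checks", "max_score": 20},
        {"name": "Replay test", "max_score": 20},
        {"name": "Classification mapping", "max_score": 20},
        {"name": "Example outputs", "max_score": 20},
    ],
}


def weighted_score(criteria: dict, awarded: dict) -> float:
    """Return a 0-100 percentage from per-item awarded scores.

    Assumed rule: items missing from `awarded` count as 0, and each
    awarded score is clamped to the range [0, max_score].
    """
    items = criteria["checklist"]
    total_max = sum(item["max_score"] for item in items)
    earned = 0.0
    for item in items:
        raw = awarded.get(item["name"], 0)
        earned += min(max(raw, 0), item["max_score"])
    return 100.0 * earned / total_max


# Hypothetical awarded scores, for illustration only.
sample = {
    "Required fields": 20,
    "Freshness and token checks": 20,
    "Replay test": 20,
    "Classification mapping": 20,
    "Example outputs": 16,
}
print(weighted_score(CRITERIA, sample))
```

Because the five items carry equal weight and their `max_score` values sum to 100, each point awarded maps directly to one percentage point under this rule.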