Creates boundary-point validation contracts, defines invariant-based success criteria, and sets up automated verification probes so reliability workflows trigger on objective evidence rather than intuition. Use when designing robust handoff, memory-persistence, or tool-call reliability workflows; when you need to verify handoffs work, check memory persistence, validate tool calls succeeded, or convert vague reliability goals into concrete, testable checks at each boundary point with explicit failure-class mapping (operational vs. critical); or when you want to test your workflow end-to-end, make sure it works, or verify your automation runs correctly using read-back probes and escalation triggers rather than agent confidence. Includes explicit untrusted-content/prompt-injection guardrails for third-party inputs.
96
Quality
90%
Does it follow best practices?
Impact
98%
1.25xAverage score across 9 eval scenarios
An AI customer service system stores user conversation context in a JSON-based key-value store. When a user returns for a follow-up session, the agent is supposed to "remember" prior preferences, issue history, and open action items. Engineering has observed that some sessions start cold — the agent answers as if no prior conversation occurred — even though no errors appear in the application logs. The suspected cause is a race condition where stale or deserialisation-failed cache entries are silently ignored.
The team needs a detectability contract for the memory-resume boundary so that every time the agent attempts to restore session context, the restoration is validated before the session begins. If the context is unrestorable for any reason, the system should know exactly why and what to do about it, rather than silently defaulting to an empty session.
Produce the following files:
contract.md — A boundary contract document covering the memory-resume stage of this workflow, with all required columns filled in. Include the specific verification probes that are appropriate for checking memory persistence.resume_check.py — A Python script that accepts a JSON file path as a command-line argument. The script should load the file, check that the required session keys are present (session_id, user_id, context, timestamp), verify the timestamp is fresh (within 5 minutes of current time), and print a structured report to stdout indicating which checks passed and which failed.The script should be runnable with: python resume_check.py <path_to_session_file>