WebSocket vs SSE vs polling, reconnection with backoff and jitter, heartbeats, backpressure, message ordering, connection state UI, auth on upgrade, graceful degradation
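
As a rough illustration of the "reconnection with backoff and jitter" item, the sketch below computes a retry delay using the "full jitter" variant of exponential backoff. The helper name `backoffDelay`, the `BackoffOptions` shape, and the default values are assumptions made for this example, not something the skill prescribes.

```typescript
// Hypothetical helper: names and default values are illustrative assumptions.
interface BackoffOptions {
  baseMs: number; // ceiling for the first retry delay
  maxMs: number;  // upper bound on any single delay
}

// "Full jitter": pick a uniform random delay in [0, min(maxMs, baseMs * 2^attempt)).
// The random source is injectable so the function can be tested deterministically.
function backoffDelay(
  attempt: number,
  opts: BackoffOptions = { baseMs: 500, maxMs: 30_000 },
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxMs, opts.baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

A client would typically call `setTimeout(reconnect, backoffDelay(attempt))` and reset `attempt` to 0 once a connection succeeds. Randomizing the delay spreads out reconnect attempts so that many clients dropped at the same moment do not all retry in lockstep.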
94
98%
Does it follow best practices?
Impact
90%
1.87x average score across 5 eval scenarios
Passed
No known issues
{
"context": "Tests whether the agent uses SSE (one-way updates) or WebSocket with room scoping for order-specific updates, and proactively handles reconnection, heartbeats, connection state UI, and state recovery. The task says 'pushed to the customer' but does NOT specify transport, reconnection, heartbeats, or connection state handling.",
"type": "weighted_checklist",
"checklist": [
{
"name": "appropriate-transport-choice",
"description": "Agent chooses SSE (simpler, one-way updates suffice) or WebSocket with rooms. Either is acceptable as long as the choice is reasonable. Using polling for 'real-time' order tracking is not acceptable.",
"max_score": 10
},
{
"name": "scoped-to-specific-order",
"description": "Updates are scoped to the specific order being tracked (via SSE endpoint per order, or WebSocket room per order) rather than broadcasting all order updates to all customers",
"max_score": 10
},
{
"name": "reconnection-handled",
"description": "Client handles reconnection automatically -- either EventSource (built-in) or WebSocket with exponential backoff and jitter. The agent was NOT asked about reconnection.",
"max_score": 12
},
{
"name": "state-recovery-on-reconnect",
"description": "After reconnection, client fetches the current order state to ensure the status displayed is current, not stale from before the disconnect. The agent was NOT asked about state recovery.",
"max_score": 10
},
{
"name": "connection-state-ui",
"description": "Tracking page shows a visible indicator when the connection is lost or reconnecting. Uses role='status' or aria-live for accessibility. The agent was NOT asked about connection state UI.",
"max_score": 10
},
{
"name": "heartbeat-configured",
"description": "Server sends periodic heartbeats to keep the connection alive through proxies. The agent was NOT asked about heartbeats.",
"max_score": 8
},
{
"name": "event-ids-or-sequence-numbers",
"description": "Messages include event IDs or sequence numbers so the client can detect missed updates and the server can replay them on reconnection. The agent was NOT asked about message ordering.",
"max_score": 8
},
{
"name": "client-cleanup-on-disconnect",
"description": "Server removes disconnected clients from tracking. Client hook returns cleanup function for unmount.",
"max_score": 6
},
{
"name": "proper-sse-or-ws-headers",
"description": "If SSE: correct Content-Type, Cache-Control, Connection headers. If WebSocket: proper CORS configuration.",
"max_score": 4
},
{
"name": "typescript-types-defined",
"description": "Order and status types are defined with TypeScript interfaces.",
"max_score": 4
},
{
"name": "graceful-degradation",
"description": "Client falls back to polling if the real-time transport fails to connect. The agent was NOT asked about fallback behavior.",
"max_score": 6
},
{
"name": "dead-client-detection",
"description": "Server detects and cleans up dead connections rather than accumulating references. The agent was NOT asked about dead client detection.",
"max_score": 4
}
]
}
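
The checklist's event-ID, heartbeat, and replay items can be sketched for the SSE case as follows. This is a minimal sketch under stated assumptions: the `OrderStatus` values, the `OrderEvent` field names, the `status` event name, and both helper functions are hypothetical and chosen only for illustration. The frame layout (`id:`, `event:`, `data:`, blank-line terminator, and `:`-prefixed comment lines) follows the standard server-sent events format.

```typescript
// Assumed shapes: status values and field names are illustrative, not taken from the rubric.
type OrderStatus = "placed" | "preparing" | "out_for_delivery" | "delivered";

interface OrderEvent {
  seq: number; // increases by one for each update to a given order
  orderId: string;
  status: OrderStatus;
}

// Serialize one update as an SSE frame. The `id:` field is what the browser
// sends back in the Last-Event-ID header when EventSource reconnects.
function toSseFrame(ev: OrderEvent): string {
  return `id: ${ev.seq}\nevent: status\ndata: ${JSON.stringify(ev)}\n\n`;
}

// A line starting with a colon is an SSE comment: clients ignore it, but the
// bytes on the wire keep idle connections from being closed by proxies.
const HEARTBEAT_FRAME = ": heartbeat\n\n";

// Select the events a server would replay after a reconnect, given the last
// event ID the client reports having seen.
function eventsToReplay(
  log: OrderEvent[],
  lastEventId: string | undefined,
): OrderEvent[] {
  const lastSeq = lastEventId === undefined ? 0 : Number(lastEventId);
  if (!Number.isFinite(lastSeq)) return log; // unparseable ID: resend everything
  return log.filter((ev) => ev.seq > lastSeq);
}
```

Because the frames carry a named event, a browser client would listen with `source.addEventListener("status", handler)` rather than `onmessage`. The server would write `HEARTBEAT_FRAME` on a timer and, on each new connection, write `eventsToReplay(log, lastEventIdHeader).map(toSseFrame)` before streaming live updates.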