WebSocket vs SSE vs polling, reconnection with backoff and jitter, heartbeats, backpressure, message ordering, connection state UI, auth on upgrade, graceful degradation
94
98%
Does it follow best practices?
Impact
90%
1.87x
Average score across 5 eval scenarios
Passed
No known issues
{
  "context": "Tests whether the agent proactively handles real-time reliability patterns when building a bidirectional chat system. The task does NOT mention reconnection, heartbeats, backpressure, message ordering, connection state UI, or authentication. The agent should add these proactively.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "reconnection-with-backoff",
      "description": "Client reconnects automatically after connection loss using exponential backoff with jitter (Socket.IO config with reconnectionDelay/reconnectionDelayMax/randomizationFactor, or manual backoff for raw WebSocket). The agent was NOT asked about reconnection.",
      "max_score": 12
    },
    {
      "name": "state-recovery-on-reconnect",
      "description": "After reconnection, the client re-joins chat rooms and re-fetches recent messages to fill any gaps from the disconnect period. The agent was NOT asked about state recovery.",
      "max_score": 10
    },
    {
      "name": "connection-state-ui",
      "description": "The UI shows a visible indicator when the connection is lost or reconnecting (banner, toast, or status text). Uses role='status' or aria-live for accessibility. The agent was NOT asked about connection state UI.",
      "max_score": 10
    },
    {
      "name": "heartbeat-or-ping-configured",
      "description": "Server configures heartbeat/ping interval (Socket.IO pingInterval/pingTimeout, or manual ping for raw WebSocket) to detect dead connections. The agent was NOT asked about heartbeats.",
      "max_score": 8
    },
    {
      "name": "auth-during-handshake",
      "description": "WebSocket/Socket.IO connection authenticates during the handshake (io.use() middleware or token in handshake.auth), not after. The agent was NOT asked about authentication on the WebSocket layer.",
      "max_score": 8
    },
    {
      "name": "room-based-messaging",
      "description": "Messages are scoped to chat rooms (socket.join/io.to) rather than broadcast to all connected clients",
      "max_score": 8
    },
    {
      "name": "message-ordering",
      "description": "Messages include timestamps or sequence numbers, and the UI sorts or validates message order rather than trusting arrival order. The agent was NOT asked about message ordering.",
      "max_score": 8
    },
    {
      "name": "message-deduplication",
      "description": "Client deduplicates messages using message IDs to prevent showing the same message twice after reconnection. The agent was NOT asked about deduplication.",
      "max_score": 6
    },
    {
      "name": "client-cleanup-on-disconnect",
      "description": "Server removes disconnected clients from its tracking data structures (Map/Set cleanup on disconnect event)",
      "max_score": 6
    },
    {
      "name": "authorization-on-room-join",
      "description": "When a client joins a chat room, the server verifies the user is authorized to access that chat session. The agent was NOT asked about authorization.",
      "max_score": 6
    },
    {
      "name": "dispose-cleanup",
      "description": "Client hook/component returns a cleanup function that disconnects the socket on unmount to prevent memory leaks and zombie connections",
      "max_score": 4
    },
    {
      "name": "websocket-chosen-over-sse",
      "description": "WebSocket (or Socket.IO) is chosen for this bidirectional chat use case, not SSE or polling",
      "max_score": 4
    },
    {
      "name": "typescript-types-defined",
      "description": "Message, Chat, and related types are defined with TypeScript interfaces. No use of 'any' for message payloads.",
      "max_score": 4
    },
    {
      "name": "backpressure-awareness",
      "description": "Server checks bufferedAmount or write backpressure before sending to clients, or at minimum does not blindly broadcast to all clients without readyState checks. The agent was NOT asked about backpressure.",
      "max_score": 6
    }
  ]
}
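
For the raw-WebSocket branch of the `reconnection-with-backoff` criterion, the delay calculation might look roughly like the sketch below. The helper name `backoffDelay`, the `BackoffOptions` shape, and the default values are assumptions made for illustration; they are not part of the checklist or of any library API. The `jitter` field plays a role similar to the `randomizationFactor` option the checklist mentions for Socket.IO.

```typescript
// Hypothetical helper: names and defaults are illustrative assumptions.
interface BackoffOptions {
  baseMs: number; // delay before the first retry
  maxMs: number;  // upper bound on any single delay
  jitter: number; // 0..1, fraction of the delay to randomize
}

function backoffDelay(
  attempt: number,
  opts: BackoffOptions = { baseMs: 1000, maxMs: 30000, jitter: 0.5 },
  random: () => number = Math.random, // injectable so the function can be tested
): number {
  // Exponential growth: base, 2x base, 4x base, ... capped at maxMs.
  const exp = Math.min(opts.maxMs, opts.baseMs * Math.pow(2, attempt));
  // Spread the delay over [exp * (1 - jitter), exp * (1 + jitter)].
  const deviation = exp * opts.jitter * (2 * random() - 1);
  // Re-apply the cap after jitter and never return a negative delay.
  return Math.min(opts.maxMs, Math.max(0, Math.round(exp + deviation)));
}
```

A client would typically call `setTimeout(connect, backoffDelay(attempt))` in its `close` handler and reset `attempt` to 0 once the socket opens again. The jitter keeps many clients from reconnecting at the same instant after a server restart.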
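
The `message-ordering`, `message-deduplication`, and `typescript-types-defined` criteria can be satisfied together on the client. The following is a minimal sketch, assuming each message carries a unique `id`, a server-assigned per-room `seq`, and a `sentAt` timestamp. The `ChatMessage` fields and the `mergeMessages` name are hypothetical and chosen only for this example.

```typescript
// Hypothetical message shape: field names are assumptions for this sketch.
interface ChatMessage {
  id: string;     // unique message ID, used for deduplication
  roomId: string;
  seq: number;    // assumed server-assigned, per-room sequence number
  sentAt: number; // epoch milliseconds, used as a tie-breaker
  body: string;
}

// Merge messages already shown in the UI with newly received or re-fetched
// ones, dropping duplicates by ID and sorting instead of trusting arrival order.
function mergeMessages(
  current: readonly ChatMessage[],
  incoming: readonly ChatMessage[],
): ChatMessage[] {
  const byId = new Map<string, ChatMessage>();
  for (const msg of [...current, ...incoming]) {
    if (!byId.has(msg.id)) byId.set(msg.id, msg); // first copy wins
  }
  return Array.from(byId.values()).sort(
    (a, b) => a.seq - b.seq || a.sentAt - b.sentAt,
  );
}
```

Running the same function over the history re-fetched after a reconnect would also cover the gap-filling half of `state-recovery-on-reconnect`, since messages the client already has are simply ignored.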