Idempotent API design — safe retries for POST endpoints, idempotency keys,
93
90%
Does it follow best practices?
Impact
100%
10.00x
Average score across 4 eval scenarios
Passed
No known issues
{
"context": "Tests whether the agent proactively adds idempotency protection to a payment endpoint, even though the task only says 'build a POST /api/payments endpoint' and mentions it should be 'production-quality.' A skilled agent should recognize that a payment endpoint absolutely needs idempotency to prevent double-charges.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Idempotency key accepted",
"description": "The server accepts an idempotency key from the client — either as an 'Idempotency-Key' header or as a field in the request body — and uses it to prevent duplicate payment creation",
"max_score": 18
},
{
"name": "UNIQUE constraint on idempotency key",
"description": "The payments table DDL includes a UNIQUE constraint on an idempotency key column (or equivalent unique identifier supplied by the client) to prevent duplicate insertions at the database level",
"max_score": 16
},
{
"name": "Duplicate returns existing record",
"description": "When a duplicate payment is detected (via constraint violation or cache hit), the service returns the existing payment record rather than an error, allowing safe retries",
"max_score": 14
},
{
"name": "Non-duplicate errors re-thrown",
"description": "If the code catches constraint violations, it only suppresses the specific duplicate error — other database errors are re-thrown, not swallowed",
"max_score": 10
},
{
"name": "Processing state tracked",
"description": "The service tracks a 'processing' or 'in-progress' state for each idempotency key so that concurrent duplicate requests can be detected while the first is still executing",
"max_score": 14
},
{
"name": "409 or retry for concurrent duplicates",
"description": "When a concurrent request arrives with an idempotency key that is currently being processed, the service returns 409 Conflict or an equivalent signal rather than creating a duplicate",
"max_score": 12
},
{
"name": "5xx errors not cached",
"description": "If idempotency caching is implemented, server errors (5xx) are NOT cached — the key is cleared so the client can retry",
"max_score": 10
},
{
"name": "schema.sql present with idempotency column",
"description": "The schema.sql file includes an idempotency_key column (or equivalent client-supplied unique identifier) in the payments table",
"max_score": 6
}
]
}