Production error handling for FastAPI — exception handlers, structured error
96%
Does it follow best practices?
Impact
98%
6.12x Average score across 5 eval scenarios
Passed
No known issues
{
"context": "Tests whether the agent proactively applies FastAPI error handling best practices when building a payment API. The task says nothing about exception handlers, structured error responses, or validation error formatting -- the agent should add these on its own.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Custom exception hierarchy defined",
"description": "Agent defines a base exception class (e.g. AppError) that extends HTTPException or Exception, plus specific subclasses for not-found, validation, conflict, or business-logic errors. The agent was NOT asked to create custom exceptions.",
"max_score": 12
},
{
"name": "Exception handler for custom errors registered",
"description": "Agent registers an @app.exception_handler for the custom exception class that returns a structured JSON response with a machine-readable code and human-readable message. The agent was NOT asked to register exception handlers.",
"max_score": 14
},
{
"name": "RequestValidationError handler registered",
"description": "Agent registers an @app.exception_handler(RequestValidationError) that reformats Pydantic validation errors into a structured response with field-level details, instead of relying on FastAPI's default 422 format. The agent was NOT asked to customize validation errors.",
"max_score": 14
},
{
"name": "Generic Exception handler registered",
"description": "Agent registers a catch-all @app.exception_handler(Exception) that returns a safe generic error message (like 'Internal server error') without leaking stack traces or implementation details. The agent was NOT asked to handle unexpected errors.",
"max_score": 14
},
{
"name": "Consistent structured error response format",
"description": "All error responses use the same JSON shape (e.g. {\"error\": {\"code\": \"...\", \"message\": \"...\"}} or similar), with a machine-readable error code/type and a human-readable message. No mixing of different error shapes across routes.",
"max_score": 12
},
{
"name": "No stack traces leaked in error responses",
"description": "Error responses for unexpected failures do not include Python tracebacks, file paths, or raw exception messages. A generic safe message is returned instead.",
"max_score": 10
},
{
"name": "Business logic errors use appropriate HTTP status codes",
"description": "Different error conditions use semantically correct HTTP status codes: 404 for not found, 400 or 422 for validation, 409 for conflicts (duplicate email, already refunded), 402 or 400 for insufficient balance.",
"max_score": 8
},
{
"name": "Errors are logged server-side",
"description": "Unexpected/internal errors are logged using logging, structlog, loguru, or similar before returning the safe response. The agent was NOT asked to add logging.",
"max_score": 8
},
{
"name": "Pydantic validation details include field names",
"description": "The custom RequestValidationError handler extracts and returns per-field error information (field name and error message) rather than returning a single opaque validation error string.",
"max_score": 8
}
]
}