Security essentials for Flask APIs — CORS, Talisman security headers, rate
99
94%
Does it follow best practices?
Impact
100%
1.17x
Average score across 10 eval scenarios
Passed
No known issues
{
  "context": "The agent was asked to add secure error handlers to a Flask financial API that was leaking exception details in HTTP responses. The scorer checks api.py for a 500 handler that returns a generic message, logs internally, and a 404 handler that returns JSON.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "500 handler registered",
      "description": "api.py has a function decorated with @app.errorhandler(500) or equivalent",
      "max_score": 12
    },
    {
      "name": "500 returns generic message",
      "description": "The 500 handler returns jsonify({'error': 'Internal server error'}) or a similarly generic message — NOT str(e) or any exception detail",
      "max_score": 15
    },
    {
      "name": "500 returns correct status code",
      "description": "The 500 handler returns HTTP status code 500",
      "max_score": 8
    },
    {
      "name": "No str(e) in 500 response",
      "description": "The 500 handler does NOT include str(e), str(error), or any variable derived from the exception in the response body",
      "max_score": 15
    },
    {
      "name": "app.logger.exception() called",
      "description": "The 500 handler calls app.logger.exception(...) or app.logger.error(...) to log the exception internally",
      "max_score": 15
    },
    {
      "name": "404 handler registered",
      "description": "api.py has a function decorated with @app.errorhandler(404) or equivalent",
      "max_score": 12
    },
    {
      "name": "404 returns JSON",
      "description": "The 404 handler returns a jsonify() response (not an HTML string or default Flask response)",
      "max_score": 10
    },
    {
      "name": "404 returns correct status code",
      "description": "The 404 handler returns HTTP status code 404",
      "max_score": 8
    },
    {
      "name": "Original routes preserved",
      "description": "Both original routes (GET /accounts/<id> and POST /accounts) are still present and functional in api.py",
      "max_score": 5
    }
  ]
}
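
For illustration, here is a minimal sketch of an api.py that would meet the checklist items above. The actual api.py from the scenario is not shown here, so the in-memory `ACCOUNTS` store, the route bodies, the `<int:account_id>` converter, and the exact error strings are all assumptions. Only the two handlers reflect what the criteria describe: a generic JSON body, the matching status code, and internal logging with no exception detail in the response.

```python
from flask import Flask, abort, jsonify, request

app = Flask(__name__)

# Hypothetical in-memory store standing in for the real data layer.
ACCOUNTS = {1: {"id": 1, "owner": "alice"}}


@app.route("/accounts/<int:account_id>", methods=["GET"])
def get_account(account_id):
    account = ACCOUNTS.get(account_id)
    if account is None:
        abort(404)  # routed to the JSON 404 handler below
    return jsonify(account)


@app.route("/accounts", methods=["POST"])
def create_account():
    payload = request.get_json(force=True)
    new_id = max(ACCOUNTS, default=0) + 1
    # A missing "owner" key raises KeyError, which surfaces as a 500.
    ACCOUNTS[new_id] = {"id": new_id, "owner": payload["owner"]}
    return jsonify(ACCOUNTS[new_id]), 201


@app.errorhandler(404)
def not_found(error):
    # JSON body instead of Flask's default HTML error page.
    return jsonify({"error": "Not found"}), 404


@app.errorhandler(500)
def internal_error(error):
    # Log the full traceback server-side only.
    app.logger.exception("Unhandled exception while serving request")
    # Nothing derived from the exception goes into the response body.
    return jsonify({"error": "Internal server error"}), 500
```

With Flask's test client (and `TESTING` left off so exceptions are not propagated), posting `{}` to `/accounts` should produce the generic 500 body, while the traceback appears only in the application log.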