tessl-labs/flask-testing

Write correct Flask tests -- app factory with test config, application context fixtures, database isolation, file uploads, auth testing, error handlers, mock.patch placement, and essential API test patterns

98 · 1.15x

Quality: 99% (Does it follow best practices?)

Impact: 97% · 1.15x (average score across 5 eval scenarios)

Security by Snyk: Passed, no known issues

evals/scenario-2/criteria.json

{
  "context": "Tests whether the agent proactively applies Flask testing best practices for a checkout flow with external service mocking. The task mentions the import paths for charge_card and send_confirmation_email but does NOT tell the agent where to patch -- the agent should patch where they are used, not where defined.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "mock.patch targets import location",
      "description": "The agent patches 'app.routes.orders.charge_card' and 'app.routes.orders.send_confirmation_email' (where they are used), NOT 'app.services.payment.charge_card' or 'app.services.email.send_confirmation_email' (where they are defined).",
      "max_score": 18
    },
    {
      "name": "Mock args before fixture args",
      "description": "When using @patch with pytest, mock arguments come before fixture arguments in the function signature (e.g., def test_checkout(mock_charge, mock_email, auth_client)).",
      "max_score": 12
    },
    {
      "name": "App context in fixture",
      "description": "The app fixture wraps in 'with app.app_context():' and yields from inside it.",
      "max_score": 12
    },
    {
      "name": "SQLAlchemy test isolation",
      "description": "Uses test database with db.create_all() and proper cleanup (drop_all or rollback). Function-scoped, not session-scoped.",
      "max_score": 10
    },
    {
      "name": "Auth fixture",
      "description": "Creates an auth_client or authenticated client fixture that logs in through the /auth/login endpoint.",
      "max_score": 10
    },
    {
      "name": "Happy path checkout test",
      "description": "Tests the full checkout flow: create cart, add items, checkout, verify order created with status='confirmed', verify stock decremented.",
      "max_score": 10
    },
    {
      "name": "Payment failure test",
      "description": "Tests that when charge_card raises PaymentError, the order is created with status='pending' and send_confirmation_email is NOT called.",
      "max_score": 10
    },
    {
      "name": "Validation tests",
      "description": "Tests edge cases: empty cart, out-of-stock items, cart belonging to different user, nonexistent cart_id.",
      "max_score": 8
    },
    {
      "name": "Test config with TESTING=True",
      "description": "Passes test config with TESTING=True and separate database URI.",
      "max_score": 6
    },
    {
      "name": "Email mock assertion",
      "description": "On successful checkout, asserts that send_confirmation_email was called (mock_email.assert_called_once() or similar).",
      "max_score": 4
    }
  ]
}
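
The first checklist item (patch where the name is used, not where it is defined) can be illustrated with a minimal, self-contained sketch. The modules `demo_payment` and `demo_orders` below are hypothetical stand-ins for `app.services.payment` and `app.routes.orders`; they are built in memory so the example runs without a Flask project, and the function bodies are assumptions made only for the demonstration.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in for app.services.payment (where charge_card is defined).
payment = types.ModuleType("demo_payment")
payment.charge_card = lambda amount: "real charge"
sys.modules["demo_payment"] = payment

# Hypothetical stand-in for app.routes.orders (where charge_card is used).
# The `from ... import` copies the name into this module's own namespace.
orders = types.ModuleType("demo_orders")
orders_source = """
from demo_payment import charge_card

def checkout(amount):
    return charge_card(amount)
"""
exec(orders_source, orders.__dict__)
sys.modules["demo_orders"] = orders

# Patching the defining module does not affect the name already bound in demo_orders.
with patch("demo_payment.charge_card", return_value="mocked"):
    patched_at_definition = orders.checkout(10)

# Patching the module that uses the name replaces what checkout() actually looks up.
with patch("demo_orders.charge_card", return_value="mocked"):
    patched_at_use = orders.checkout(10)

print(patched_at_definition)
print(patched_at_use)
```

Only the second patch changes the behaviour of `checkout`, which is why the checklist expects targets such as `'app.routes.orders.charge_card'`.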

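The "Mock args before fixture args" item follows from how stacked `@patch` decorators inject mocks: positionally, starting with the decorator closest to the function. The sketch below uses a hypothetical `orders` namespace in place of `app.routes.orders` and passes `auth_client` by keyword to imitate how pytest supplies fixtures; the function is named `checkout_case` rather than `test_checkout` only so that it can be called directly here.

```python
import types
from unittest.mock import patch

# Hypothetical stand-in for the module under test.
orders = types.SimpleNamespace(
    charge_card=lambda amount: "real charge",
    send_confirmation_email=lambda to: "real email",
)

@patch.object(orders, "send_confirmation_email")  # outermost: injected second
@patch.object(orders, "charge_card")              # closest to def: injected first
def checkout_case(mock_charge, mock_email, auth_client):
    # Mocks arrive positionally in bottom-up decorator order; the fixture follows.
    return {
        "charge_matches": orders.charge_card is mock_charge,
        "email_matches": orders.send_confirmation_email is mock_email,
        "fixture": auth_client,
    }

# pytest passes fixtures by keyword; a plain keyword argument imitates that here.
result = checkout_case(auth_client="fake-client")
print(result)
```

If the fixture name were listed first in the signature, the first injected mock would be bound to it positionally and the keyword-supplied fixture would then collide with it.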