Name: evilissimo/property-based-testing
Rating: 91.6 (1 review)
Author: evilissimo

Generates **property-based tests** that use randomized input generation to validate invariants and contracts (rather than hand-picked examples). Triggers when the conversation involves: PBT frameworks (Hypothesis library for Python, fast-check for TypeScript, proptest for Rust, rapid for Go, RapidCheck for C++); concepts like invariants, contracts, round-trip symmetry, encode/decode, serialize/deserialize, generative testing, or shrinking; or requests to find edge cases that example-based tests miss — e.g., "find edge cases automatically", "test all possible inputs", "verify this property holds". Does NOT trigger for: writing regular example-based unit tests, debugging, CI/CD setup, UI/component testing, or integration/E2E testing. Identifies up to 7 property patterns (round-trip, idempotence, invariance, metamorphic, inverse, ordering, no-crash), designs input generators, writes property tests, and extracts regression tests from failures.
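Two of the patterns named above, round-trip and idempotence, can be sketched with Hypothesis as follows. This is a minimal illustration rather than output of the skill itself: the `json_values` strategy and the test names are chosen here for the example, and NaN and infinite floats are excluded because they would break equality checks.

```python
import json

from hypothesis import given, strategies as st

# Illustrative strategy (name is an assumption): JSON-compatible values built
# recursively from scalars, lists, and string-keyed dictionaries.
json_values = st.recursive(
    st.none()
    | st.booleans()
    | st.integers()
    | st.floats(allow_nan=False, allow_infinity=False)
    | st.text(),
    lambda children: st.lists(children, max_size=5)
    | st.dictionaries(st.text(), children, max_size=5),
    max_leaves=10,
)


@given(json_values)
def test_json_round_trip(value):
    # Round-trip property: decoding an encoded value returns the original.
    assert json.loads(json.dumps(value)) == value


@given(st.lists(st.integers()))
def test_sorted_is_idempotent(xs):
    # Idempotence property: sorting an already sorted list changes nothing.
    once = sorted(xs)
    assert sorted(once) == once
```

Both functions can be collected by pytest or called directly; each call runs the property against many generated inputs.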

Quality: 90% (Does it follow best practices?)
Impact: 94%, 1.11x (average score across 5 eval scenarios)
Security: Passed (no known issues)

{
  "context": "Tests whether the agent designs Python property-based tests around meaningful invariants, broad generators, Hypothesis APIs, and regression handling for failures.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Hypothesis dependency",
      "description": "Mentions or encodes adding Hypothesis/pytest if the project lacks them.",
      "max_score": 8
    },
    {
      "name": "Correct imports",
      "description": "Imports `given` and `strategies as st` from Hypothesis in the test code.",
      "max_score": 8
    },
    {
      "name": "Given decorators",
      "description": "Uses @given decorators on property tests rather than only example tests.",
      "max_score": 8
    },
    {
      "name": "Multiple properties",
      "description": "Identifies at least three distinct properties or invariants in notes or test names/assertions.",
      "max_score": 10
    },
    {
      "name": "Full input space",
      "description": "Defines generated cart items including edge values such as empty collections, zero or negative quantities, duplicate IDs, and unusual strings or missing optional fields.",
      "max_score": 12
    },
    {
      "name": "Composed strategies",
      "description": "Composes primitive strategies into dictionaries/lists or structured cart item strategies.",
      "max_score": 8
    },
    {
      "name": "Generate act assert",
      "description": "Each property test has a clear generated input, calls normalize_cart, and asserts an invariant.",
      "max_score": 10
    },
    {
      "name": "Meaningful invariants",
      "description": "Checks semantic invariants such as positive output quantities, aggregation conservation, unique product IDs, or total equals rounded quantity times price.",
      "max_score": 12
    },
    {
      "name": "Avoids weak only",
      "description": "Does not rely solely on no-crash assertions when stronger invariants are available.",
      "max_score": 8
    },
    {
      "name": "Shrinking noted",
      "description": "Explains that failing property tests provide a minimal/shrunk counterexample to inspect.",
      "max_score": 8
    },
    {
      "name": "Regression extraction",
      "description": "Includes or describes adding a deterministic regression unit test from any failing shrunk case.",
      "max_score": 8
    }
  ]
}
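
A response that satisfies most of this checklist might look roughly like the sketch below. The rubric names `normalize_cart` but does not show it, so the implementation here is a hypothetical stand-in (it drops non-positive quantities and merges duplicate product IDs), and the field names `product_id`, `quantity`, and `note` are likewise assumptions. The regression case at the end is an invented example of what a shrunk counterexample could look like, not a recorded failure.

```python
from collections import defaultdict

from hypothesis import given, strategies as st


def normalize_cart(items):
    """Hypothetical stand-in: drop non-positive quantities, merge duplicate IDs."""
    merged = defaultdict(int)
    for item in items:
        if item["quantity"] > 0:
            merged[item["product_id"]] += item["quantity"]
    return [{"product_id": pid, "quantity": qty} for pid, qty in sorted(merged.items())]


# Composed strategy: a small ID pool makes duplicates likely, st.text() adds
# unusual strings, and "note" is an optional field that may be absent.
cart_items = st.fixed_dictionaries(
    {
        "product_id": st.sampled_from(["a", "b", "c"]) | st.text(max_size=5),
        "quantity": st.integers(min_value=-5, max_value=50),
    },
    optional={"note": st.text(max_size=10)},
)
carts = st.lists(cart_items, max_size=20)  # includes the empty cart


@given(carts)
def test_output_quantities_are_positive(cart):
    assert all(item["quantity"] > 0 for item in normalize_cart(cart))


@given(carts)
def test_product_ids_are_unique(cart):
    ids = [item["product_id"] for item in normalize_cart(cart)]
    assert len(ids) == len(set(ids))


@given(carts)
def test_positive_quantity_is_conserved(cart):
    # Aggregation conservation: merging must not create or lose units.
    expected = sum(i["quantity"] for i in cart if i["quantity"] > 0)
    assert sum(i["quantity"] for i in normalize_cart(cart)) == expected


def test_regression_duplicate_id_with_zero_quantity():
    # Deterministic regression test pinned from a (hypothetical) shrunk failure.
    cart = [{"product_id": "a", "quantity": 0}, {"product_id": "a", "quantity": 2}]
    assert normalize_cart(cart) == [{"product_id": "a", "quantity": 2}]
```

When a property fails, Hypothesis reports a shrunk, minimal input; copying that input into a plain test like the last one keeps the case covered on every run. Hypothesis also provides an `@example` decorator for pinning such a case directly onto the property.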