Generates **property-based tests** that use randomized input generation to validate invariants and contracts (rather than hand-picked examples). Triggers when the conversation involves: PBT frameworks (Hypothesis library for Python, fast-check for TypeScript, proptest for Rust, rapid for Go, RapidCheck for C++); concepts like invariants, contracts, round-trip symmetry, encode/decode, serialize/deserialize, generative testing, or shrinking; or requests to find edge cases that example-based tests miss — e.g., "find edge cases automatically", "test all possible inputs", "verify this property holds". Does NOT trigger for: writing regular example-based unit tests, debugging, CI/CD setup, UI/component testing, or integration/E2E testing. Identifies up to 7 property patterns (round-trip, idempotence, invariance, metamorphic, inverse, ordering, no-crash), designs input generators, writes property tests, and extracts regression tests from failures.
91
90%
Does it follow best practices?
Impact
94%
1.11xAverage score across 5 eval scenarios
Passed
No known issues
Test in terms of properties (invariants and contracts) rather than just hand-picked examples. Let the computer generate randomized inputs to find edge cases you'd never think of.
| Pattern | Example |
|---|---|
| Round-trip | JSON encode/decode, parse/unparse |
| Idempotence | Dedup, normalize, sanitize |
| Invariance | List length after sort, elements preserved |
| Metamorphic | Scaling both inputs scales output same |
| Inverse | Encrypt/decrypt, push/pop |
| Ordering | Sort result is non-decreasing |
| No crash | Parsing, validation |
Given a function/module, identify at least 2-3 properties:
If finding properties is hard, the function may have too many responsibilities or poorly defined contracts — consider refactoring or making side effects explicit before proceeding.
Select the standard PBT framework for the project's language (Hypothesis for Python, fast-check for TypeScript, proptest for Rust, rapid for Go, RapidCheck for C++). If the framework isn't installed, suggest adding it.
Define strategies that cover the full valid input space — not just "nice" inputs:
Key generator APIs by framework: st.integers() (Hypothesis), fc.integer() (fast-check), any::<i32>() (proptest), rapid.Int() (rapid), rc::gen::arbitrary<int>() (RapidCheck).
Write one test per property. Each test must:
Example — testing a JSON round-trip in Python with Hypothesis:
from hypothesis import given, strategies as st
import json
@given(st.dictionaries(
keys=st.text(min_size=1),
values=st.one_of(st.text(), st.integers(), st.floats(allow_nan=False))))
def test_json_roundtrip(data):
"""Round-trip property: serializing then deserializing yields the original."""
serialized = json.dumps(data)
deserialized = json.loads(serialized)
assert deserialized == dataWhen a property test fails, the framework provides the minimal failing input (shrunk):
in_stock())Always extract the failing case into a concrete, deterministic unit test:
def test_json_roundtrip_regression():
"""Regression test from property-based test failure."""
data = {"": 0} # Minimal failing case: empty string key
serialized = json.dumps(data)
deserialized = json.loads(serialized)
assert deserialized == dataThis locks in the fix and ensures it can't regress when randomized inputs change.
| Mistake | Fix |
|---|---|
| Weak properties ("runs without crashing") | Find meaningful invariants (round-trip, invariance) |
| Over-constrained generators (only "nice" inputs) | Cover the full valid input space, including edges |
| Skipping the shrink on failure | Always read the minimal failing case |
| Not extracting regression tests | File a concrete unit test with the shrunken input |
| Properties that mirror the buggy implementation | Use a different oracle (inverse op, alternative impl) |
| Too few iterations | Default 100 is fine; increase for complex data structures |