investigate flaky, intermittent, non-reproducible, or ci-only python pytest failures by leading with retrace capture and deterministic replay. use when a user reports flaky pytest tests, random failures, tests that pass locally but fail in ci, async/threading/timing flakes, pytest-xdist issues, fixture leakage, monkeypatch leakage, test isolation failures, dependency/environment-sensitive failures, pytest timeouts, or ai-generated code that breaks tests intermittently. guide the agent to preserve the failed execution as a retrace trace, replay it, inspect runtime state, and use ordinary pytest/source/log checks only as supporting triage.
68
86%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Loading evals