Use when the user wants a test suite audit, test quality or reliability review, regression-protection review, unit/integration/e2e test review, coverage or CI signal assessment, flaky CI investigation, fixture-realism review, spec-drift review, or generated-test validation for AI/LLM/agent-written code. Produces severity-ranked findings for weak assertions, oracle gaps, brittle fixtures, over-mocking, CI trust, and generated-code test risks.
Your team has been using an AI coding assistant to accelerate development of paymentcore, an internal Python library that handles payment charges, refunds, and batch processing for the company's e-commerce backend. The assistant helped write both the production code and the accompanying test suite. The library is now used by three downstream services that post ledger entries for every transaction.
A recent code review raised concerns: a senior engineer noticed that some of the tests look plausible on the surface but may not actually catch regressions. Two edge cases slipped into production in the last quarter without any CI signal. Before the team invests in expanding the test suite or wires it into a stricter release gate, they want an honest picture of what the current tests actually protect against.
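To make "plausible on the surface" concrete, here is a minimal sketch of a weak check next to a stricter one. The `refund` helper, its injected-fault twin `buggy_refund`, and both check functions are hypothetical and are not taken from paymentcore; they only illustrate how a test can pass against faulty code.

```python
from decimal import Decimal


def refund(charge_amount: Decimal, refund_amount: Decimal) -> Decimal:
    """Hypothetical refund helper: returns the amount remaining on the charge."""
    if refund_amount <= 0 or refund_amount > charge_amount:
        raise ValueError("refund must be positive and no larger than the charge")
    return charge_amount - refund_amount


def buggy_refund(charge_amount: Decimal, refund_amount: Decimal) -> Decimal:
    """Same helper with an injected fault: the over-refund guard is missing."""
    return charge_amount - refund_amount


def weak_test(fn) -> bool:
    """Plausible-looking check: only verifies that something came back."""
    result = fn(Decimal("10.00"), Decimal("4.00"))
    return result is not None


def strict_test(fn) -> bool:
    """Checks the exact remaining balance and the over-refund edge case."""
    if fn(Decimal("10.00"), Decimal("4.00")) != Decimal("6.00"):
        return False
    try:
        fn(Decimal("10.00"), Decimal("10.01"))
    except ValueError:
        return True
    return False
```

The weak check passes for both the correct and the faulty helper, so it provides no regression signal for the missing guard. The strict check passes only for the correct helper, which is the property an audit of assertion quality is looking for.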
The codebase is under inputs/. The source modules are in inputs/payment/, the tests are in inputs/tests/, and project configuration is in inputs/pyproject.toml.
Produce a file named audit_report.md in your working directory containing the full audit. The report must cover the reliability of the existing tests as regression protection, the quality of assertions, and any structural issues that would prevent the test suite from catching real faults. Include a prioritized remediation plan and list any evidence that was unavailable for the review.