
sharaf/codebase-test-suite-audit

Use when the user wants a test suite audit, test quality or reliability review, regression-protection review, unit/integration/e2e test review, coverage or CI signal assessment, flaky CI investigation, fixture-realism review, spec-drift review, or generated-test validation for AI/LLM/agent-written code. Produces severity-ranked findings for weak assertions, oracle gaps, brittle fixtures, over-mocking, CI trust, and generated-code test risks.



evals/scenario-3/task.md

Test Suite Reliability Audit: ShipFast

Problem/Feature Description

The shipfast library schedules warehouse shipment batches and dispatches orders to carrier APIs. It is used by an operations backend where duplicate dispatches, missed shipments, or incorrect carrier payloads can create customer support incidents.

The team says the tests are "mostly green" locally, but CI failures are often rerun and the nightly job sometimes flakes without a clear owner. Before the team tightens release gates, they want an audit of whether the current tests provide reliable regression signal or mostly exercise happy paths.
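One common source of the kind of flakiness described above is a test that reads the wall clock. The sketch below is only an illustration of the determinism property an auditor would look for: `is_due` is a hypothetical scheduling helper, not a function known to exist in shipfast, and the fixed timestamp is arbitrary.

```python
from datetime import datetime, timedelta, timezone


def is_due(scheduled_at: datetime, now: datetime) -> bool:
    """Hypothetical check: a batch is due once its scheduled time has arrived.

    The current time is passed in rather than read inside the function,
    so a test can control it.
    """
    return now >= scheduled_at


def test_batch_due_at_boundary() -> None:
    # A fixed instant instead of datetime.now(): the result cannot vary between runs.
    fixed_now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    assert is_due(fixed_now, now=fixed_now)
    assert not is_due(fixed_now + timedelta(seconds=1), now=fixed_now)
```

A test written this way gives the same result on a laptop and on a loaded CI runner, which is the property a rerun-prone suite usually lacks.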

The codebase is under inputs/. Source modules are in inputs/shipfast/, tests are in inputs/tests/, and project configuration is in inputs/pyproject.toml.

Output Specification

Produce a file named audit_report.md in your working directory containing the full audit. The report must assess flakiness and determinism, fixture realism, CI signal quality, assertion/oracle strength, and missing evidence. Include a prioritized remediation plan with verification steps.
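To make "assertion/oracle strength" concrete, here is a minimal sketch contrasting a weak check with a stronger one. Everything in it is assumed for illustration: `FakeCarrier`, `dispatch_batch`, `buggy_dispatch_batch`, and the order fields are hypothetical and do not reflect the real shipfast API.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FakeCarrier:
    """Hypothetical in-memory stand-in for a carrier API; records every payload."""
    sent: list = field(default_factory=list)

    def send(self, payload: dict) -> None:
        self.sent.append(payload)


def dispatch_batch(orders: list, carrier: FakeCarrier) -> int:
    """Hypothetical dispatcher: sends each distinct order ID once."""
    seen = set()
    for order in orders:
        if order["id"] in seen:
            continue  # never dispatch the same order twice
        seen.add(order["id"])
        carrier.send({"order_id": order["id"]})
    return len(seen)


def buggy_dispatch_batch(orders: list, carrier: FakeCarrier) -> int:
    """Hypothetical regression: forgets to deduplicate."""
    for order in orders:
        carrier.send({"order_id": order["id"]})
    return len(orders)


Dispatcher = Callable[[list, FakeCarrier], int]


def weak_check(dispatch: Dispatcher, orders: list) -> bool:
    """Weak oracle: passes as long as something was dispatched."""
    return dispatch(orders, FakeCarrier()) > 0


def strong_check(dispatch: Dispatcher, orders: list) -> bool:
    """Stronger oracle: compares the exact sequence of dispatched order IDs."""
    carrier = FakeCarrier()
    dispatch(orders, carrier)
    sent_ids = [p["order_id"] for p in carrier.sent]
    expected_ids = list(dict.fromkeys(o["id"] for o in orders))
    return sent_ids == expected_ids


ORDERS = [{"id": "A1"}, {"id": "A1"}, {"id": "B2"}]
```

With `ORDERS`, the weak check accepts both dispatchers, while the strong check accepts `dispatch_batch` and rejects `buggy_dispatch_batch`. A test that stays green across a duplicate-dispatch regression is the kind of oracle gap the report is asked to surface.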
