Use when the user wants a test suite audit, test quality or reliability review, regression-protection review, unit/integration/e2e test review, coverage or CI signal assessment, flaky CI investigation, fixture-realism review, spec-drift review, or generated-test validation for AI/LLM/agent-written code. Produces severity-ranked findings for weak assertions, oracle gaps, brittle fixtures, over-mocking, CI trust, and generated-code test risks.
Use this skill to audit automated tests for relevance, validity, assertion and oracle strength, risk coverage, maintainability, CI signal quality, flakiness, fixture realism, and LLM-generated or agent-built codebase risks.
The skill is audit-first. It helps an agent inspect repo evidence, build a test system map, classify findings by severity, and produce a concrete remediation plan without rewriting tests unless the user explicitly asks for implementation.
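As a rough illustration of severity-ranked findings, the sketch below models a finding record and sorts a list of them from most to least severe. The `Severity` levels, the `Finding` fields, and `rank_findings` are all hypothetical; the actual finding contract is defined in SKILL.md and is not reproduced here.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical scale; the skill's real severity levels may differ.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass(frozen=True)
class Finding:
    title: str
    severity: Severity
    evidence: str      # e.g. a test file path or CI log reference
    remediation: str   # the concrete fix proposed, not applied


def rank_findings(findings: list[Finding]) -> list[Finding]:
    """Order findings by descending severity, then by title for stability."""
    return sorted(findings, key=lambda f: (-f.severity, f.title))
```

Keeping `remediation` as a field on each finding reflects the audit-first stance described above: the report proposes fixes alongside evidence and leaves implementation to a separate, explicitly requested step.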
`tessl install sharaf/codebase-test-suite-audit`

| Path | Purpose |
|---|---|
| skills/codebase-test-suite-audit/SKILL.md | Main workflow, evidence rules, and finding contract |
| skills/codebase-test-suite-audit/references/report-template.md | Required report headings |
| skills/codebase-test-suite-audit/references/evidence-inventory.md | Evidence statuses and sampling prompts |
| skills/codebase-test-suite-audit/references/audit-domains.md | Domain-specific audit checks |
| skills/codebase-test-suite-audit/references/guardrails-and-success.md | Severity guardrails and completion checks |
| tile.json | Tessl tile manifest and registry summary |
| README.md | Registry-facing overview |
The default deliverable is a test suite audit report with severity-ranked findings and a concrete remediation plan.
Tested on May 22, 2026 across three scenarios:
| Scenario | Baseline | With skill |
|---|---|---|
| Weak oracle and assertionless test detection | 53% | 100% |
| LLM-generated test validity and spec drift audit | 94% | 100% |
| Flaky CI signal and fixture realism audit | 82% | 100% |
| Average | 76% | 100% |
Activation: 3/3 scenarios naturally fired `tessl__codebase-test-suite-audit`.
Single-scenario multi-model spot check:
| Model | Baseline | With skill |
|---|---|---|
| claude-haiku-4-5 | 62% | 99% |
| claude-sonnet-4-6 | 58% | 100% |
| claude-opus-4-6 | 61% | 100% |