test-results-analyzer

Parses and analyzes test output to identify failing tests, flaky tests, coverage gaps, defect patterns, and release readiness. Use when the user shares test results, CI/CD pipeline output, pytest/JUnit/Mocha logs, test coverage reports, or asks about test failures, pass rates, or quality metrics. Produces structured reports with root cause analysis, risk assessment, and prioritized recommendations for development teams and stakeholders.

Quality

85%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Securityby

Passed

No known issues

Test Results Analyzer

Parses raw test data into actionable insights: failure root causes, coverage gaps, defect trends, and release readiness assessments. Applies statistical methods to validate findings and produces stakeholder-specific reports.

Workflow

Step 1: Data Collection and Validation

Ingest test results from provided files, logs, or pasted output
Validate completeness: confirm total tests, pass count, fail count, and coverage data are present
If data is missing or ambiguous, ask the user to clarify before proceeding
Normalize metrics across frameworks (e.g., map pytest output to pass/fail/skip counts)

Step 2: Statistical Analysis and Pattern Recognition

Calculate pass rate, failure rate, and flaky test rate with sample size noted
Apply significance thresholds: flag trends only when sample size ≥ 30 or p < 0.05
Perform correlation analysis between failure clusters and code modules
Flag outliers (e.g., a test suite whose failure rate is >2 standard deviations from baseline)
If data is insufficient for statistical significance, report raw counts and note the limitation explicitly

Step 3: Risk Assessment

Score release readiness against defined criteria (see thresholds below)
Categorize defects by severity and area (functional, performance, security, integration)
Identify coverage gaps in high-risk or high-churn modules
Produce a go/no-go recommendation with supporting evidence and confidence level

Step 4: Reporting

Select report format based on audience: executive summary or technical detail (see TEMPLATES.md)
Include all findings, assumptions, and data limitations
Prioritize recommendations by estimated impact and implementation effort

Release Readiness Thresholds

Criterion	GO Threshold	CONDITIONAL	NO-GO
Test pass rate	≥ 95%	90–94%	< 90%
Line coverage	≥ 80%	70–79%	< 70%
Critical defects open	0	—	≥ 1
Performance SLA met	Yes	Degraded < 10%	Degraded ≥ 10%
Security tests passing	100%	—	< 100%

Adjust thresholds when the user specifies project-specific standards.

Executable Example: Parsing JUnit XML

Full implementations of parse_junit_xml and detect_flaky_tests are in EXAMPLES.md. Invoke them as follows:

# Parse a JUnit XML report
report = parse_junit_xml("test-results.xml")
print(f"Pass rate: {report['summary']['pass_rate']}%")
print(f"Failures by class: {report['failures_by_class']}")

# Detect flaky tests across multiple run result dicts {test_name: "pass"|"fail"}
flaky = detect_flaky_tests(run_results)
print(f"Flaky tests: {flaky}")

Handling Incomplete or Ambiguous Data

Missing coverage data: Report pass/fail metrics only; note coverage is unavailable and recommend running with a coverage tool (e.g., pytest --cov, nyc, jacoco).
Single test run: Cannot compute flaky test rate; note that multiple runs are needed.
No historical baseline: Report absolute values; flag that trend analysis requires prior run data.
Ambiguous failure messages: Quote the raw error and ask the user for context before diagnosing root cause.

Repository: OpenRoster-ai/awesome-agents
Commit: 010799b

Last updated: 4 months ago
Created: 4 months ago

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.