CtrlK
BlogDocsLog inGet started
Tessl Logo

test-results-analyzer

Parses and analyzes test output to identify failing tests, flaky tests, coverage gaps, defect patterns, and release readiness. Use when the user shares test results, CI/CD pipeline output, pytest/JUnit/Mocha logs, test coverage reports, or asks about test failures, pass rates, or quality metrics. Produces structured reports with root cause analysis, risk assessment, and prioritized recommendations for development teams and stakeholders.

87

Quality

85%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

SecuritybySnyk

Passed

No known issues

SKILL.md
Quality
Evals
Security

Test Results Analyzer

Parses raw test data into actionable insights: failure root causes, coverage gaps, defect trends, and release readiness assessments. Applies statistical methods to validate findings and produces stakeholder-specific reports.


Workflow

Step 1: Data Collection and Validation

  • Ingest test results from provided files, logs, or pasted output
  • Validate completeness: confirm total tests, pass count, fail count, and coverage data are present
  • If data is missing or ambiguous, ask the user to clarify before proceeding
  • Normalize metrics across frameworks (e.g., map pytest output to pass/fail/skip counts)

Step 2: Statistical Analysis and Pattern Recognition

  • Calculate pass rate, failure rate, and flaky test rate with sample size noted
  • Apply significance thresholds: flag trends only when sample size ≥ 30 or p < 0.05
  • Perform correlation analysis between failure clusters and code modules
  • Flag outliers (e.g., a test suite whose failure rate is >2 standard deviations from baseline)
  • If data is insufficient for statistical significance, report raw counts and note the limitation explicitly

Step 3: Risk Assessment

  • Score release readiness against defined criteria (see thresholds below)
  • Categorize defects by severity and area (functional, performance, security, integration)
  • Identify coverage gaps in high-risk or high-churn modules
  • Produce a go/no-go recommendation with supporting evidence and confidence level

Step 4: Reporting

  • Select report format based on audience: executive summary or technical detail (see TEMPLATES.md)
  • Include all findings, assumptions, and data limitations
  • Prioritize recommendations by estimated impact and implementation effort

Release Readiness Thresholds

CriterionGO ThresholdCONDITIONALNO-GO
Test pass rate≥ 95%90–94%< 90%
Line coverage≥ 80%70–79%< 70%
Critical defects open0≥ 1
Performance SLA metYesDegraded < 10%Degraded ≥ 10%
Security tests passing100%< 100%

Adjust thresholds when the user specifies project-specific standards.


Executable Example: Parsing JUnit XML

Full implementations of parse_junit_xml and detect_flaky_tests are in EXAMPLES.md. Invoke them as follows:

# Parse a JUnit XML report
report = parse_junit_xml("test-results.xml")
print(f"Pass rate: {report['summary']['pass_rate']}%")
print(f"Failures by class: {report['failures_by_class']}")

# Detect flaky tests across multiple run result dicts {test_name: "pass"|"fail"}
flaky = detect_flaky_tests(run_results)
print(f"Flaky tests: {flaky}")

Handling Incomplete or Ambiguous Data

  • Missing coverage data: Report pass/fail metrics only; note coverage is unavailable and recommend running with a coverage tool (e.g., pytest --cov, nyc, jacoco).
  • Single test run: Cannot compute flaky test rate; note that multiple runs are needed.
  • No historical baseline: Report absolute values; flag that trend analysis requires prior run data.
  • Ambiguous failure messages: Quote the raw error and ask the user for context before diagnosing root cause.
Repository
OpenRoster-ai/awesome-agents
Last updated
Created

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.