Identifies non-deterministic or unreliable tests through static code analysis and test result analysis. Use when Claude needs to find flaky tests, analyze test reliability, or investigate intermittent test failures. Supports Python (pytest, unittest) and Java (JUnit, TestNG) test frameworks. Trigger when users mention "flaky tests", "intermittent failures", "non-deterministic tests", "unreliable tests", or ask to "find flaky tests", "analyze test stability", or "why tests fail randomly".
Install with Tessl CLI
npx tessl i github:ArabelaTso/Skills-4-SE --skill flaky-test-detector
94
Test result analysis workflow

| Criterion | Without context | With context |
| --- | --- | --- |
| Script invocation | 80% | 100% |
| analysis_output.txt created | 100% | 100% |
| Report summary section | 100% | 100% |
| High priority section | 100% | 100% |
| Category grouping | 100% | 100% |
| Flakiness scores reported | 70% | 100% |
| Pass/fail pattern reported | 100% | 100% |
| Pass rate reported | 87% | 100% |
| Alternating pattern noted | 100% | 100% |
| Duration variance noted | 100% | 100% |
| Specific test details | 100% | 100% |

Without context: $0.3675 · 1m 46s · 15 turns · 22 in / 6,621 out tokens
With context: $0.5926 · 1m 52s · 23 turns · 5,592 in / 7,161 out tokens
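
The criteria in this scenario (pass rate, alternating pattern, duration variance) can be illustrated with a small sketch. This is not the skill's own script or scoring formula, which this page does not show. The function names, the input shape (a list of booleans and a list of durations per test), and the "is_flaky" rule are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def pass_rate(outcomes):
    """Fraction of runs that passed; outcomes is a non-empty list of booleans."""
    return sum(outcomes) / len(outcomes)


def flip_rate(outcomes):
    """Fraction of consecutive run pairs whose outcome changed.

    A value near 1.0 suggests an alternating pass/fail pattern.
    """
    if len(outcomes) < 2:
        return 0.0
    flips = sum(a != b for a, b in zip(outcomes, outcomes[1:]))
    return flips / (len(outcomes) - 1)


def duration_cv(durations):
    """Coefficient of variation of run durations (0.0 when the mean is 0)."""
    avg = mean(durations)
    return pstdev(durations) / avg if avg else 0.0


def summarize(outcomes, durations):
    """Collect the hypothetical metrics for one test's run history."""
    rate = pass_rate(outcomes)
    return {
        "pass_rate": rate,
        "flip_rate": flip_rate(outcomes),
        "duration_cv": duration_cv(durations),
        # Assumed rule: a test that both passes and fails on the same code is suspect.
        "is_flaky": 0.0 < rate < 1.0,
    }
```

Calling `summarize([True, False, True, False], [1.0, 1.0, 1.0, 1.0])` flags the test as suspect with a flip rate of 1.0, while a history of only passes is not flagged. A real analysis would likely need more runs per test than this before drawing conclusions.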
Python static analysis and remediation

| Criterion | Without context | With context |
| --- | --- | --- |
| Report structure per issue | 100% | 100% |
| Line numbers included | 100% | 100% |
| Issues grouped by category | 100% | 100% |
| Shared state identified | 100% | 100% |
| Fixed: setUp/fixture used | 87% | 100% |
| Fixed: tmp_path or tempfile | 100% | 100% |
| Fixed: time mock | 100% | 100% |
| Fixed: network mock | 100% | 100% |
| Fixed: sleep removed | 100% | 100% |
| Fixed: random seeded | 100% | 100% |
| Confidence or priority noted | 100% | 100% |
| File cleanup issue noted | 100% | 100% |

Without context: $0.4350 · 2m 5s · 18 turns · 25 in / 7,425 out tokens
With context: $0.7972 · 3m 12s · 25 turns · 30 in / 10,966 out tokens
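
Several of the "Fixed" criteria above correspond to well-known remediation patterns. The sketch below shows what three of them (per-test setup, a seeded random generator, a mocked clock, and an isolated temporary directory) might look like with the standard library alone. The functions `make_token` and `is_expired` are hypothetical code under test invented for this example. The eval's actual fixtures are not shown on this page.

```python
import random
import tempfile
import time
import unittest
from pathlib import Path
from unittest import mock


def make_token(rng):
    """Hypothetical code under test: draws a 4-digit token from the given RNG."""
    return rng.randint(1000, 9999)


def is_expired(deadline):
    """Hypothetical code under test: compares a deadline with the wall clock."""
    return time.time() > deadline


class DeterministicTests(unittest.TestCase):
    def setUp(self):
        # Fresh per-test state instead of module-level shared state.
        self.rng = random.Random(1234)
        # Isolated directory that is removed even if the test fails.
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.workdir = Path(self._tmp.name)

    def test_token_is_reproducible(self):
        # Two generators with the same seed produce the same first draw.
        self.assertEqual(make_token(self.rng), make_token(random.Random(1234)))

    def test_expiry_uses_mocked_clock(self):
        # Freeze the clock rather than sleeping until a deadline passes.
        with mock.patch("time.time", return_value=1_000.0):
            self.assertTrue(is_expired(999.0))
            self.assertFalse(is_expired(1_001.0))

    def test_file_written_in_isolated_dir(self):
        target = self.workdir / "out.txt"
        target.write_text("ok")
        self.assertEqual(target.read_text(), "ok")
```

In a pytest suite the same ideas would typically use the built-in `tmp_path` and `monkeypatch` fixtures in place of `tempfile` and `mock.patch`.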
Java test flakiness detection and fix

| Criterion | Without context | With context |
| --- | --- | --- |
| Report structure per issue | 100% | 100% |
| Line numbers in report | 100% | 100% |
| Issues grouped by category | 100% | 100% |
| Fixed: @Before setup | 100% | 100% |
| Fixed: @After teardown | 50% | 100% |
| Fixed: Thread.sleep removed | 100% | 100% |
| Fixed: Clock/time mock | 0% | 100% |
| Fixed: resource management | 100% | 100% |
| Fixed: race condition | 100% | 100% |
| Fixed: network mock | 100% | 100% |
| Confidence or severity noted | 100% | 0% |
| Shared static state identified | 100% | 100% |

Without context: $0.3171 · 1m 55s · 13 turns · 20 in / 6,739 out tokens
With context: $0.7189 · 2m 47s · 25 turns · 4,222 in / 9,488 out tokens