
flaky-test-detector

Flaky Test Detector - Auto-activating skill for Test Automation. Triggers on: flaky test detector, flaky test detector Part of the Test Automation skill category.

35

Quality: 3%
Does it follow best practices?

Impact: 91% (0.98x)
Average score across 3 eval scenarios

Security (by Snyk): Passed. No known issues.

Optimize this skill with Tessl

npx tessl skill review --optimize ./planned-skills/generated/09-test-automation/flaky-test-detector/SKILL.md

Quality

Discovery: 7%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is essentially a label with no substantive content. It names the skill and its category but provides zero information about what concrete actions it performs, what inputs it expects, or when Claude should select it. The trigger terms are redundant and miss the many natural ways users might describe flaky test issues.

Suggestions

- Add concrete action verbs describing what the skill does, e.g., 'Detects intermittently failing tests by analyzing test run history, identifies root causes of non-deterministic behavior, and suggests fixes or quarantine strategies.'
- Add an explicit 'Use when...' clause with diverse trigger scenarios, e.g., 'Use when the user mentions flaky tests, intermittent failures, non-deterministic test results, randomly failing CI builds, or unreliable test suites.'
- Remove the duplicated trigger term and expand with natural keyword variations users would actually say, such as 'test flakiness', 'unstable tests', 'tests passing sometimes', 'CI failures', or 'randomly broken tests'.
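Taken together, these suggestions might yield frontmatter along the following lines. This is an illustrative sketch, not the maintainer's actual fix; the exact frontmatter schema depends on the skill spec.

```yaml
---
name: flaky-test-detector
description: >
  Detects intermittently failing tests by analyzing test run history,
  identifies root causes of non-deterministic behavior (timing, shared
  state, external dependencies), and suggests fixes or quarantine
  strategies. Use when the user mentions flaky tests, intermittent
  failures, non-deterministic test results, randomly failing CI builds,
  test flakiness, unstable tests, or unreliable test suites.
---
```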

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | The description names the domain ('Test Automation') and the concept ('Flaky Test Detector') but provides no concrete actions. There is no indication of what the skill actually does: no verbs describing capabilities like 'identifies', 'reruns', 'analyzes', or 'reports'. | 1 / 3 |
| Completeness | The description fails to answer 'what does this do' beyond naming itself, and the 'when' clause is essentially absent; there is no explicit 'Use when...' guidance, only a redundant trigger phrase. Both dimensions are very weak. | 1 / 3 |
| Trigger Term Quality | The trigger terms listed are just 'flaky test detector' repeated twice. There are no natural variations a user might say, such as 'intermittent test failures', 'unreliable tests', 'test flakiness', 'non-deterministic tests', or 'randomly failing tests'. | 1 / 3 |
| Distinctiveness / Conflict Risk | The term 'flaky test' is somewhat niche and unlikely to conflict with many other skills, giving it some distinctiveness. However, the lack of specific actions or clear scope means it could still overlap with general test analysis or debugging skills. | 2 / 3 |
| Total | | 5 / 12 |

Passed

Implementation: 0%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is essentially a placeholder with no substantive content. It repeatedly references 'flaky test detector' without ever explaining what flaky tests are, how to detect them, or what to do about them. There is no actionable guidance, no code, no workflow, and no references to deeper materials; it is entirely meta-description.

Suggestions

- Add concrete, executable code examples showing how to detect flaky tests (e.g., running a test suite N times, parsing results for inconsistent pass/fail patterns, using pytest-repeat or Jest repeat flags).
- Define a clear multi-step workflow: identify flaky tests → diagnose root causes (timing, shared state, external dependencies) → apply fixes → validate stability, with explicit validation checkpoints.
- Remove all meta-description sections (Purpose, When to Use, Capabilities, Example Triggers) and replace with actual technical content: specific patterns, commands, and strategies for flaky test detection and remediation.
- Add references to detailed guides for specific frameworks (e.g., a Jest flaky test guide, a pytest flaky test guide) to provide progressive disclosure.
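The core detection idea in the first suggestion can be sketched in framework-agnostic form: run a test repeatedly and flag it as flaky when the outcomes disagree. This is a minimal illustration, not the skill's actual implementation; `is_flaky` and the default run count are hypothetical names, and real tooling such as pytest-repeat's `--count` flag operates on whole suites rather than single callables.

```python
def is_flaky(test_fn, runs=20):
    """Run test_fn `runs` times; a test is flaky if it both passes and fails."""
    outcomes = set()
    for _ in range(runs):
        try:
            test_fn()
            outcomes.add("pass")
        except AssertionError:
            outcomes.add("fail")
        if len(outcomes) > 1:  # seen both outcomes: stop early, it's flaky
            return True
    return False
```

Note that a consistently failing test is not flaky (it is simply broken), which is why the check requires both outcomes to appear, not just a failure.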

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is entirely filler and meta-description. It explains what the skill does in abstract terms without providing any actual technical content. Every section restates the same vague idea, that this skill helps with 'flaky test detector', without adding substance. | 1 / 3 |
| Actionability | There is zero concrete guidance: no code, no commands, no specific techniques for detecting or handling flaky tests. The content describes rather than instructs, offering only vague promises like 'provides step-by-step guidance' without actually delivering any. | 1 / 3 |
| Workflow Clarity | No workflow, steps, or process is defined. There are no sequences, no validation checkpoints, and no actual procedure for detecting or resolving flaky tests. | 1 / 3 |
| Progressive Disclosure | The content is a monolithic block of meta-description with no references to detailed materials, no links to examples or advanced guides, and no meaningful structural organization of actual content. | 1 / 3 |
| Total | | 4 / 12 |

Passed

Validation: 81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 |

Passed
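Both warnings point at the frontmatter. A cleaned-up version might look like the following sketch; the exact set of recognized keys and valid tool names depends on the skill spec, and the tool names and `metadata` key shown here are assumed examples rather than confirmed values from this repository.

```yaml
---
name: flaky-test-detector
description: Detects intermittently failing tests and suggests fixes.
allowed-tools: Read, Grep, Bash  # keep only tool names the spec recognizes
metadata:
  category: test-automation      # move unknown top-level keys under metadata
---
```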

Repository: jeremylongshore/claude-code-plugins-plus-skills (Reviewed)
