Flaky Test Detector: an auto-activating skill for Test Automation. Triggers on: "flaky test detector". Part of the Test Automation skill category.
Install with the Tessl CLI:

    npx tessl i github:jeremylongshore/claude-code-plugins-plus-skills --skill flaky-test-detector

Overall score: 19%
Does it follow best practices?
Validation for skill structure
Activation
Score: 7%

This description is severely underdeveloped, essentially just restating the skill name without explaining what the skill actually does or when it should be used. It lacks any concrete actions, meaningful trigger terms, or explicit usage guidance. The description would be nearly useless for Claude to distinguish this skill from others in a large skill library.
Suggestions
Add specific actions the skill performs, e.g., 'Identifies flaky tests by analyzing test history, detects intermittent failures, suggests fixes for race conditions and timing issues'
Replace the redundant trigger terms with natural phrases users would say: 'test sometimes passes sometimes fails', 'intermittent test failure', 'random test failures', 'unstable tests', 'non-deterministic tests'
Add an explicit 'Use when...' clause describing scenarios: 'Use when tests fail inconsistently, when debugging intermittent CI failures, or when the user mentions tests that pass locally but fail in CI'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description only names the skill ('Flaky Test Detector') without describing any concrete actions. There are no verbs indicating what the skill actually does: no mention of detecting, analyzing, identifying, or fixing flaky tests. | 1 / 3 |
| Completeness | The description fails to answer 'what does this do' (no actions described) and 'when should Claude use it' (no explicit use cases or scenarios). The 'Triggers on' line just repeats the skill name rather than providing meaningful trigger guidance. | 1 / 3 |
| Trigger Term Quality | The trigger terms are just 'flaky test detector' repeated twice, which is the skill name itself rather than natural phrases users would say. Missing natural terms like 'intermittent failures', 'random test failures', 'unstable tests', 'test sometimes fails'. | 1 / 3 |
| Distinctiveness / Conflict Risk | The term 'flaky test' is somewhat specific to a particular testing problem, which provides some distinctiveness. However, the lack of detail about what this skill does versus other test-related skills creates potential overlap risk. | 2 / 3 |
| Total | | 5 / 12 |
Implementation
Score: 0%

This skill content is a generic template with no actual substance about flaky test detection. It contains zero actionable information: no techniques for identifying flaky tests (timing issues, race conditions, external dependencies), no code examples for retry logic or test isolation, and no workflow for diagnosing and fixing flaky tests. The content could apply to literally any skill by swapping the name.
Suggestions
Add concrete techniques for identifying flaky tests: retry detection, timing analysis, isolation testing, and specific code examples for common frameworks (Jest, pytest)
Include a workflow for diagnosing flaky tests: 1) Run test N times, 2) Analyze failure patterns, 3) Check for common causes (async timing, shared state, external dependencies), 4) Apply specific fixes
Provide executable code examples showing how to implement retry logic, mock time-dependent operations, and isolate tests from external dependencies
Remove all generic boilerplate ('provides automated assistance', 'follows industry best practices') and replace with specific, actionable content about flaky test patterns
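The run-N-times workflow suggested above can be sketched in plain Python. This is an illustrative sketch only, not part of the skill being reviewed: `detect_flakiness` and `flaky_test` are hypothetical names, and the simulated failure uses random noise to stand in for a real timing or race-condition bug.

```python
import random
from collections import Counter

def detect_flakiness(test_fn, runs=20):
    """Run a zero-argument test callable repeatedly and tally outcomes.

    A mix of 'pass' and 'fail' counts suggests the test is flaky
    (non-deterministic) rather than simply broken.
    """
    results = Counter()
    for _ in range(runs):
        try:
            test_fn()
        except AssertionError:
            results["fail"] += 1
        else:
            results["pass"] += 1
    return results

# Hypothetical flaky test: fails intermittently due to simulated noise
# standing in for an async timing or shared-state issue.
def flaky_test():
    assert random.random() > 0.3

random.seed(42)  # seeded so the demo is reproducible
summary = detect_flakiness(flaky_test, runs=50)
is_flaky = 0 < summary["fail"] < 50  # some runs pass AND some fail
```

In a real framework the same idea is usually delegated to a plugin (e.g. rerunning failed tests and flagging those that pass on retry) rather than hand-rolled, but the pattern of repeated execution plus outcome tallying is the core of any flaky-test detector.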
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is padded with generic boilerplate that explains nothing specific about flaky test detection. Phrases like 'provides automated assistance' and 'follows industry best practices' are filler that Claude doesn't need. | 1 / 3 |
| Actionability | No concrete code, commands, or specific techniques for detecting flaky tests are provided. The content describes what the skill does abstractly but gives zero executable guidance on how to actually identify or fix flaky tests. | 1 / 3 |
| Workflow Clarity | No workflow is defined. There are no steps for detecting flaky tests, no validation checkpoints, and no process for analyzing test results or implementing fixes. | 1 / 3 |
| Progressive Disclosure | The content is a monolithic block of generic text with no references to detailed documentation, examples, or related files. There is no structure that would help Claude navigate to more specific information. | 1 / 3 |
| Total | | 4 / 12 |
Validation
Score: 69% (11 / 16 checks passed)
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| description_trigger_hint | Description may be missing an explicit 'when to use' trigger hint (e.g., 'Use when...') | Warning |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| metadata_version | 'metadata' field is not a dictionary | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| body_steps | No step-by-step structure detected (no ordered list); consider adding a simple workflow | Warning |
| Total | | 11 / 16 Passed |