
ab-test-setup

Structured guide for setting up A/B tests with mandatory gates for hypothesis, metrics, and execution readiness.

53

Quality

41%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./.agent/skills/ab-test-setup/SKILL.md

Quality

Discovery

32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies a clear domain (A/B testing) and hints at a structured approach with gates, but lacks explicit trigger guidance for when to use the skill. It would benefit from more specific action verbs and natural keyword variations to improve discoverability and reduce ambiguity.

Suggestions

Add a 'Use when...' clause with trigger terms like 'A/B test', 'split test', 'experiment setup', 'test hypothesis', or 'conversion experiment'.

Include common keyword variations such as 'split testing', 'experiment', 'variant testing', 'multivariate' to improve trigger term coverage.

List more specific concrete actions like 'define hypotheses', 'select metrics', 'calculate sample size', 'validate experiment readiness' to clarify capabilities.
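The suggestions above can be checked mechanically. The following is a minimal sketch, not the reviewer's actual scoring logic: `review_description` and the `TRIGGER_TERMS` list are hypothetical names, and the revised description is an illustrative rewrite assembled from the terms suggested above.

```python
import re

# Hypothetical trigger terms, taken from the suggestions above.
TRIGGER_TERMS = ["a/b test", "split test", "experiment", "variant testing", "multivariate"]


def review_description(description: str) -> dict:
    """Report whether a description has a 'Use when' clause and which trigger terms it covers."""
    text = description.lower()
    has_use_when = re.search(r"\buse when\b", text) is not None
    found = [term for term in TRIGGER_TERMS if term in text]
    missing = [term for term in TRIGGER_TERMS if term not in text]
    return {"has_use_when": has_use_when, "found": found, "missing": missing}


# The description as currently published.
current = (
    "Structured guide for setting up A/B tests with mandatory gates "
    "for hypothesis, metrics, and execution readiness."
)

# An illustrative rewrite (an assumption, not the maintainer's wording).
revised = (
    "Set up A/B tests: define hypotheses, select metrics, calculate sample size, "
    "and validate experiment readiness. Use when the user asks for an A/B test, "
    "split test, experiment setup, variant testing, or a multivariate test."
)

current_report = review_description(current)
revised_report = review_description(revised)
print(current_report)
print(revised_report)
```

Under these assumptions the current description matches only one trigger term and has no 'Use when' clause, while the rewrite covers all five terms.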

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names the domain (A/B tests) and mentions some actions (setting up, gates for hypothesis, metrics, execution readiness), but doesn't list comprehensive concrete actions like 'define control groups', 'calculate sample sizes', or 'analyze results'. | 2 / 3 |
| Completeness | Describes what it does (structured guide for A/B test setup with gates) but completely lacks a 'Use when...' clause or any explicit trigger guidance for when Claude should select this skill. | 1 / 3 |
| Trigger Term Quality | Includes 'A/B tests' which is a natural term users would say, but misses common variations like 'split testing', 'experiment', 'variant testing', 'conversion testing', or 'multivariate test'. | 2 / 3 |
| Distinctiveness Conflict Risk | The focus on A/B tests with mandatory gates provides some distinctiveness, but 'hypothesis' and 'metrics' are generic terms that could overlap with general experiment planning or analytics skills. | 2 / 3 |
| Total | | 7 / 12 (Passed) |
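As a small worked example, the Discovery total is the sum of the four dimension scores, each out of 3. The sketch below only reproduces the 7 / 12 figure and picks out the weakest dimension; how the headline percentage is derived from these scores is not stated on this page, so it is not modeled here.

```python
# Dimension scores as listed in the Discovery table; each is out of 3.
discovery_scores = {
    "Specificity": 2,
    "Completeness": 1,
    "Trigger Term Quality": 2,
    "Distinctiveness Conflict Risk": 2,
}
MAX_PER_DIMENSION = 3

total = sum(discovery_scores.values())
maximum = MAX_PER_DIMENSION * len(discovery_scores)

# The lowest-scoring dimension is the most useful place to start improving.
weakest = min(discovery_scores, key=discovery_scores.get)

print(f"{total} / {maximum}, weakest dimension: {weakest}")
# prints "7 / 12, weakest dimension: Completeness"
```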

Implementation

50%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is well structured as a navigation hub but fails as standalone documentation. It is concise to the point of being hollow: it provides no actionable content, no examples, and no summary of what the gates actually check. The numbering implies a workflow, but the main file offers no value unless the reader opens all 11 sub-skills.

Suggestions

Add a brief summary of what each gate checks (e.g., 'Hypothesis Lock: requires falsifiable statement + primary metric defined') so users understand the workflow without opening every sub-file.

Include at least one concrete example of a valid vs. invalid hypothesis or test setup in the main skill.

Add a quick-reference checklist or decision tree that summarizes the entire process in the main file.

Specify what 'Hard Gate' means operationally: what happens if the gate fails, and which criteria must pass.
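One way to make a 'Hard Gate' operational is to treat it as a check that returns a list of unmet criteria and stops the workflow when that list is not empty. The sketch below is an assumption about how such a gate could behave, not the skill's actual implementation: `Hypothesis`, `hypothesis_lock`, `enforce_hard_gate`, and `GateFailed` are hypothetical names, and the two criteria come from the 'Hypothesis Lock' example in the suggestions above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    statement: str
    primary_metric: str = ""
    expected_direction: str = ""  # "increase" or "decrease"


class GateFailed(Exception):
    """Raised when a hard gate blocks the workflow."""


def hypothesis_lock(h: Hypothesis) -> List[str]:
    """Return the unmet criteria; an empty list means the gate passes."""
    problems = []
    if not h.primary_metric.strip():
        problems.append("primary metric is not defined")
    # A stated direction is used here as a simple proxy for falsifiability.
    if h.expected_direction not in ("increase", "decrease"):
        problems.append("no falsifiable direction ('increase' or 'decrease')")
    return problems


def enforce_hard_gate(name: str, problems: List[str]) -> None:
    """Stop the workflow if any criterion is unmet."""
    if problems:
        raise GateFailed(f"{name} failed: " + "; ".join(problems))


# Invalid: vague, no metric, nothing that data could contradict.
invalid = Hypothesis("The new checkout page will be better.")

# Valid: names a metric and a direction that an experiment could refute.
valid = Hypothesis(
    "Shortening the checkout form increases checkout completion rate.",
    primary_metric="checkout completion rate",
    expected_direction="increase",
)

enforce_hard_gate("Hypothesis Lock", hypothesis_lock(valid))  # passes silently
try:
    enforce_hard_gate("Hypothesis Lock", hypothesis_lock(invalid))
except GateFailed as err:
    print(err)
```

Returning the unmet criteria rather than a bare boolean lets the same function feed a quick-reference checklist as well as the blocking gate.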

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is extremely lean - no unnecessary explanations of what A/B testing is or how statistics work. Every line serves a purpose: scope, prerequisites, and navigation to sub-skills. | 3 / 3 |
| Actionability | The skill provides no concrete guidance, code, commands, or examples. It's entirely a table of contents with vague bullet points like 'Prevents peeking' without explaining how or what to do. | 1 / 3 |
| Workflow Clarity | The numbered sub-skills suggest a sequence (1-11), and 'Hard Gate' labels indicate checkpoints. However, the actual workflow steps, validation criteria, and feedback loops are entirely delegated to sub-files with no summary of what each gate requires. | 2 / 3 |
| Progressive Disclosure | References are one level deep and clearly linked, which is good. However, the main skill is essentially empty - it's all navigation with no substantive overview content. Users cannot understand the process without clicking through 11 separate files. | 2 / 3 |
| Total | | 8 / 12 (Passed) |

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 (Passed) |
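A check like `frontmatter_unknown_keys` can be approximated by comparing the top-level frontmatter keys of SKILL.md against an allow-list. The sketch below is illustrative only: the `ALLOWED_KEYS` set is an assumption (the spec's actual key list is not given on this page), the `risk` key in the sample is made up, and the parser handles only simple `key: value` frontmatter rather than full YAML.

```python
# Hypothetical allow-list; the real spec's key set is not given on this page.
ALLOWED_KEYS = {"name", "description", "metadata"}


def top_level_keys(skill_md: str) -> list:
    """Collect top-level keys from a leading '---' delimited frontmatter block."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        # Top-level keys start at column 0; indented lines belong to nested values.
        if line and not line[0].isspace() and not line.startswith("#") and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return keys


def unknown_keys(skill_md: str) -> list:
    """Return the top-level keys that are not in the allow-list."""
    return [key for key in top_level_keys(skill_md) if key not in ALLOWED_KEYS]


# Made-up frontmatter for illustration; 'risk' is a hypothetical unknown key.
sample = """---
name: ab-test-setup
description: Structured guide for setting up A/B tests.
risk: low
metadata:
  owner: growth
---
# Body
"""

print(unknown_keys(sample))  # prints ['risk']
```

Moving an unknown key under `metadata`, as the warning suggests, would make it a nested value that this check no longer flags.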

Repository: Dokhacgiakhoa/antigravity-ide (Reviewed)
