Structured guide for setting up A/B tests with mandatory gates for hypothesis, metrics, and execution readiness.
Overall score: 68

- Does it follow best practices? 51%
- Impact: 100% (1.09x average score across 3 eval scenarios)
- Passed: no known issues

Optimize this skill with Tessl: `npx tessl skill review --optimize ./skills/ab-test-setup/SKILL.md`

Quality
Discovery
40%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description identifies a clear niche (A/B test setup) and mentions key structural elements, giving it reasonable distinctiveness. However, it lacks an explicit 'Use when...' clause, which is critical for skill selection, and the specific capabilities could be more concretely enumerated. Adding trigger guidance and common user-facing keywords would significantly improve its effectiveness.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when the user wants to set up an A/B test, design an experiment, or plan a split test.'
Include common keyword variations such as 'experiment', 'split test', 'variant testing', 'conversion tracking', and 'sample size'.
List more concrete actions, e.g., 'Guides users through defining hypotheses, selecting success metrics, calculating sample sizes, and validating execution readiness before launch.'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (A/B testing) and mentions some specific elements (hypothesis, metrics, execution readiness), but doesn't list concrete actions beyond 'setting up'. It describes structure rather than specific capabilities like 'define hypotheses, configure metrics, validate sample sizes'. | 2 / 3 |
| Completeness | Describes what the skill does (structured guide for A/B test setup with mandatory gates) but has no explicit 'Use when...' clause or equivalent trigger guidance. Per the rubric, a missing 'Use when...' clause caps completeness at 2, and the 'what' itself is only moderately clear, so this scores a 1. | 1 / 3 |
| Trigger Term Quality | Includes 'A/B tests', which is a strong natural keyword, plus 'hypothesis' and 'metrics', which users might mention. However, it misses common variations like 'experiment', 'split test', 'variant testing', 'conversion', or 'statistical significance'. | 2 / 3 |
| Distinctiveness / Conflict Risk | A/B testing with mandatory gates for hypothesis, metrics, and execution readiness is a fairly distinct niche. It's unlikely to conflict with other skills given the specific domain of experimentation setup with structured checkpoints. | 3 / 3 |
| Total | | 8 / 12 Passed |
Implementation
62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured process guide with strong workflow clarity, featuring explicit gates and refusal conditions that make the A/B test setup process safe and rigorous. Its main weaknesses are the lack of concrete executable examples (no formulas, templates, or filled-out examples) and some unnecessary philosophical framing that consumes tokens without adding actionable value. The content would benefit from concrete artifacts like a sample hypothesis statement, a sample size calculation, and references to supplementary files.
Suggestions
Add a concrete, filled-out example of a complete hypothesis (e.g., 'Observation: Cart abandonment is 68%. Change: Add progress indicator. Expectation: Reduce abandonment by 5pp. Audience: Mobile users. Metric: Cart completion rate.').
Include an executable sample size calculation formula or code snippet (e.g., Python using statsmodels) rather than just listing the inputs needed.
Remove the 'Final Reminder' section and trim motivational language—Claude doesn't need encouragement to follow instructions.
Extract detailed sections (Metrics Definition, Analysis Discipline, Documentation template) into referenced supplementary files to improve progressive disclosure.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably structured but includes motivational/philosophical content ('A/B testing is not about proving ideas right...') and explanations of concepts Claude already knows (what A/B, A/B/n, MVT tests are). The 'Final Reminder' and some framing text could be cut without losing actionability. | 2 / 3 |
| Actionability | The skill provides clear checklists and gates, which are actionable as process guidance. However, it lacks concrete executable examples: no sample size calculation code or formulas, no specific analytics tool commands, no filled-out hypothesis template. It describes what to do but rarely shows how, concretely. | 2 / 3 |
| Workflow Clarity | The multi-step process is clearly sequenced with numbered stages, explicit hard gates (Hypothesis Lock, Execution Readiness Gate), and clear refusal/stop conditions. Validation checkpoints are well-defined with feedback loops (e.g., 'If assumptions are weak → warn and recommend delaying'). | 3 / 3 |
| Progressive Disclosure | The content is well-organized with clear section headers, but it's a monolithic document with no references to external files for detailed topics (e.g., sample size calculation methods, example test records, metric selection guides). Some sections, like Metrics Definition and Analysis, could be split out. | 2 / 3 |
| Total | | 9 / 12 Passed |
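The analysis-stage checkpoints the review praises could be backed by an equally concrete significance check. A hedged sketch using statsmodels' two-proportion z-test, with made-up conversion counts for illustration:

```python
# Sketch of a post-test significance check for two variants.
# Conversion counts are made up for illustration.
from statsmodels.stats.proportion import proportions_ztest

conversions = [320, 370]     # control, variant
observations = [1000, 1000]  # visitors per variant

# Two-sided z-test for equality of the two conversion rates
z_stat, p_value = proportions_ztest(count=conversions, nobs=observations)

if p_value < 0.05:
    print(f"Significant difference (p = {p_value:.4f})")
else:
    print(f"No significant difference (p = {p_value:.4f})")
```

A snippet like this, placed alongside the skill's Analysis Discipline section, would turn its stop conditions into something an agent can actually run.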
Validation
90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 Passed |