When the user wants to plan, design, or implement an A/B test or experiment. Also use when the user mentions "A/B test," "split test," "experiment," "test this change," "variant copy," "multivariate test," or "hypothesis." For tracking implementation, see analytics-tracking.
Install with Tessl CLI
npx tessl i github:coreyhaines31/marketingskills --skill ab-test-setup
90
Does it follow best practices?
If you maintain this skill, you can automatically optimize it using the tessl CLI to improve its score:
npx tessl skill review --optimize ./path/to/skill
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a well-structured skill description with excellent trigger term coverage and clear disambiguation from related skills. The main weakness is that the 'what' portion could be more specific about concrete capabilities (e.g., sample size calculation, statistical analysis, variant creation). The description effectively prioritizes the 'when' clause, which aids skill selection.
Suggestions
Expand the capabilities section with specific concrete actions like 'calculate sample sizes, design test variants, analyze statistical significance, write hypothesis statements'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (A/B testing/experiments) and mentions actions like 'plan, design, or implement,' but doesn't list specific concrete capabilities like 'create test variants, calculate sample sizes, analyze results.' | 2 / 3 |
| Completeness | Explicitly answers both what ('plan, design, or implement an A/B test or experiment') and when ('when the user wants to...' plus explicit trigger terms). Also includes helpful disambiguation pointing to analytics-tracking for related but distinct needs. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would say: 'A/B test,' 'split test,' 'experiment,' 'test this change,' 'variant copy,' 'multivariate test,' 'hypothesis' - these are all terms users naturally use when needing this skill. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clear niche with distinct triggers specific to experimentation. The explicit disambiguation 'For tracking implementation, see analytics-tracking' actively reduces conflict risk with related skills. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation
85%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, actionable skill that provides comprehensive A/B testing guidance with clear workflows and validation checkpoints. The main weakness is some verbosity in explaining concepts Claude already understands (statistical significance basics, what A/B testing is). The tables, checklists, and frameworks are highly practical and efficiently formatted.
Suggestions
Remove the opening persona statement ('You are an expert...') and basic concept explanations that Claude already knows
Trim the 'Statistical Significance' explanation section - Claude understands p-values and confidence intervals
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably efficient but includes some unnecessary framing ('You are an expert...') and explanatory content that Claude already knows (what statistical significance means, basic A/B testing concepts). The tables and quick references are efficient, but the overall document could be tightened. | 2 / 3 |
| Actionability | Provides concrete, actionable guidance throughout: specific hypothesis framework with fill-in structure, sample size tables with exact numbers, implementation checklists, and clear decision matrices. The examples are specific and the checklists are copy-paste ready. | 3 / 3 |
| Workflow Clarity | Clear multi-step workflow with explicit checkpoints: Initial Assessment → Hypothesis → Design → Implementation → Pre-Launch Checklist → Running → Analysis. The pre-launch checklist and analysis checklist provide validation steps, and the 'DO/DON'T' sections during test execution provide clear guardrails. | 3 / 3 |
| Progressive Disclosure | Well-structured with clear sections and appropriate references to external files (references/sample-size-guide.md, references/test-templates.md). Content is organized logically from assessment through analysis, with related skills clearly signaled at the end. | 3 / 3 |
| Total | | 11 / 12 Passed |
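The review mentions the skill's sample size tables and its 'Statistical Significance' section without reproducing them. As a rough illustration of the kind of calculation those cover, here is a minimal sketch, assuming a two-sided two-proportion z-test with the conventional defaults of alpha 0.05 and power 0.80. The function names, parameters, and defaults are hypothetical and are not taken from the skill itself.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant (hypothetical helper).

    baseline:      control conversion rate, e.g. 0.05 for 5%
    relative_lift: minimum detectable effect, relative, e.g. 0.20 for +20%
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test (hypothetical helper)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Assumed scenario: 5% baseline, detect a 20% relative lift.
    print(sample_size_per_variant(0.05, 0.20))
    # Assumed results: 500/10000 conversions in control, 600/10000 in variant.
    print(two_proportion_p_value(500, 10_000, 600, 10_000))
```

This uses only the Python standard library. A dedicated statistics package may apply slightly different approximations, so treat the numbers as ballpark figures rather than a replacement for the skill's own tables.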
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.