
ab-test-setup

Structured guide for setting up A/B tests with mandatory gates for hypothesis, metrics, and execution readiness.

Install with Tessl CLI

npx tessl i github:boisenoise/skills-collections --skill ab-test-setup

56

Does it follow best practices?

Validation for skill structure


Discovery

32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies a clear domain (A/B testing) and hints at a structured process with mandatory checkpoints, but lacks explicit trigger guidance and comprehensive action details. The absence of a 'Use when...' clause significantly limits Claude's ability to know when to select this skill, and the description would benefit from more natural user-facing keywords.

Suggestions

Add a 'Use when...' clause with trigger terms like 'A/B test', 'split test', 'experiment setup', 'hypothesis validation', or 'test metrics'

List specific concrete actions such as 'define hypothesis, select metrics, calculate sample size, set success criteria, document test plan'

Include common variations users might say: 'split testing', 'experimentation', 'variant testing', 'conversion experiment'
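The suggestions above can be turned into a quick self-check. The sketch below is a rough heuristic, not the reviewer's actual scoring logic: the term list is drawn from the suggested trigger terms, and the `revised` description is a hypothetical rewrite written only for illustration.

```python
import re

# Trigger terms drawn from the suggestions above. The check is an
# illustrative heuristic, not the scoring used by the review itself.
TRIGGER_TERMS = [
    "a/b test",
    "split test",
    "experiment",
    "hypothesis",
    "variant testing",
    "conversion",
]


def audit_description(description: str) -> dict:
    """Report whether a description has a 'Use when' clause and which terms it hits."""
    text = description.lower()
    return {
        "has_use_when": bool(re.search(r"\buse when\b", text)),
        "matched_terms": [term for term in TRIGGER_TERMS if term in text],
    }


# The description as published on this page.
current = (
    "Structured guide for setting up A/B tests with mandatory gates "
    "for hypothesis, metrics, and execution readiness."
)

# Hypothetical rewrite that follows the three suggestions.
revised = (
    "Define a hypothesis, select metrics, calculate sample size, set success "
    "criteria, and document the test plan for an A/B test. Use when the user "
    "mentions an A/B test, split test, experiment setup, or conversion experiment."
)

if __name__ == "__main__":
    print(audit_description(current))
    print(audit_description(revised))
```

Substring matching is deliberately crude here; it is only meant to show how much more surface area the revised wording gives an agent to match against.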

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names the domain (A/B tests) and mentions some actions (setting up, gates for hypothesis, metrics, execution readiness), but doesn't list multiple concrete actions like 'define control groups, calculate sample sizes, track conversion rates'. | 2 / 3 |
| Completeness | Describes what it does (structured guide for A/B test setup with gates) but completely lacks a 'Use when...' clause or any explicit trigger guidance for when Claude should select this skill. | 1 / 3 |
| Trigger Term Quality | Includes 'A/B tests' which is a natural term users would say, but misses common variations like 'split testing', 'experiment', 'variant testing', or 'conversion optimization'. | 2 / 3 |
| Distinctiveness Conflict Risk | A/B testing is a reasonably specific niche, but 'structured guide' and 'gates' are vague enough that it could overlap with other process/workflow skills or general experimentation guides. | 2 / 3 |
| Total | Passed | 7 / 12 |

Implementation

62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides a well-structured, gate-based workflow for A/B test setup with strong validation checkpoints and clear sequencing. However, it lacks concrete examples (sample hypotheses, calculation formulas, specific tool integrations) that would make it immediately actionable. The content is moderately concise, but it would benefit from trimming philosophical statements and splitting detailed sections into referenced files.

Suggestions

Add a concrete example hypothesis that demonstrates all required components (observation, change, direction, audience, MDE)

Include a sample size calculation formula or reference to a specific calculator tool with example inputs/outputs

Split detailed sections (Metrics Definition, Analysis Discipline) into separate referenced files to improve progressive disclosure

Remove or minimize philosophical statements like 'Final Reminder' section that don't add actionable guidance
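As an illustration of the second suggestion, here is a minimal sketch of a per-variant sample size calculation for a conversion-rate test. It assumes the standard normal-approximation formula for comparing two proportions with a two-sided test; the function name, the parameter names, and the default `alpha` and `power` values are choices made for this example and do not come from the skill itself.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(
    baseline_rate: float,
    relative_mde: float,
    alpha: float = 0.05,
    power: float = 0.80,
) -> int:
    """Approximate users needed per variant for a two-sided two-proportion test.

    baseline_rate: control conversion rate, e.g. 0.10 for 10%
    relative_mde:  minimum detectable effect relative to baseline, e.g. 0.10 for +10%
    """
    if relative_mde == 0:
        raise ValueError("relative_mde must be non-zero")
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("conversion rates must lie strictly between 0 and 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power

    # Sum of the Bernoulli variances of the two arms.
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Example inputs: 10% baseline conversion, +10% relative lift.
    print(sample_size_per_variant(baseline_rate=0.10, relative_mde=0.10))
```

Smaller effects or higher power push the required sample up quickly, which is why the hypothesis needs a stated MDE before a test duration can be estimated.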

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is reasonably efficient but includes some unnecessary philosophical statements ('A/B testing is not about proving ideas right') and redundant emphasis. The checklists are useful but could be tighter in places. | 2 / 3 |
| Actionability | Provides clear checklists and decision criteria, but lacks concrete examples of hypotheses, sample size calculations, or specific tool commands. The guidance is procedural but not executable: no code, formulas, or copy-paste ready artifacts. | 2 / 3 |
| Workflow Clarity | Excellent multi-step workflow with explicit hard gates ('Do NOT proceed until confirmed'), clear sequencing from hypothesis through execution, and validation checkpoints at each stage. The 'Execution Readiness Gate' is a strong example of a validation checkpoint. | 3 / 3 |
| Progressive Disclosure | Content is well-structured with clear sections and headers, but everything is in a single monolithic file. For a skill of this length (~150 lines), some content like the detailed metrics definitions or analysis discipline could be split into referenced files. | 2 / 3 |
| Total | Passed | 9 / 12 |

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | Passed | 10 / 11 |
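A check like `frontmatter_unknown_keys` can be approximated locally before publishing. The sketch below is not the validator's implementation: the `ALLOWED_KEYS` set is an assumption about which top-level keys are recognized, and the `owner` key in the sample is hypothetical, since the review does not say which key triggered the warning.

```python
import re

# Assumed set of recognized top-level frontmatter keys; confirm against the
# skill spec you are targeting before relying on it.
ALLOWED_KEYS = {"name", "description", "license", "allowed-tools", "metadata"}


def top_level_frontmatter_keys(skill_md: str) -> list[str]:
    """Return top-level keys from a leading '---' delimited frontmatter block."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Only unindented 'key:' lines count; nested keys are indented.
        match = re.match(r"^([A-Za-z0-9_-]+)\s*:", line)
        if match:
            keys.append(match.group(1))
    return keys


def unknown_keys(skill_md: str, allowed: set = ALLOWED_KEYS) -> list[str]:
    """List top-level frontmatter keys that are not in the allowed set."""
    return [key for key in top_level_frontmatter_keys(skill_md) if key not in allowed]


# Hypothetical SKILL.md content; 'owner' is an invented unknown key.
sample = """---
name: ab-test-setup
description: Structured guide for setting up A/B tests.
owner: growth-team
metadata:
  version: 1
---
# A/B Test Setup
"""

if __name__ == "__main__":
    print(unknown_keys(sample))
```

Following the warning's own advice, a key such as the hypothetical `owner` would either be deleted or nested under `metadata`, where this sketch ignores it because it is indented.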
