craft-experiment-design

Write a hypothesis, define success metrics, and plan a holdout strategy. Use when designing A/B tests or experiment plans.

Quality

—

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Quality

Content

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

A lean, well-organized instruction skill whose prompt template is concrete and copy-paste ready with a clear single action. Its main weaknesses are minor redundancy and an explanatory intro paragraph that assumes less of Claude's competence than warranted.

Suggestions

Drop the bold summary line that duplicates the description, and tighten the intro paragraph to a single sentence or remove it so every token earns its place.

Consider adding one brief example of a well-formed hypothesis (e.g. 'If we shorten checkout, then conversion rises, because fewer steps reduce drop-off') to make the hypothesis format immediately concrete.

Dimension	Reasoning	Score
Conciseness	The body is mostly efficient, but the bold summary duplicates the description verbatim and the intro paragraph ('You want to run an A/B test but need to get the plan straight first...') is unnecessary explanatory padding that could be trimmed.	2 / 3
Actionability	The prompt template is concrete and copy-paste ready, specifying eight numbered deliverables with a falsifiable-hypothesis format and a fallback for blank input; for an instruction-only skill this meets the anchor-3 actionable bar.	3 / 3
Workflow Clarity	This is a simple single-task skill whose single action (run the prompt template) is unambiguous, and it is not a destructive or batch operation, so the cap-2 validation guideline does not apply.	3 / 3
Progressive Disclosure	Under 50 lines with no need for external references, the content is organized into clear sections (Prompt Template, Tips), satisfying the simple-skill anchor-3 bar for progressive disclosure.	3 / 3
	Total	11 / 12 Passed

Description

85%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A concise, well-structured description that clearly states both capability and trigger conditions with concrete actions and a distinct niche. Its only weakness is limited trigger-term coverage, which could be broadened with a few more natural phrasings users might say.

Suggestions

Broaden trigger terms to include common variations users might say, e.g. 'Use when designing A/B tests, split tests, experiments, or planning a holdout strategy for a launch.'

Dimension	Reasoning	Score
Specificity	Lists three concrete actions ('Write a hypothesis, define success metrics, and plan a holdout strategy'), matching the anchor for multiple specific concrete actions; uses base-verb imperative voice consistent with the good examples rather than first/second person.	3 / 3
Completeness	Explicitly answers what ('Write a hypothesis, define success metrics, and plan a holdout strategy') and when via an explicit 'Use when designing A/B tests or experiment plans' trigger clause, satisfying the anchor-3 'what AND when' bar.	3 / 3
Trigger Term Quality	'A/B tests' and 'experiment plans' are natural terms a user would say, but coverage is thin relative to the anchor-3 example and omits common variations such as 'split test', 'experiment', or 'hypothesis'.	2 / 3
Distinctiveness / Conflict Risk	Experiment design and holdout-strategy planning form a clear niche with distinct triggers ('A/B tests', 'experiment plans', 'holdout strategy') unlikely to collide with other skills.	3 / 3
	Total	11 / 12 Passed

Validation

93%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 15 / 16 Passed

Validation for skill structure

Criteria	Description	Result
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning

	Total	15 / 16 Passed
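
The frontmatter_unknown_keys warning above can be reproduced locally with a small check. What follows is a minimal sketch, not the validator's actual implementation: the allow-list in ALLOWED_KEYS and the `owner` key in the sample are assumptions made for illustration, since this review does not say which key triggered the warning or which keys the spec accepts.

```python
# Hypothetical allow-list: the keys the real validator accepts are not shown
# in this review, so adjust this set to match the spec you validate against.
ALLOWED_KEYS = {"name", "description", "license", "metadata"}


def frontmatter_keys(skill_text: str) -> list[str]:
    """Return the top-level keys of a leading '---' delimited frontmatter block."""
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no frontmatter at the top of the file
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        # Top-level keys start in column 0; skip nested lines, list items, comments.
        if not line or line[0].isspace() or line[0] in "-#" or ":" not in line:
            continue
        keys.append(line.split(":", 1)[0].strip())
    return keys


def unknown_keys(skill_text: str, allowed: set[str] = ALLOWED_KEYS) -> list[str]:
    """List frontmatter keys that are not in the allow-list, in file order."""
    return [key for key in frontmatter_keys(skill_text) if key not in allowed]


# Sample file: 'owner' is an invented key used only to trigger the check.
EXAMPLE = """---
name: craft-experiment-design
description: Write a hypothesis, define success metrics, and plan a holdout strategy. Use when designing A/B tests or experiment plans.
owner: growth-team
---
# Prompt Template
"""

if __name__ == "__main__":
    print(unknown_keys(EXAMPLE))
```

Following the warning's own advice, a flagged key can either be deleted or nested under `metadata`, which this sketch treats as an allowed top-level key.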

Repository: amplitude/builder-skills
Commit: 22b0634

Reviewed: 3 days ago

