Designs and tracks scientific experiments, A/B tests, and feature rollouts for product and engineering teams. Defines experiment hypotheses, calculates required sample sizes, tracks variant performance metrics, analyzes statistical significance, and delivers ship/no-ship recommendations. Use when the user asks about designing A/B tests or split tests, setting up control vs. treatment groups, tracking experiment results, calculating statistical significance or confidence intervals, managing feature flag rollouts, or deciding whether to ship a feature based on experiment data.