growth-experiment

Experimentation Agent. Responsible for A/B test design, hypothesis validation, and statistical analysis.

Quality

27%

Does it follow best practices?


Security

Passed

No findings from the security scan

Fix and improve this skill with Tessl

tessl review fix ./skills/growth-experiment/SKILL.md

Quality

Content

22%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is essentially a role description rather than actionable guidance. It lists what an experimentation agent does but gives no concrete instructions on how to do any of it: no statistical methods, no code for sample size calculations, no experiment design templates, and no analysis workflows.

Suggestions

Add executable code examples for sample size calculation (e.g., using scipy.stats or statsmodels with specific parameters)

Include a concrete workflow with validation steps: hypothesis → sample size → randomization → data collection → statistical test → decision criteria

Provide specific statistical thresholds and decision rules (e.g., 'Use α=0.05, power=0.8, minimum detectable effect of X%')

Add an example experiment design template showing required fields and expected output format
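The first and third suggestions above could be met with something like the following sketch. It is a minimal illustration, not the skill author's method: it assumes a binary conversion metric, a two-sided two-proportion z-test, and the α=0.05, power=0.8 defaults quoted in the suggestion. It uses only the Python standard library rather than scipy.stats or statsmodels, and the function names (`sample_size_per_group`, `two_proportion_z_test`, `decide`) are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline, mde_abs, alpha=0.05, power=0.8):
    """Approximate per-group sample size for a two-sided two-proportion z-test.

    baseline: control conversion rate, e.g. 0.10
    mde_abs:  minimum detectable effect as an absolute difference, e.g. 0.02
    """
    p1, p2 = baseline, baseline + mde_abs
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) comparing variant B against control A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def decide(p_value, n_a, n_b, required_n, alpha=0.05):
    """Illustrative decision rule: check sample size first, then significance."""
    if min(n_a, n_b) < required_n:
        return "keep collecting data"
    return "reject null" if p_value < alpha else "fail to reject null"


if __name__ == "__main__":
    required = sample_size_per_group(baseline=0.10, mde_abs=0.02)
    z, p = two_proportion_z_test(conv_a=400, n_a=4000, conv_b=480, n_b=4000)
    print(required, round(z, 3), round(p, 4), decide(p, 4000, 4000, required))
```

Checking the sample size before the p-value is one way to give the workflow an explicit validation step; a real skill would also need to state how randomization and stopping rules are handled.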

Dimension	Reasoning	Score
Conciseness	The content is brief but lacks substance: it is concise by omission rather than through efficient information density. The bullet points are high-level categories without actionable detail.	2 / 3
Actionability	Completely vague and abstract. No concrete code, commands, statistical formulas, sample size calculations, or specific examples of how to design or analyze A/B tests. Describes responsibilities rather than instructs.	1 / 3
Workflow Clarity	Lists tasks as bullet points but provides no sequence, no validation checkpoints, and no guidance on how to actually execute any step. Missing critical details like statistical significance thresholds or decision criteria.	1 / 3
Progressive Disclosure	References an output location, which is good, but the skill itself is too sparse to need progressive disclosure. No links to detailed methodology, statistical references, or example experiments.	2 / 3
	Total	6 / 12 Passed
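The Actionability and Workflow Clarity rows point to missing templates and validation checkpoints. One possible shape for an experiment design template is sketched below. The `ExperimentDesign` class, its field names, and its checks are all hypothetical, chosen only to illustrate what "required fields" and a pre-launch validation step might look like; none of them come from the reviewed skill.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExperimentDesign:
    """Hypothetical template: fields an A/B test design might be required to fill in."""

    name: str
    hypothesis: str
    primary_metric: str
    baseline_rate: float          # current conversion rate, between 0 and 1
    mde_abs: float                # minimum detectable effect, absolute difference
    alpha: float = 0.05           # significance threshold
    power: float = 0.8            # target statistical power
    variants: List[str] = field(default_factory=lambda: ["control", "treatment"])

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the design passes."""
        problems = []
        if not self.hypothesis.strip():
            problems.append("hypothesis is empty")
        if not 0 < self.baseline_rate < 1:
            problems.append("baseline_rate must be between 0 and 1")
        if self.mde_abs <= 0 or self.baseline_rate + self.mde_abs >= 1:
            problems.append("mde_abs must be positive and keep the target rate below 1")
        if not 0 < self.alpha < 1:
            problems.append("alpha must be between 0 and 1")
        if not 0 < self.power < 1:
            problems.append("power must be between 0 and 1")
        if len(self.variants) < 2 or "control" not in self.variants:
            problems.append("need at least two variants including 'control'")
        return problems


if __name__ == "__main__":
    design = ExperimentDesign(
        name="checkout-button-copy",
        hypothesis="Clearer button copy increases checkout conversion",
        primary_metric="checkout_conversion",
        baseline_rate=0.10,
        mde_abs=0.02,
    )
    print(design.validate() or "design passes all checks")
```

Returning a list of problems rather than raising on the first one lets an agent report every gap in a design at once before any traffic is allocated.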

Description

32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies a clear domain (experimentation and A/B testing) but lacks explicit trigger guidance and names too few concrete actions. The Korean-only text limits trigger term coverage, and without a 'Use when...' clause it is difficult for Claude to know when to select this skill over others.

Suggestions

Add an explicit 'Use when...' clause with trigger scenarios like 'Use when the user asks about A/B tests, experiment design, sample size calculation, statistical significance, or hypothesis testing'

List more specific concrete actions such as 'calculate required sample sizes', 'determine statistical significance', 'design experiment variants', 'analyze conversion rates'

Include both Korean and English trigger terms to improve coverage: 'A/B test', 'split test', 'experiment', 'p-value', 'conversion optimization'

Dimension	Reasoning	Score
Specificity	Names the domain (A/B testing, hypothesis verification, statistical analysis) and some actions, but lacks concrete specific actions like 'calculate sample sizes', 'determine statistical significance', or 'design control groups'.	2 / 3
Completeness	Describes what it does (A/B test design, hypothesis verification, statistical analysis) but completely lacks a 'Use when...' clause or any explicit trigger guidance for when Claude should select this skill.	1 / 3
Trigger Term Quality	Includes relevant terms like 'A/B 테스트' (A/B test), '가설 검증' (hypothesis validation), and '통계 분석' (statistical analysis) but is missing common variations users might say, such as 'experiment', 'split test', 'significance testing', 'p-value', or other English equivalents.	2 / 3
Distinctiveness Conflict Risk	The A/B testing and experimentation focus provides some distinctiveness, but '통계 분석' (statistical analysis) is broad and could overlap with general data analysis or statistics skills.	2 / 3
	Total	7 / 12 Passed

Validation

90%


Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

Criteria	Description	Result
allowed_tools_field	'allowed-tools' contains unusual tool name(s)	Warning

	Total	10 / 11 Passed

Repository: shaul1991/shaul-agents-plugin
Path: skills/growth-experiment/SKILL.md
Commit: 9242c58

Reviewed: about 11 hours ago
