
growth-experiment

Experimentation Agent. A/B 테스트 설계, 가설 검증, 통계 분석을 담당합니다. (English: Handles A/B test design, hypothesis verification, and statistical analysis.)

Install with Tessl CLI

npx tessl i github:shaul1991/shaul-agents-plugin --skill growth-experiment

40

Does it follow best practices?

Validation for skill structure


Discovery

32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies a clear domain (experimentation and A/B testing) but lacks explicit trigger guidance and does not name enough concrete actions. The Korean-only text limits trigger term coverage, and without a 'Use when...' clause it is difficult for Claude to know when to select this skill over others.

Suggestions

Add an explicit 'Use when...' clause with trigger scenarios like 'Use when the user asks about A/B tests, experiment design, sample size calculation, statistical significance, or hypothesis testing'

List more specific concrete actions such as 'calculate required sample sizes', 'determine statistical significance', 'design experiment variants', 'analyze conversion rates'

Include both Korean and English trigger terms to improve coverage: 'A/B test', 'split test', 'experiment', 'p-value', 'conversion optimization'

Dimension | Reasoning | Score
Specificity | Names the domain (A/B testing, hypothesis verification, statistical analysis) and some actions, but lacks concrete specific actions like 'calculate sample sizes', 'determine statistical significance', or 'design control groups'. | 2 / 3
Completeness | Describes what it does (A/B test design, hypothesis verification, statistical analysis) but completely lacks a 'Use when...' clause or any explicit trigger guidance for when Claude should select this skill. | 1 / 3
Trigger Term Quality | Includes relevant terms like 'A/B 테스트', '가설 검증', '통계 분석' but missing common variations users might say such as 'experiment', 'split test', 'significance testing', 'p-value', or English equivalents. | 2 / 3
Distinctiveness Conflict Risk | The A/B testing and experimentation focus provides some distinctiveness, but '통계 분석' (statistical analysis) is broad and could overlap with general data analysis or statistics skills. | 2 / 3

Total: 7 / 12 (Passed)

Implementation

22%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is essentially a role description rather than actionable guidance. It lists what an experimentation agent does but gives no concrete instructions on how to do any of it: no statistical methods, no code for sample size calculations, no experiment design templates, and no analysis workflows.

Suggestions

Add executable code examples for sample size calculation (e.g., using scipy.stats or statsmodels with specific parameters)

Include a concrete workflow with validation steps: hypothesis → sample size → randomization → data collection → statistical test → decision criteria

Provide specific statistical thresholds and decision rules (e.g., 'Use α=0.05, power=0.8, minimum detectable effect of X%')

Add an example experiment design template showing required fields and expected output format
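
As an illustration of what the suggestions above could look like in practice, here is a minimal sketch of an experiment design template with a built-in sample size calculation. It is not taken from the skill: the class name `ExperimentDesign`, its fields, and the example values are all hypothetical. It also uses the standard library's `statistics.NormalDist` with the usual normal-approximation formula for a two-proportion test, rather than scipy.stats or statsmodels, so that it runs without extra dependencies. The defaults α=0.05 and power=0.8 follow the thresholds named in the suggestions.

```python
from dataclasses import dataclass
from math import ceil, sqrt
from statistics import NormalDist


@dataclass
class ExperimentDesign:
    """Hypothetical template; field names are illustrative, not from the skill."""

    hypothesis: str
    primary_metric: str
    baseline_rate: float          # control conversion rate, e.g. 0.10
    min_detectable_effect: float  # absolute lift to detect, e.g. 0.02
    alpha: float = 0.05           # two-sided significance level
    power: float = 0.80           # 1 - beta

    def sample_size_per_group(self) -> int:
        """Per-group n for a two-sided two-proportion z-test (normal approximation)."""
        p1 = self.baseline_rate
        p2 = p1 + self.min_detectable_effect
        p_bar = (p1 + p2) / 2
        z_alpha = NormalDist().inv_cdf(1 - self.alpha / 2)
        z_beta = NormalDist().inv_cdf(self.power)
        numerator = (
            z_alpha * sqrt(2 * p_bar * (1 - p_bar))
            + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
        ) ** 2
        return ceil(numerator / (p2 - p1) ** 2)


# Example values are assumptions chosen for illustration only.
design = ExperimentDesign(
    hypothesis="A shorter signup form raises signup conversion",
    primary_metric="signup_conversion",
    baseline_rate=0.10,
    min_detectable_effect=0.02,
)
print(design.sample_size_per_group())
```

Keeping the thresholds as fields on the template means the decision criteria are recorded alongside the hypothesis before any data is collected, which is the ordering the suggested workflow calls for.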

Dimension | Reasoning | Score
Conciseness | The content is brief but lacks substance - it's concise by omission rather than by efficient information density. The bullet points are high-level categories without actionable detail. | 2 / 3
Actionability | Completely vague and abstract. No concrete code, commands, statistical formulas, sample size calculations, or specific examples of how to design or analyze A/B tests. Describes responsibilities rather than instructs. | 1 / 3
Workflow Clarity | Lists tasks as bullet points but provides no sequence, no validation checkpoints, and no guidance on how to actually execute any step. Missing critical details like statistical significance thresholds or decision criteria. | 1 / 3
Progressive Disclosure | References an output location which is good, but the skill itself is too sparse to need progressive disclosure. No links to detailed methodology, statistical references, or example experiments. | 2 / 3

Total: 6 / 12 (Passed)
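
The Workflow Clarity row points to missing significance thresholds and decision criteria. As a hedged sketch of what such a step might contain, the following runs a pooled two-proportion z-test and applies a simple decision rule. The function names `two_proportion_z_test` and `decide`, the α=0.05 default, and the example counts are assumptions for illustration, not part of the reviewed skill.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: rate_a == rate_b, using a pooled SE.

    Assumes the pooled rate is strictly between 0 and 1; otherwise the
    standard error is zero and the division below fails.
    """
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def decide(z, p_value, alpha=0.05):
    """Hypothetical decision rule: act only on a significant result."""
    if p_value >= alpha:
        return "inconclusive"
    return "ship variant" if z > 0 else "keep control"


# Illustrative counts: control 400/4000 conversions, variant 480/4000.
z, p = two_proportion_z_test(400, 4000, 480, 4000)
print(decide(z, p))
```

The rule checks the sign of z as well as the p-value, so a significantly worse variant is not shipped by mistake.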

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

Criteria | Description | Result
allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning

Total: 10 / 11 (Passed)
