ab-test-analyzer

Ab Test Analyzer - Auto-activating skill for Data Analytics. Triggers on: ab test analyzer, ab test analyzer Part of the Data Analytics skill category.

0.98x

Quality

Does it follow best practices?

Impact

98%

0.98x

Average score across 3 eval scenarios

Security

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./planned-skills/generated/12-data-analytics/ab-test-analyzer/SKILL.md

Quality

Discovery

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an extremely weak description that essentially just restates the skill name and category without providing any meaningful information about capabilities or usage triggers. It fails on all dimensions: no concrete actions, no natural trigger terms, no 'Use when' clause, and no distinguishing characteristics.

Suggestions

Add specific concrete actions the skill performs, e.g., 'Analyzes A/B test results by calculating statistical significance, comparing conversion rates between variants, determining sample size adequacy, and visualizing experiment outcomes.'

Add an explicit 'Use when...' clause with natural trigger terms, e.g., 'Use when the user mentions A/B tests, split tests, experiment analysis, conversion rates, statistical significance, variant comparison, or uplift calculations.'

Remove the duplicate trigger term 'ab test analyzer' and expand with natural language variations users would actually say, such as 'A/B test', 'split test', 'experiment results', 'control vs treatment'.

Dimension	Reasoning	Score
Specificity	The description provides no concrete actions whatsoever. It only states it is an 'ab test analyzer' and belongs to 'Data Analytics' but never describes what it actually does (e.g., calculate statistical significance, compare conversion rates, visualize results).	1 / 3
Completeness	Neither 'what does this do' nor 'when should Claude use it' is meaningfully answered. There is no explicit 'Use when...' clause, and the 'what' is just the skill name restated without any description of capabilities.	1 / 3
Trigger Term Quality	The only trigger terms listed are 'ab test analyzer' repeated twice. It misses natural variations users would say like 'A/B test', 'split test', 'experiment results', 'conversion rate', 'statistical significance', 'variant comparison', etc.	1 / 3
Distinctiveness / Conflict Risk	The description is so vague that it could overlap with any data analytics skill. 'Data Analytics' is extremely broad, and without specific actions or triggers, there's nothing to distinguish it from other analytics-related skills.	1 / 3
	Total	4 / 12 Passed

Implementation

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is an empty template with no substantive content. It contains only generic boilerplate descriptions that could apply to any skill topic, with no actual guidance on A/B test analysis—no statistical methods (e.g., chi-squared tests, confidence intervals), no SQL examples, no visualization guidance, and no decision frameworks. It provides zero value beyond what its title alone conveys.

Suggestions

Add concrete, executable code examples for A/B test analysis: SQL queries to extract experiment data, Python/R code for statistical significance testing (e.g., chi-squared, t-tests, Bayesian methods), and sample visualization code.

Define a clear multi-step workflow: data extraction → sample size validation → statistical test selection → significance calculation → result interpretation → recommendation, with validation checkpoints at each stage.

Replace all generic boilerplate ('Provides step-by-step guidance', 'Follows industry best practices') with specific, actionable content such as formulas for minimum sample size, significance thresholds, and common pitfalls (e.g., peeking, multiple comparisons).

Add concrete input/output examples showing what an A/B test analysis request looks like and what the expected deliverable (e.g., a summary table with p-values, confidence intervals, and lift) should contain.
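As an illustration of the kind of executable example and deliverable these suggestions describe, here is a minimal sketch of a two-sided, two-proportion z-test using only the Python standard library. It is not taken from the skill under review: the function name `summarize_ab_test`, the `ABSummary` record, and the sample counts are all hypothetical, and a real skill might prefer a chi-squared, t-test, or Bayesian approach instead.

```python
import math
from dataclasses import dataclass


@dataclass
class ABSummary:
    """Hypothetical summary record; field names are illustrative only."""
    control_rate: float
    treatment_rate: float
    relative_lift: float
    p_value: float
    ci_low: float   # lower bound of the CI on the absolute difference
    ci_high: float  # upper bound of the CI on the absolute difference


def normal_cdf(x: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def summarize_ab_test(control_conv, control_n, treat_conv, treat_n, z_crit=1.96):
    """Two-sided two-proportion z-test plus a CI on the rate difference.

    z_crit=1.96 corresponds to an approximate 95% confidence interval.
    """
    if control_n <= 0 or treat_n <= 0:
        raise ValueError("both variants need at least one observation")

    p_c = control_conv / control_n
    p_t = treat_conv / treat_n
    diff = p_t - p_c

    # Pooled standard error is used for the hypothesis test (H0: equal rates).
    pooled = (control_conv + treat_conv) / (control_n + treat_n)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / treat_n))
    if se_pooled == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = diff / se_pooled
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))

    # Unpooled standard error is used for the interval on the difference.
    se_diff = math.sqrt(p_c * (1 - p_c) / control_n + p_t * (1 - p_t) / treat_n)

    return ABSummary(
        control_rate=p_c,
        treatment_rate=p_t,
        relative_lift=diff / p_c if p_c > 0 else float("nan"),
        p_value=p_value,
        ci_low=diff - z_crit * se_diff,
        ci_high=diff + z_crit * se_diff,
    )


# Made-up input: 200 of 2000 control users converted, 260 of 2000 treatment users.
print(summarize_ab_test(200, 2000, 260, 2000))
```

The returned record maps onto the summary table the last suggestion asks for: conversion rates, lift, p-value, and a confidence interval. The normal approximation assumed here is only reasonable when each variant has a fairly large number of conversions and non-conversions.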

Dimension	Reasoning	Score
Conciseness	The content is entirely filler and boilerplate. It explains nothing Claude doesn't already know, provides no domain-specific information about A/B test analysis, and wastes tokens on generic phrases like 'Provides step-by-step guidance' and 'Follows industry best practices' without any substance.	1 / 3
Actionability	There is zero actionable content—no code, no commands, no statistical methods, no SQL queries, no concrete examples of how to actually analyze an A/B test. Every section is vague and abstract, describing what the skill supposedly does rather than instructing how to do it.	1 / 3
Workflow Clarity	No workflow is defined at all. There are no steps, no sequence, no validation checkpoints. For a task like A/B test analysis (which involves data extraction, statistical testing, interpretation, and decision-making), the complete absence of any process is a critical gap.	1 / 3
Progressive Disclosure	There are no references to supporting files, no bundle files exist, and the content is a monolithic block of generic placeholder text with no meaningful structure or navigation to deeper content.	1 / 3
	Total	4 / 12 Passed
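
One of the validation checkpoints the review finds missing is a sample size check before any significance test is run. The sketch below shows one common way to express it, using the standard normal-approximation formula for comparing two proportions. The function names, the default `alpha` and `power` values, and the baseline figures are assumptions for illustration, not part of the reviewed skill.

```python
import math
from statistics import NormalDist


def min_sample_size_per_variant(baseline_rate, relative_mde, alpha=0.05, power=0.80):
    """Approximate users needed per variant for a two-sided two-proportion test.

    baseline_rate: expected control conversion rate, e.g. 0.10
    relative_mde:  minimum detectable effect as a relative lift, e.g. 0.10 for +10%
    """
    if not 0 < baseline_rate < 1:
        raise ValueError("baseline_rate must be strictly between 0 and 1")
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_mde)
    if not 0 < p2 < 1 or p2 == p1:
        raise ValueError("relative_mde must yield a different rate between 0 and 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power

    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def has_enough_data(control_n, treat_n, required_n):
    """Checkpoint: both variants must reach the required sample size."""
    return min(control_n, treat_n) >= required_n


# Made-up planning inputs: 10% baseline, detect a 10% relative lift.
required = min_sample_size_per_variant(0.10, 0.10)
print(required, has_enough_data(2000, 2000, required))
```

Fixing the required sample size before the experiment starts, and only testing once both variants reach it, is also one simple guard against the peeking pitfall mentioned in the suggestions.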

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure (warnings and errors only)

Criteria	Description	Result
allowed_tools_field	'allowed-tools' contains unusual tool name(s)	Warning
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning

	Total	9 / 11 Passed

Repository: jeremylongshore/claude-code-plugins-plus-skills
Commit: 3a2d27d

Reviewed: 3 days ago
