Statistical Significance Calculator - Auto-activating skill for Data Analytics. Triggers on: statistical significance calculator, statistical significance calculator. Part of the Data Analytics skill category.
Does it follow best practices?
Impact
94% (1.02x average score across 3 eval scenarios)
Passed; no known issues
Optimize this skill with Tessl:
npx tessl skill review --optimize ./planned-skills/generated/12-data-analytics/statistical-significance-calculator/SKILL.md
Quality
Discovery
7%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is extremely weak — it essentially repeats the skill name without describing any concrete capabilities, natural trigger terms, or explicit usage guidance. It reads as auto-generated boilerplate rather than a thoughtful description designed to help Claude select the right skill. The duplicate trigger term and lack of a 'Use when...' clause are significant deficiencies.
Suggestions
Add specific concrete actions the skill performs, e.g., 'Calculates p-values, performs hypothesis tests (t-test, chi-square, z-test), computes confidence intervals, and evaluates A/B test results.'
Add an explicit 'Use when...' clause with natural trigger terms, e.g., 'Use when the user asks about statistical significance, p-values, A/B testing, hypothesis testing, sample size calculations, or confidence intervals.'
Remove the duplicate trigger term and replace with diverse natural language variations users would actually say, such as 'significance test', 'is this result significant', 'compare two groups', 'A/B test analysis'.
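Taken together, those suggestions might produce frontmatter along these lines (a sketch only: the exact field names should follow the SKILL.md spec, and the wording is illustrative, not the skill's actual content):

```yaml
name: statistical-significance-calculator
description: >
  Calculates p-values, performs hypothesis tests (t-test, chi-square,
  z-test), computes confidence intervals, and evaluates A/B test results.
  Use when the user asks about statistical significance, p-values, A/B
  testing, hypothesis testing, sample size calculations, confidence
  intervals, or whether a result is significant.
```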
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain ('Data Analytics') and the tool name ('Statistical Significance Calculator') but does not describe any concrete actions like computing p-values, comparing sample means, running hypothesis tests, etc. It is essentially just a label repeated. | 1 / 3 |
| Completeness | The description fails to clearly answer 'what does this do' beyond naming itself, and there is no explicit 'when should Claude use it' clause. The 'Triggers on' line just repeats the skill name rather than providing meaningful trigger guidance. | 1 / 3 |
| Trigger Term Quality | The only trigger term listed is 'statistical significance calculator' repeated twice. It misses natural variations users would say such as 'p-value', 'hypothesis test', 'A/B test', 'significance level', 'chi-square', 'confidence interval', etc. | 1 / 3 |
| Distinctiveness / Conflict Risk | The phrase 'statistical significance calculator' is somewhat specific to a niche domain, which reduces conflict risk compared to fully generic descriptions. However, the broad 'Data Analytics' category label and lack of concrete actions could still cause overlap with other analytics-related skills. | 2 / 3 |
| Total | | 5 / 12 (Passed) |
Implementation
0%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is essentially a placeholder template with no substantive content. It contains no statistical formulas, no executable code, no concrete examples, and no actual guidance for performing statistical significance calculations. Every section merely restates that the skill is about 'statistical significance calculator' without teaching Claude anything it doesn't already know.
Suggestions
Add executable code examples for common statistical significance tests (e.g., z-test, t-test, chi-squared) with Python/scipy, including sample inputs and expected outputs.
Include the actual formulas and decision criteria (e.g., when to use which test, how to interpret p-values, sample size requirements) as concrete, actionable guidance.
Define a clear workflow: 1) Identify test type based on data characteristics, 2) Check assumptions, 3) Calculate test statistic, 4) Interpret results with confidence intervals.
Remove all meta-description sections (Purpose, When to Use, Example Triggers) that provide no actionable information and replace with actual statistical calculation content.
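As a concrete instance of the first suggestion, an executable example the skill could include might look like this (a sketch only: the data is illustrative, and Welch's t-test is just one of the tests the suggestions name):

```python
# Two-sample Welch t-test with scipy: compares the means of two
# independent groups without assuming equal variances.
from scipy import stats

control = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3, 12.2, 11.7]
variant = [12.9, 13.1, 12.7, 13.0, 12.8, 13.2, 12.6, 13.3]

# equal_var=False selects Welch's variant of the independent t-test.
t_stat, p_value = stats.ttest_ind(variant, control, equal_var=False)

alpha = 0.05  # conventional significance level
significant = p_value < alpha
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, significant at {alpha}: {bool(significant)}")
```

Pairing each such example with its decision criteria (when the test applies, which assumptions to check first) would cover the workflow steps listed above.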
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is entirely filler and meta-description. It explains what the skill does in abstract terms without providing any actual statistical significance calculation guidance, formulas, code, or concrete information. Every section restates the same vague concept. | 1 / 3 |
| Actionability | There is zero actionable content: no formulas, no code, no specific statistical tests, no examples of inputs/outputs. The skill describes rather than instructs, offering only vague promises like 'provides step-by-step guidance' without actually providing any. | 1 / 3 |
| Workflow Clarity | No workflow is defined at all. There are no steps, no sequence, no validation checkpoints. The content merely lists abstract capabilities without any process for performing statistical significance calculations. | 1 / 3 |
| Progressive Disclosure | The content is a monolithic block of meta-description with no meaningful structure. There are no references to detailed materials, no code examples to organize, and the sections that exist (Purpose, When to Use, Capabilities) all repeat the same non-information. | 1 / 3 |
| Total | | 4 / 12 (Passed) |
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
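Clearing the two warnings might involve a frontmatter cleanup along these lines (a sketch under assumptions: the report does not show the skill's actual tool names or unknown keys, so every value below is hypothetical):

```yaml
allowed-tools:
  - Bash   # hypothetical: keep only tool names the spec recognizes
  - Read
metadata:
  category: data-analytics   # hypothetical: formerly an unknown top-level key
```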