
statistical-analysis

Guided statistical analysis with test selection and reporting. Use when you need help choosing appropriate tests for your data, assumption checking, power analysis, and APA-formatted results. Best for academic research reporting, test selection guidance. For implementing specific models programmatically use statsmodels.

84

1.13x
Quality

75%

Does it follow best practices?

Impact

91%

1.13x

Average score across 6 eval scenarios

Security by Snyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./scientific-skills/statistical-analysis/SKILL.md

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description. It clearly communicates specific capabilities (test selection, assumption checking, power analysis, APA formatting), includes natural trigger terms researchers would use, and explicitly delineates its boundary with a related skill (statsmodels). The only minor issue is the use of the second person ('you need help', 'your data'), which the rubric guidelines say should be penalized; the overall quality is still high.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: test selection, assumption checking, power analysis, and APA-formatted results. These are distinct, well-defined statistical analysis tasks. | 3 / 3 |
| Completeness | Clearly answers both what (guided statistical analysis with test selection, assumption checking, power analysis, APA-formatted reporting) and when ('Use when you need help choosing appropriate tests for your data'). Also includes a helpful boundary condition distinguishing it from statsmodels for programmatic implementation. | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'statistical analysis', 'test selection', 'assumption checking', 'power analysis', 'APA-formatted results', 'academic research reporting'. These are terms researchers and students naturally use. | 3 / 3 |
| Distinctiveness Conflict Risk | Clearly carves out a distinct niche focused on guided statistical test selection and academic reporting, and explicitly differentiates itself from programmatic statistical modeling ('For implementing specific models programmatically use statsmodels'). This boundary-setting reduces conflict risk significantly. | 3 / 3 |
| Total | | 12 / 12 |

Passed

Implementation

50%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill is highly actionable with excellent executable code examples and concrete APA reporting templates, which is its primary strength. However, it is significantly too verbose—it explains many concepts Claude already knows, includes best practices and pitfalls lists that are general knowledge, and inlines content that should live in the referenced files. The workflow structure exists but lacks explicit validation-feedback loops that would make the multi-step analysis process more robust.

Suggestions

Cut the content by 50-60%: remove 'When to Use This Skill' (redundant with frontmatter), 'Best Practices,' 'Common Pitfalls,' 'Key Advantages' of Bayesian methods, textbook recommendations, and online resources—Claude already knows these.

Move detailed code examples (regression diagnostics, Bayesian t-test, full ANOVA workflow) into reference files and keep only one compact example in the main skill to demonstrate the pattern.

Add explicit validation checkpoints to the workflow: e.g., 'Run assumption_checks.py → If violations found → apply remediation from table → re-check → only proceed when assumptions satisfied or alternative test selected.'
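
The check, remediate, re-check loop suggested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it does not call the skill's `assumption_checks.py` (its interface is not shown on this page), and `choose_test`, `roughly_symmetric`, `log_transform`, and the `|skewness| < 1` cutoff are hypothetical stand-ins for whatever checks and remediation table the skill actually uses.

```python
import math
import statistics
from typing import Callable, Dict, List, Sequence, Tuple

Check = Callable[[Sequence[float]], bool]          # True when the assumption holds
Remedy = Callable[[Sequence[float]], List[float]]  # returns remediated data


def skewness(xs: Sequence[float]) -> float:
    """Population skewness (third standardized moment)."""
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    if sd == 0:
        return 0.0
    return sum(((x - mean) / sd) ** 3 for x in xs) / len(xs)


def roughly_symmetric(xs: Sequence[float]) -> bool:
    # Hypothetical stand-in for a real normality check; the cutoff is illustrative.
    return abs(skewness(xs)) < 1.0


def log_transform(xs: Sequence[float]) -> List[float]:
    # Hypothetical remediation; only valid for strictly positive data.
    return [math.log(x) for x in xs]


def choose_test(
    data: Sequence[float],
    checks: Dict[str, Check],
    remedies: Dict[str, Remedy],
    planned: str,
    alternative: str,
    max_rounds: int = 2,
) -> Tuple[str, List[float], List[str]]:
    """Return (test name, data to analyse, audit trail of each round)."""
    current = list(data)
    trail: List[str] = []
    for round_no in range(max_rounds + 1):
        violated = [name for name, ok in checks.items() if not ok(current)]
        if not violated:
            trail.append(f"round {round_no}: assumptions satisfied")
            return planned, current, trail
        trail.append(f"round {round_no}: violated {', '.join(violated)}")
        out_of_rounds = round_no == max_rounds
        unfixable = any(name not in remedies for name in violated)
        if out_of_rounds or unfixable:
            break
        for name in violated:
            current = remedies[name](current)  # remediate, then loop to re-check
    trail.append("falling back to the alternative test on the original data")
    return alternative, list(data), trail


if __name__ == "__main__":
    test, values, trail = choose_test(
        [1, 2, 4, 8, 16, 32, 64],
        checks={"symmetry": roughly_symmetric},
        remedies={"symmetry": log_transform},
        planned="independent t-test",
        alternative="Mann-Whitney U",
    )
    print(test)
    print("\n".join(trail))
```

The design choice worth noting is the explicit gate: the planned test is returned only after a round in which every check passes, and the loop otherwise ends by selecting the alternative test rather than proceeding with violated assumptions.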

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is extremely verbose at more than 500 lines. It explains concepts Claude already knows (what effect sizes are, what p-values mean, advantages of Bayesian methods, definitions of common pitfalls like p-hacking). Sections like 'When to Use This Skill,' 'Best Practices,' 'Common Pitfalls,' and 'Key Advantages' of Bayesian methods are largely redundant for Claude. The textbook recommendations and online resources at the end add no value. | 1 / 3 |
| Actionability | The skill provides fully executable Python code examples for t-tests, ANOVA, regression, Bayesian analysis, power analysis, and assumption checking. Code is copy-paste ready with specific library calls (pingouin, statsmodels, pymc) and concrete APA report templates that serve as actionable output examples. | 3 / 3 |
| Workflow Clarity | There is a decision tree and a getting-started checklist that outline the workflow sequence, and assumption checking is emphasized before analysis. However, the workflow lacks explicit validation checkpoints with feedback loops: there is no 'if assumption check fails, do X, then re-check' loop built into the main workflow, and the decision tree uses vague section references rather than concrete steps with verification gates. | 2 / 3 |
| Progressive Disclosure | The skill references external files (references/*.md, scripts/*.py) appropriately, but the main SKILL.md itself contains far too much inline content that should be in those reference files. The test selection guide, full code examples for every test type, effect size tables, and APA report templates could all be in referenced documents, keeping the main file as a concise overview. | 2 / 3 |
| Total | | 8 / 12 |

Passed

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| skill_md_line_count | SKILL.md is long (631 lines); consider splitting into references/ and linking | Warning |
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 9 / 11 |

Passed

Repository
K-Dense-AI/claude-scientific-skills
Reviewed
