
statistical-analysis

Guided statistical analysis with test selection and reporting. Use when you need help choosing appropriate tests for your data, assumption checking, power analysis, and APA-formatted results. Best for academic research reporting, test selection guidance. For implementing specific models programmatically use statsmodels.
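As an illustration of the kind of guided workflow the skill targets, power analysis for a two-group design can be sketched with statsmodels (the effect size and thresholds below are illustrative choices, not values taken from the skill):

```python
from statsmodels.stats.power import TTestIndPower

# Required sample size per group for an independent-samples t-test,
# assuming a medium effect (Cohen's d = 0.5), alpha = .05, power = .80.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8,
                                   alternative='two-sided')
print(f"~{n_per_group:.0f} participants per group")  # roughly 64 per group
```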

Install with Tessl CLI

npx tessl i github:K-Dense-AI/claude-scientific-skills --skill statistical-analysis

Overall score: 80%

Does it follow best practices?


Discovery

85%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong description that clearly articulates what the skill does and when to use it, with good disambiguation from related skills. The main weakness is the trigger term coverage, which could benefit from more common statistical test names and user-facing terminology that researchers would naturally use when seeking help.

Suggestions

Add common statistical test names as trigger terms (e.g., 't-test', 'ANOVA', 'chi-square', 'regression', 'correlation') that users would naturally mention when seeking statistical help.

Dimension / Reasoning / Score

Specificity

Lists multiple specific concrete actions: 'test selection', 'assumption checking', 'power analysis', and 'APA-formatted results'. Also specifies use cases like 'academic research reporting' and 'test selection guidance'.

3 / 3

Completeness

Clearly answers both what (guided statistical analysis with test selection, assumption checking, power analysis, APA reporting) and when ('Use when you need help choosing appropriate tests', 'Best for academic research reporting'). Also includes helpful disambiguation from statsmodels.

3 / 3

Trigger Term Quality

Includes some relevant keywords like 'statistical analysis', 'power analysis', 'APA-formatted', and 'academic research', but misses common user terms like 't-test', 'ANOVA', 'regression', 'p-value', or 'significance testing' that users would naturally say.

2 / 3

Distinctiveness Conflict Risk

Clearly distinguishes itself with specific niche (guided analysis, APA formatting, academic reporting) and explicitly differentiates from programmatic implementation ('For implementing specific models programmatically use statsmodels'), reducing conflict risk.

3 / 3

Total: 11 / 12

Passed

Implementation

73%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a comprehensive statistical analysis skill with excellent actionability through executable code examples and good progressive disclosure via well-organized reference files. However, it suffers from verbosity—explaining concepts Claude already knows (effect size interpretation, p-value misconceptions, research best practices) and could benefit from tighter validation workflows with explicit checkpoints before proceeding to analysis.

Suggestions

Remove or drastically condense 'Best Practices' and 'Common Pitfalls' sections—these are general research methodology principles Claude already knows

Add explicit validation gates in the workflow: 'STOP: Do not proceed to analysis until assumption_checks.py returns all_passed=True'

Condense the 'When to Use Bayesian Methods' and effect size interpretation sections—focus on the decision criteria, not explanations of why these matter
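The suggested validation gate could look something like the following minimal sketch, using scipy checks in place of the skill's own assumption_checks.py (whose contents are not shown here; function and key names are assumed for illustration):

```python
import numpy as np
from scipy import stats

def check_assumptions(group_a, group_b, alpha=0.05):
    """Gate: run assumption checks and report whether all passed."""
    _, p_norm_a = stats.shapiro(group_a)   # normality, group A
    _, p_norm_b = stats.shapiro(group_b)   # normality, group B
    _, p_levene = stats.levene(group_a, group_b)  # homogeneity of variance
    checks = {
        "normality_a": p_norm_a > alpha,
        "normality_b": p_norm_b > alpha,
        "equal_variance": p_levene > alpha,
    }
    checks["all_passed"] = all(checks.values())
    return checks

rng = np.random.default_rng(42)
result = check_assumptions(rng.normal(0, 1, 30), rng.normal(0.5, 1, 30))
# Explicit stop-and-verify step before any analysis runs:
print("proceed to analysis" if result["all_passed"]
      else "STOP: fix assumptions first")
```

The point is the hard checkpoint: the agent inspects `all_passed` and branches, rather than treating assumption checking as optional narration.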

Dimension / Reasoning / Score

Conciseness

The skill contains significant redundancy and explanatory content Claude already knows (e.g., explaining what effect sizes are, why to use Bayesian methods, common pitfalls). While organized, it could be tightened substantially—the 'Best Practices' and 'Common Pitfalls' sections largely state obvious research methodology principles.

2 / 3

Actionability

Provides fully executable Python code examples for t-tests, ANOVA, regression, and Bayesian analysis using real libraries (pingouin, statsmodels, pymc). Code is copy-paste ready with proper imports and complete workflows including diagnostic plots.

3 / 3
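The skill's own examples use pingouin; a rough sketch of the same idea with only scipy/numpy is shown below (the APA template string is assumed, not the skill's exact wording):

```python
import numpy as np
from scipy import stats

def apa_ttest(group_a, group_b):
    """Independent-samples t-test with an APA-style results string."""
    n1, n2 = len(group_a), len(group_b)
    t, p = stats.ttest_ind(group_a, group_b, equal_var=True)
    df = n1 + n2 - 2
    # Cohen's d from the pooled standard deviation
    pooled_sd = np.sqrt(((n1 - 1) * np.var(group_a, ddof=1)
                         + (n2 - 1) * np.var(group_b, ddof=1)) / df)
    d = (np.mean(group_a) - np.mean(group_b)) / pooled_sd
    p_str = f"p = {p:.3f}" if p >= 0.001 else "p < .001"
    return f"t({df}) = {t:.2f}, {p_str}, d = {d:.2f}"

rng = np.random.default_rng(7)
print(apa_ttest(rng.normal(0, 1, 40), rng.normal(0.6, 1, 40)))
```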

Workflow Clarity

Includes a decision tree and checklist, but validation checkpoints are implicit rather than explicit. The workflow lacks clear 'stop and verify' steps—for example, assumption checking is described but not enforced as a gate before proceeding. Missing explicit feedback loops for when assumptions fail.

2 / 3

Progressive Disclosure

Well-structured with clear overview sections pointing to one-level-deep references (test_selection_guide.md, assumptions_and_diagnostics.md, etc.). Content is appropriately split between the main skill and reference files, with clear navigation signals throughout.

3 / 3

Total: 10 / 12

Passed

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 13 / 16 Passed

Validation for skill structure

Criteria / Description / Result

skill_md_line_count

SKILL.md is long (632 lines); consider splitting into references/ and linking

Warning

description_voice

'description' should use third person voice; found second person: 'your '

Warning

metadata_version

'metadata.version' is missing

Warning
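The description_voice and metadata_version warnings could both be addressed in the SKILL.md frontmatter; a hypothetical sketch (field names inferred from the validator messages, version number invented):

```yaml
---
name: statistical-analysis
description: >-
  Guided statistical analysis with test selection and reporting. Use when
  the user needs help choosing appropriate tests, checking assumptions,
  running power analysis, or producing APA-formatted results.
metadata:
  version: 1.0.0
---
```

Rewriting "your data" as "the user's data" (or dropping the pronoun entirely, as above) satisfies the third-person-voice check.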

Total: 13 / 16

Passed
