Guided statistical analysis with test selection and reporting. Use when you need help choosing appropriate tests for your data, assumption checking, power analysis, and APA-formatted results. Best for academic research reporting, test selection guidance. For implementing specific models programmatically use statsmodels.
Install with Tessl CLI
npx tessl i github:K-Dense-AI/claude-scientific-skills --skill statistical-analysis

Overall score: 80%
Does it follow best practices?
If you maintain this skill, you can automatically optimize it using the tessl CLI to improve its score:
npx tessl skill review --optimize ./path/to/skill
Discovery
85%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description that clearly articulates what the skill does and when to use it, with good disambiguation from related skills. The main weakness is the trigger term coverage, which could benefit from more common statistical test names and user-facing terminology that researchers would naturally use when seeking help.
Suggestions
- Add common statistical test names as trigger terms (e.g., 't-test', 'ANOVA', 'chi-square', 'regression', 'correlation') that users would naturally mention when seeking statistical help.
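One way to act on this suggestion is to fold the trigger terms directly into the skill's frontmatter description. A hypothetical sketch (field names follow the common SKILL.md frontmatter convention; the actual file may differ):

```yaml
---
name: statistical-analysis
description: >-
  Guided statistical analysis covering t-tests, ANOVA, chi-square,
  regression, correlation, power analysis, p-values, significance
  testing, and APA-formatted results. Best for academic research
  reporting and test selection guidance. For implementing specific
  models programmatically use statsmodels.
---
```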
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'test selection', 'assumption checking', 'power analysis', and 'APA-formatted results'. Also specifies use cases like 'academic research reporting' and 'test selection guidance'. | 3 / 3 |
| Completeness | Clearly answers both what (guided statistical analysis with test selection, assumption checking, power analysis, APA reporting) and when ('Use when you need help choosing appropriate tests', 'Best for academic research reporting'). Also includes helpful disambiguation from statsmodels. | 3 / 3 |
| Trigger Term Quality | Includes some relevant keywords like 'statistical analysis', 'power analysis', 'APA-formatted', and 'academic research', but misses common user terms like 't-test', 'ANOVA', 'regression', 'p-value', or 'significance testing' that users would naturally say. | 2 / 3 |
| Distinctiveness / Conflict Risk | Clearly distinguishes itself with a specific niche (guided analysis, APA formatting, academic reporting) and explicitly differentiates from programmatic implementation ('For implementing specific models programmatically use statsmodels'), reducing conflict risk. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
Implementation
73%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a comprehensive statistical analysis skill with excellent actionability through executable code examples and good progressive disclosure via well-organized reference files. However, it is verbose, explaining concepts Claude already knows (effect size interpretation, p-value misconceptions, research best practices), and it would benefit from tighter validation workflows with explicit checkpoints before proceeding to analysis.
Suggestions
- Remove or drastically condense the 'Best Practices' and 'Common Pitfalls' sections; these are general research methodology principles Claude already knows.
- Add explicit validation gates in the workflow, e.g. 'STOP: Do not proceed to analysis until assumption_checks.py returns all_passed=True'.
- Condense the 'When to Use Bayesian Methods' and effect size interpretation sections: focus on the decision criteria, not explanations of why these matter.
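The skill's own `assumption_checks.py` is not reproduced in this review; as an illustration of what an explicit validation gate can look like, here is a minimal standalone sketch using scipy.stats (a hypothetical example, not the skill's actual code):

```python
import numpy as np
from scipy import stats

def assumptions_pass(group_a, group_b, alpha=0.05):
    """Gate: True only if normality and equal-variance checks all pass."""
    norm_a = stats.shapiro(group_a).pvalue > alpha          # normality, group A
    norm_b = stats.shapiro(group_b).pvalue > alpha          # normality, group B
    equal_var = stats.levene(group_a, group_b).pvalue > alpha  # homogeneity of variance
    return norm_a and norm_b and equal_var

rng = np.random.default_rng(42)
a = rng.normal(0.0, 1.0, 40)
b = rng.normal(0.5, 1.0, 40)

# STOP: do not run the parametric test until the gate passes.
if assumptions_pass(a, b):
    stat, p = stats.ttest_ind(a, b)       # proceed with Student's t-test
else:
    stat, p = stats.mannwhitneyu(a, b)    # fall back to a nonparametric test
print(f"statistic={stat:.3f}, p={p:.4f}")
```

The point of the pattern is that the fallback branch is part of the workflow, not an afterthought: when assumptions fail, the agent has a defined next step instead of proceeding silently.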
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill contains significant redundancy and explanatory content Claude already knows (e.g., explaining what effect sizes are, why to use Bayesian methods, common pitfalls). While organized, it could be tightened substantially; the 'Best Practices' and 'Common Pitfalls' sections largely state obvious research methodology principles. | 2 / 3 |
| Actionability | Provides fully executable Python code examples for t-tests, ANOVA, regression, and Bayesian analysis using real libraries (pingouin, statsmodels, pymc). Code is copy-paste ready with proper imports and complete workflows including diagnostic plots. | 3 / 3 |
| Workflow Clarity | Includes a decision tree and checklist, but validation checkpoints are implicit rather than explicit. The workflow lacks clear 'stop and verify' steps; for example, assumption checking is described but not enforced as a gate before proceeding. Missing explicit feedback loops for when assumptions fail. | 2 / 3 |
| Progressive Disclosure | Well-structured with clear overview sections pointing to one-level-deep references (test_selection_guide.md, assumptions_and_diagnostics.md, etc.). Content is appropriately split between the main skill and reference files, with clear navigation signals throughout. | 3 / 3 |
| Total | | 10 / 12 (Passed) |
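The review does not reproduce the skill's code, but a sketch of the "copy-paste ready" style it credits, here for the power-analysis workflow with statsmodels (an illustrative example, not taken from the skill):

```python
from statsmodels.stats.power import TTestIndPower

# A priori power analysis for an independent-samples t-test:
# how many participants per group for a medium effect (Cohen's d = 0.5),
# alpha = .05, and 80% power?
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Required n per group: {n_per_group:.1f}")  # roughly 64 per group
```

An APA-style write-up of this result would then report the planned sample size alongside the assumed effect size, alpha, and power.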
Validation
81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 13 / 16 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (632 lines); consider splitting into references/ and linking | Warning |
| description_voice | 'description' should use third person voice; found second person: 'your' | Warning |
| metadata_version | 'metadata.version' is missing | Warning |
| Total | 13 / 16 Passed | |
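The description_voice and metadata_version warnings can both be cleared in the frontmatter. A hypothetical sketch (field names per the common SKILL.md convention; the version number is a placeholder):

```yaml
---
name: statistical-analysis
description: >-
  Guided statistical analysis with test selection, assumption checking,
  power analysis, and APA-formatted results. Use when a researcher needs
  help choosing appropriate tests for their data.  # third person: no 'your'
metadata:
  version: 1.0.0
---
```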
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.