Statistical models library for Python. Use when you need specific model classes (OLS, GLM, mixed models, ARIMA) with detailed diagnostics, residuals, and inference. Best for econometrics, time series, rigorous inference with coefficient tables. For guided statistical test selection with APA reporting use statistical-analysis.
Overall score: 85

- Quality: 75% — Does it follow best practices?
- Impact: 93% — 1.09x average score across 6 eval scenarios — Passed
- No known issues
Optimize this skill with Tessl: `npx tessl skill review --optimize ./scientific-skills/statsmodels/SKILL.md`

Quality
Discovery
100% — Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly communicates specific capabilities (model classes, diagnostics, inference), includes rich natural trigger terms that users in statistics would use, and explicitly disambiguates from a related skill. The description is concise yet comprehensive, covering what, when, and how it differs from similar skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions and model classes: OLS, GLM, mixed models, ARIMA, along with specific outputs like detailed diagnostics, residuals, inference, and coefficient tables. | 3 / 3 |
| Completeness | Clearly answers both what ('Statistical models library for Python' with specific model classes and outputs) and when ('Use when you need specific model classes... Best for econometrics, time series, rigorous inference'). Also includes explicit disambiguation guidance pointing to statistical-analysis for a different use case. | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'OLS', 'GLM', 'mixed models', 'ARIMA', 'econometrics', 'time series', 'inference', 'coefficient tables', 'residuals', 'diagnostics'. These are terms a user working in statistics would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clearly distinguishes itself from the related 'statistical-analysis' skill by specifying its niche (specific model classes, diagnostics, econometrics) versus guided test selection with APA reporting. The explicit disambiguation clause reduces conflict risk significantly. | 3 / 3 |
| Total | | 12 / 12 — Passed |
Implementation
50% — Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill provides excellent executable code examples and covers statsmodels comprehensively, but is far too verbose for a SKILL.md file. Large portions catalog model types, distribution families, and diagnostic tests that Claude already knows, consuming significant token budget without adding actionable value. The content would benefit greatly from aggressive trimming of the catalog sections and pushing that detail into the referenced files.
Suggestions:

- Cut the 'Core Statistical Modeling Capabilities' section (sections 1-5) drastically — move the detailed model/family/test catalogs into the reference files and keep only a brief table or one-liner per category in SKILL.md.
- Reduce 'Common Pitfalls' from 15 items to the 3-5 most critical non-obvious ones (e.g., forgetting add_constant, Poisson overdispersion); Claude knows not to overfit or leak data.
- Remove the 'When to Use This Skill' bullet list — this duplicates the frontmatter description and is information Claude can infer from the skill content itself.
- Add explicit validation gates to workflows, e.g., 'If heteroskedasticity test p < 0.05, do NOT proceed with standard SEs — switch to robust SEs before interpreting coefficients.'
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~400+ lines. Extensive lists of model types, families, link functions, and features that Claude already knows. The 'When to Use This Skill' section, 'Core Statistical Modeling Capabilities' catalog, and 'Common Pitfalls' list of 15 items are largely redundant with Claude's existing knowledge. Much of this reads like library documentation rather than actionable skill guidance. | 1 / 3 |
| Actionability | The Quick Start Guide provides fully executable, copy-paste ready code examples for OLS, Logistic Regression, ARIMA, and GLM. The Formula API section, Model Selection code, and Cross-Validation examples are all concrete and executable with proper imports and realistic usage patterns. | 3 / 3 |
| Workflow Clarity | The four common workflows (Linear Regression, Binary Classification, Count Data, Time Series) list clear sequential steps and include implicit validation checkpoints (e.g., 'check residual diagnostics', 'test for stationarity'). However, they lack explicit validation gates (no 'only proceed when X passes' language) and feedback loops for error recovery, which matters in statistical modeling, where assumption violations require model changes. | 2 / 3 |
| Progressive Disclosure | References to five detailed reference files are well-organized and clearly signaled at the bottom. However, the SKILL.md itself contains enormous amounts of inline content that should be in those reference files — the 'Core Statistical Modeling Capabilities' section alone is a massive catalog that duplicates what the references presumably cover. The overview should be much leaner, with more content pushed to references. | 2 / 3 |
| Total | | 8 / 12 — Passed |
Validation
81% — Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (613 lines); consider splitting into references/ and linking | Warning |
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 9 / 11 Passed |
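One way the `metadata_version` warning might be resolved is a frontmatter entry along these lines. The key layout below is an assumption about the skill schema the validator checks, not something taken from the report:

```yaml
---
name: statsmodels
description: Statistical models library for Python. ...
metadata:
  version: 1.0.0   # assumed location of the 'metadata.version' key
---
```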