Bayesian modeling with PyMC. Build hierarchical models, MCMC (NUTS), variational inference, LOO/WAIC comparison, posterior checks, for probabilistic programming and inference.
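To ground the MCMC workflow the description names, here is a minimal from-scratch sketch using only the standard library: a random-walk Metropolis sampler for the mean of Normal data with a Normal prior. The data, priors, and step size are invented for illustration; PyMC's NUTS sampler automates and vastly improves on the hand-written proposal step below.

```python
import math
import random

# Toy data: draws from a Normal with unknown mean and known sd = 1.
data = [1.2, 0.8, 1.5, 0.9, 1.1, 1.3, 0.7, 1.0]

def log_posterior(mu):
    # Normal(0, 10) prior on mu plus Normal(mu, 1) likelihood (up to a constant).
    log_prior = -0.5 * (mu / 10.0) ** 2
    log_lik = sum(-0.5 * (x - mu) ** 2 for x in data)
    return log_prior + log_lik

def metropolis(n_draws=5000, step=0.5, seed=42):
    rng = random.Random(seed)
    mu, draws = 0.0, []
    for _ in range(n_draws):
        proposal = mu + rng.gauss(0.0, step)
        # Accept with probability min(1, posterior ratio).
        if math.log(rng.random()) < log_posterior(proposal) - log_posterior(mu):
            mu = proposal
        draws.append(mu)
    return draws

draws = metropolis()
posterior_mean = sum(draws[1000:]) / len(draws[1000:])  # discard burn-in
```

With a conjugate Normal prior the posterior mean lands close to the sample mean of the data; in PyMC the whole loop collapses to defining the model and calling `pm.sample()`.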
Overall score: 85

| Metric | Value | Notes |
|---|---|---|
| Quality | 83% | Does it follow best practices? |
| Impact | 88% | 1.49x average score across 3 eval scenarios |
| Status | Passed | No known issues |
Quality
Discovery
Score: 82%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description with excellent specificity and domain-appropriate trigger terms that clearly define a Bayesian probabilistic programming niche. The main weakness is the absence of an explicit 'Use when...' clause, which would help Claude know exactly when to select this skill. The technical terminology is well-chosen and would naturally match user queries in this domain.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when the user asks about Bayesian analysis, probabilistic modeling, PyMC, MCMC sampling, or statistical inference with uncertainty quantification.'
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: 'Build hierarchical models', 'MCMC (NUTS)', 'variational inference', 'LOO/WAIC comparison', 'posterior checks'. These are concrete, well-defined capabilities in the Bayesian modeling domain. | 3 / 3 |
| Completeness | The 'what' is well covered with specific capabilities, but there is no explicit 'Use when...' clause or equivalent trigger guidance. The description only implies when it should be used through the listed capabilities. Per the rubric, a missing 'Use when...' clause caps completeness at 2. | 2 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'Bayesian modeling', 'PyMC', 'hierarchical models', 'MCMC', 'NUTS', 'variational inference', 'LOO', 'WAIC', 'posterior checks', 'probabilistic programming', 'inference'. These cover the key terms a user working in this domain would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive, with a clear niche: Bayesian modeling specifically with PyMC. Technical terms such as 'NUTS', 'LOO/WAIC', 'hierarchical models', and 'PyMC' are very specific and unlikely to conflict with other skills. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
Implementation
Score: 85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, highly actionable skill for Bayesian modeling with PyMC. Its greatest strengths are the clear 8-step workflow with explicit validation checkpoints and the abundance of executable code examples. The main weakness is verbosity: some sections (the distribution guide and common model patterns) could be trimmed or moved to reference files to reduce token consumption, and a few explanatory sentences state things Claude already knows.
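The checkpoint logic praised above can be illustrated without PyMC. A hedged numpy sketch of a prior predictive check, run before any fitting: draw parameters from the priors, simulate data, and ask whether the observed data would be unsurprising under those simulations. The priors and data here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
observed = np.array([4.1, 5.0, 3.8, 4.6, 5.2])  # hypothetical data

# Draw parameters from the (assumed) priors, then simulate data from them.
mu_prior = rng.normal(0.0, 10.0, size=2000)       # assumed prior: Normal(0, 10)
sigma_prior = np.abs(rng.normal(0.0, 5.0, 2000))  # assumed prior: HalfNormal(5)
sim = rng.normal(mu_prior, sigma_prior)           # one simulated point per draw

# Checkpoint: observed data should fall inside the bulk of the prior predictive.
lo, hi = np.quantile(sim, [0.005, 0.995])
ok = lo < observed.min() and observed.max() < hi
```

In PyMC itself this step corresponds to calling `pm.sample_prior_predictive()` before `pm.sample()`.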
Suggestions
Move the Distribution Selection Guide and Common Model Patterns sections to reference files (e.g., references/distributions.md and references/model_patterns.md) and replace with brief pointers, reducing the main skill by ~100 lines.
Remove mild explanatory text that Claude already knows, such as 'PyMC is a Python library for Bayesian modeling and probabilistic programming' and descriptions of what LOO/WAIC are, to improve token efficiency.
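For reference, the quantity behind one of those acronyms is compact. A hedged numpy sketch of WAIC from a matrix of pointwise log-likelihoods (posterior draws by observations), following the usual lppd-minus-penalty definition; the input here is synthetic:

```python
import numpy as np

def waic(log_lik):
    """WAIC from an (n_draws, n_obs) array of pointwise log-likelihoods."""
    # lppd: log of the posterior-averaged likelihood, summed over observations.
    # (Real implementations use a log-sum-exp for numerical stability.)
    lppd = np.sum(np.log(np.mean(np.exp(log_lik), axis=0)))
    # p_waic: posterior variance of the log-likelihood, summed over observations.
    p_waic = np.sum(np.var(log_lik, axis=0, ddof=1))
    return -2.0 * (lppd - p_waic)  # deviance scale; lower is better

rng = np.random.default_rng(0)
fake_log_lik = rng.normal(-1.0, 0.1, size=(400, 50))  # synthetic stand-in
w = waic(fake_log_lik)
```

In practice ArviZ computes this (`az.waic`), and the more robust PSIS-LOO (`az.loo`) is generally preferred for model comparison.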
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is quite long (~400 lines) and includes some content Claude would already know (e.g., basic distribution descriptions, what LOO/WAIC are). The distribution selection guide and common model patterns sections, while useful, could be more concise or offloaded to reference files. However, most content is practical and not padded with unnecessary explanations. | 2 / 3 |
| Actionability | The skill provides fully executable, copy-paste-ready code examples throughout, from model building to sampling, diagnostics, predictions, and model comparison. Specific parameter values and concrete function calls are given rather than vague descriptions. | 3 / 3 |
| Workflow Clarity | The 8-step standard Bayesian workflow is clearly sequenced with explicit validation checkpoints (prior predictive check before fitting, diagnostics check before interpretation, posterior predictive check for validation). Each step includes specific checks and remediation steps for common failures (divergences, low ESS, high R-hat). | 3 / 3 |
| Progressive Disclosure | The skill provides a clear overview with well-signaled, one-level-deep references to separate files: references/ (distributions.md, sampling_inference.md, workflows.md), scripts/ (model_diagnostics.py, model_comparison.py), and assets/ (templates). Each reference includes a brief description of when to use it. | 3 / 3 |
| Total | | 11 / 12 (Passed) |
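The convergence checks cited in the workflow (R-hat, ESS) reduce to comparisons of within- and between-chain variance. A hedged numpy sketch of split-R-hat, simplified from the rank-normalized version ArviZ uses; the chains below are synthetic:

```python
import numpy as np

def split_rhat(chains):
    """Split-R-hat for an (n_chains, n_draws) array of one parameter's draws."""
    n_chains, n_draws = chains.shape
    half = n_draws // 2
    # Split each chain in two so within-chain trends also inflate R-hat.
    split = np.concatenate([chains[:, :half], chains[:, half:2 * half]], axis=0)
    m, n = split.shape
    chain_means = split.mean(axis=1)
    between = n * chain_means.var(ddof=1)          # between-chain variance
    within = split.var(axis=1, ddof=1).mean()      # mean within-chain variance
    var_hat = (n - 1) / n * within + between / n   # pooled variance estimate
    return np.sqrt(var_hat / within)

rng = np.random.default_rng(1)
good = rng.normal(size=(4, 1000))           # well-mixed chains: R-hat near 1
bad = good + np.arange(4)[:, None] * 3.0    # chains stuck at different levels
```

Well-mixed chains give values near 1 (the common rule of thumb flags R-hat above 1.01), while the offset chains produce a value far above it. In PyMC workflows this is read off `az.summary` or `az.rhat`.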
Validation
Score: 81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation checks: 9 / 11 passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (571 lines); consider splitting content into references/ and linking to it | Warning |
| metadata_version | 'metadata.version' is missing | Warning |
| Total | | 9 / 11 Passed |
Version: b58ad7e