Bayesian modeling with PyMC: build hierarchical models, run MCMC (NUTS) and variational inference, compare models with LOO/WAIC, and validate with posterior predictive checks for probabilistic programming and inference.
PyMC is a Python library for Bayesian modeling and probabilistic programming. Build, fit, validate, and compare Bayesian models using PyMC's modern API (version 5.x+), including hierarchical models, MCMC sampling (NUTS), variational inference, and model comparison (LOO, WAIC).
This skill should be used when building, fitting, validating, or comparing Bayesian models: hierarchical/multilevel models, Bayesian regression (linear, logistic, Poisson), time series models, or any analysis requiring MCMC or variational inference.
Follow this workflow for building and validating Bayesian models:
import pymc as pm
import arviz as az
import numpy as np
# Load and prepare data
X = ... # Predictors
y = ... # Outcomes
# Standardize predictors for better sampling
X_mean = X.mean(axis=0)
X_std = X.std(axis=0)
X_scaled = (X - X_mean) / X_std

Key practices:
- Standardize predictors for better sampling
- Use named coordinates (coords) for clarity

coords = {
'predictors': ['var1', 'var2', 'var3'],
'obs_id': np.arange(len(y))
}
with pm.Model(coords=coords) as model:
    # Register predictors as pm.Data so they can be swapped later with pm.set_data()
    X_data = pm.Data('X_scaled', X_scaled)
    # Priors
    alpha = pm.Normal('alpha', mu=0, sigma=1)
    beta = pm.Normal('beta', mu=0, sigma=1, dims='predictors')
    sigma = pm.HalfNormal('sigma', sigma=1)
    # Linear predictor
    mu = alpha + pm.math.dot(X_data, beta)
    # Likelihood
    y_obs = pm.Normal('y_obs', mu=mu, sigma=sigma, observed=y, dims='obs_id')

Key practices:
- Use HalfNormal or Exponential priors for scale parameters
- Use named dimensions (dims) instead of shape when possible
- Use pm.Data() for values that will be updated for predictions

Always validate priors before fitting:
with model:
    prior_pred = pm.sample_prior_predictive(draws=1000, random_seed=42)
# Visualize
az.plot_ppc(prior_pred, group='prior')

Check:
- Simulated outcomes fall in a plausible range for the domain
- Priors are not orders of magnitude off the observed scale
with model:
# Optional: Quick exploration with ADVI
# approx = pm.fit(n=20000)
# Full MCMC inference
idata = pm.sample(
draws=2000,
tune=1000,
chains=4,
target_accept=0.9,
random_seed=42,
idata_kwargs={'log_likelihood': True} # For model comparison
    )

Key parameters:
- draws=2000: number of posterior samples per chain
- tune=1000: warmup samples (discarded)
- chains=4: run 4 chains for convergence checking
- target_accept=0.9: raise for difficult posteriors (0.95-0.99)
- log_likelihood=True: needed later for model comparison
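If a model was already sampled without storing the log-likelihood, it does not have to be refit. A minimal sketch, assuming PyMC 5.x, where pm.compute_log_likelihood can extend the existing InferenceData:

with model:
    # Adds a log_likelihood group to idata in place (default behavior),
    # so the trace becomes usable for LOO/WAIC comparison
    pm.compute_log_likelihood(idata)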
Use the diagnostic script:
from scripts.model_diagnostics import check_diagnostics
results = check_diagnostics(idata, var_names=['alpha', 'beta', 'sigma'])

Check:
- R-hat < 1.01 for all parameters
- ESS > 400
- No divergences
If issues arise:
- Increase target_accept=0.95
- Use a non-centered parameterization (see the hierarchical model pattern below)

Validate model fit:
with model:
pm.sample_posterior_predictive(idata, extend_inferencedata=True, random_seed=42)
# Visualize
az.plot_ppc(idata)

Check:
- Posterior predictive draws should cover the observed data
- Watch for systematic misfit (tails, skew, or zero counts the model cannot reproduce)
# Summary statistics
print(az.summary(idata, var_names=['alpha', 'beta', 'sigma']))
# Posterior distributions
az.plot_posterior(idata, var_names=['alpha', 'beta', 'sigma'])
# Coefficient estimates
az.plot_forest(idata, var_names=['beta'], combined=True)

To predict on new data, scale it with the training statistics:
X_new = ...  # New predictor values
X_new_scaled = (X_new - X_mean) / X_std
with model:
    pm.set_data({'X_scaled': X_new_scaled})
    # If X_new has a different number of rows, also update the coordinate:
    # pm.set_data({'X_scaled': X_new_scaled}, coords={'obs_id': np.arange(len(X_new))})
    post_pred = pm.sample_posterior_predictive(
        idata,
        var_names=['y_obs'],
        random_seed=42
    )
# Extract prediction intervals
y_pred_mean = post_pred.posterior_predictive['y_obs'].mean(dim=['chain', 'draw'])
y_pred_hdi = az.hdi(post_pred.posterior_predictive, var_names=['y_obs'])

For continuous outcomes with linear relationships:
with pm.Model() as linear_model:
alpha = pm.Normal('alpha', mu=0, sigma=10)
beta = pm.Normal('beta', mu=0, sigma=10, shape=n_predictors)
sigma = pm.HalfNormal('sigma', sigma=1)
mu = alpha + pm.math.dot(X, beta)
    y = pm.Normal('y', mu=mu, sigma=sigma, observed=y_obs)

Use template: assets/linear_regression_template.py
For binary outcomes:
with pm.Model() as logistic_model:
alpha = pm.Normal('alpha', mu=0, sigma=10)
beta = pm.Normal('beta', mu=0, sigma=10, shape=n_predictors)
logit_p = alpha + pm.math.dot(X, beta)
    y = pm.Bernoulli('y', logit_p=logit_p, observed=y_obs)
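Where results are reported on the probability scale, it can help to track the inverse-logit as a named quantity. A minimal sketch, reusing X, y_obs, and n_predictors from above; the variable name p is illustrative:

with pm.Model() as logistic_model:
    alpha = pm.Normal('alpha', mu=0, sigma=10)
    beta = pm.Normal('beta', mu=0, sigma=10, shape=n_predictors)
    logit_p = alpha + pm.math.dot(X, beta)
    # Deterministic nodes are stored in the trace, so fitted probabilities
    # can be summarized directly after sampling
    p = pm.Deterministic('p', pm.math.sigmoid(logit_p))
    y = pm.Bernoulli('y', p=p, observed=y_obs)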
For grouped data (use non-centered parameterization):
with pm.Model(coords={'groups': group_names}) as hierarchical_model:
# Hyperpriors
mu_alpha = pm.Normal('mu_alpha', mu=0, sigma=10)
sigma_alpha = pm.HalfNormal('sigma_alpha', sigma=1)
# Group-level (non-centered)
alpha_offset = pm.Normal('alpha_offset', mu=0, sigma=1, dims='groups')
alpha = pm.Deterministic('alpha', mu_alpha + sigma_alpha * alpha_offset, dims='groups')
# Observation-level
mu = alpha[group_idx]
sigma = pm.HalfNormal('sigma', sigma=1)
    y = pm.Normal('y', mu=mu, sigma=sigma, observed=y_obs)

Use template: assets/hierarchical_model_template.py
Critical: Always use non-centered parameterization for hierarchical models to avoid divergences.
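For contrast, a centered version of the same model is sketched below (reusing group_names, group_idx, and y_obs from the template above). It is mathematically equivalent, but the direct coupling between sigma_alpha and alpha creates the funnel geometry that produces divergences:

import pymc as pm

with pm.Model(coords={'groups': group_names}) as centered_model:
    mu_alpha = pm.Normal('mu_alpha', mu=0, sigma=10)
    sigma_alpha = pm.HalfNormal('sigma_alpha', sigma=1)
    # Centered: group intercepts drawn directly from the population distribution
    alpha = pm.Normal('alpha', mu=mu_alpha, sigma=sigma_alpha, dims='groups')
    sigma = pm.HalfNormal('sigma', sigma=1)
    y = pm.Normal('y', mu=alpha[group_idx], sigma=sigma, observed=y_obs)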
For count data:
with pm.Model() as poisson_model:
alpha = pm.Normal('alpha', mu=0, sigma=10)
beta = pm.Normal('beta', mu=0, sigma=10, shape=n_predictors)
log_lambda = alpha + pm.math.dot(X, beta)
    y = pm.Poisson('y', mu=pm.math.exp(log_lambda), observed=y_obs)

For overdispersed counts, use NegativeBinomial instead.
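A minimal sketch of that swap, reusing X, y_obs, and n_predictors from above; the dispersion prior and the name phi are illustrative choices:

with pm.Model() as negbin_model:
    intercept = pm.Normal('intercept', mu=0, sigma=10)
    beta = pm.Normal('beta', mu=0, sigma=10, shape=n_predictors)
    log_mu = intercept + pm.math.dot(X, beta)
    # Dispersion: as phi grows large, NegativeBinomial approaches Poisson
    phi = pm.HalfNormal('phi', sigma=2)
    y = pm.NegativeBinomial('y', mu=pm.math.exp(log_mu), alpha=phi, observed=y_obs)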
For autoregressive processes:
with pm.Model() as ar_model:
sigma = pm.HalfNormal('sigma', sigma=1)
rho = pm.Normal('rho', mu=0, sigma=0.5, shape=ar_order)
init_dist = pm.Normal.dist(mu=0, sigma=sigma)
    y = pm.AR('y', rho=rho, sigma=sigma, init_dist=init_dist, observed=y_obs)

Use LOO or WAIC for model comparison:
from scripts.model_comparison import compare_models, check_loo_reliability
# Fit models with log_likelihood
models = {
'Model1': idata1,
'Model2': idata2,
'Model3': idata3
}
# Compare using LOO
comparison = compare_models(models, ic='loo')
# Check reliability
check_loo_reliability(models)

Interpretation:
- Models are ranked by expected log predictive density (ELPD); higher is better
- Compare elpd_diff to its standard error (dse): differences within a few standard errors are not decisive

Check Pareto-k values: k < 0.7 indicates the LOO approximation is reliable for that observation; larger values mean the estimate may be unstable.
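These diagnostics can also be inspected directly with ArviZ, independent of the helper script; a minimal sketch, assuming idata was sampled with log_likelihood=True:

import arviz as az

loo = az.loo(idata, pointwise=True)
n_bad = int((loo.pareto_k > 0.7).sum())
print(loo)  # summary includes the Pareto-k counts
print(f"{n_bad} observation(s) with Pareto-k > 0.7")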
When models are similar, average predictions:
from scripts.model_comparison import model_averaging
averaged_pred, weights = model_averaging(models, var_name='y_obs')
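The weights typically come from stacking; a minimal sketch of computing them directly with az.compare, assuming both InferenceData objects contain pointwise log-likelihoods:

import arviz as az

cmp = az.compare({'Model1': idata1, 'Model2': idata2}, ic='loo', method='stacking')
# The 'weight' column holds the stacking weights (they sum to 1)
print(cmp[['elpd_loo', 'weight']])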
Choosing priors:

Scale parameters (σ, τ):
- pm.HalfNormal('sigma', sigma=1) - Default choice
- pm.Exponential('sigma', lam=1) - Alternative
- pm.Gamma('sigma', alpha=2, beta=1) - More informative

Unbounded parameters:
- pm.Normal('theta', mu=0, sigma=1) - For standardized data
- pm.StudentT('theta', nu=3, mu=0, sigma=1) - Robust to outliers

Positive parameters:
- pm.LogNormal('theta', mu=0, sigma=1)
- pm.Gamma('theta', alpha=2, beta=1)

Probabilities:
- pm.Beta('p', alpha=2, beta=2) - Weakly informative
- pm.Uniform('p', lower=0, upper=1) - Non-informative (use sparingly)

Correlation matrices:
- pm.LKJCorr('corr', n=n_vars, eta=2) - eta=1 is uniform; eta>1 prefers the identity

Choosing likelihoods:

Continuous outcomes:
- pm.Normal('y', mu=mu, sigma=sigma) - Default for continuous data
- pm.StudentT('y', nu=nu, mu=mu, sigma=sigma) - Robust to outliers

Count data:
- pm.Poisson('y', mu=lam) - Equidispersed counts
- pm.NegativeBinomial('y', mu=mu, alpha=alpha) - Overdispersed counts
- pm.ZeroInflatedPoisson('y', psi=psi, mu=mu) - Excess zeros

Binary outcomes:
- pm.Bernoulli('y', p=p) or pm.Bernoulli('y', logit_p=logit_p)

Categorical outcomes:
- pm.Categorical('y', p=probs)

See: references/distributions.md for comprehensive distribution reference
Default and recommended for most models:
idata = pm.sample(
draws=2000,
tune=1000,
chains=4,
target_accept=0.9,
random_seed=42
)

Adjust when needed:
- Divergences: raise target_accept=0.95 or higher
- Discrete variables: assign pm.Metropolis() to them (see the sketch below)
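A minimal sketch of mixing samplers when a model contains a genuinely discrete parameter; the model and its discrete variable z are hypothetical:

with model:
    # NUTS only samples continuous parameters; give the discrete variable
    # to Metropolis and PyMC assigns NUTS to everything else automatically
    step = pm.Metropolis([z])
    idata = pm.sample(draws=2000, tune=1000, chains=4, step=step)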
Fast approximation for exploration or initialization:
with model:
    approx = pm.fit(n=20000, method='advi')
    # Use the ADVI mean-field fit to initialize MCMC
    start = approx.sample(return_inferencedata=False)[0]
    idata = pm.sample(initvals=start)

Trade-offs:
- ADVI is much faster than NUTS but only approximates the posterior
- Mean-field ADVI tends to underestimate posterior variance and ignores correlations between parameters
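Before trusting an ADVI fit, check that the optimization converged; a minimal sketch plotting the loss history that pm.fit tracks on the returned approximation:

import matplotlib.pyplot as plt

plt.plot(approx.hist)  # loss per iteration; should flatten near the end
plt.xlabel('Iteration')
plt.ylabel('Loss')
plt.show()  # if the curve is still falling, increase n in pm.fit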
See: references/sampling_inference.md for detailed sampling guide
from scripts.model_diagnostics import create_diagnostic_report
create_diagnostic_report(
idata,
var_names=['alpha', 'beta', 'sigma'],
output_dir='diagnostics/'
)Creates:
from scripts.model_diagnostics import check_diagnostics
results = check_diagnostics(idata)
Checks R-hat, ESS, divergences, and tree depth.
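The same checks can be reproduced with plain ArviZ calls; a minimal sketch using the thresholds quoted below:

import arviz as az

summary = az.summary(idata)  # includes r_hat, ess_bulk, ess_tail columns
print(summary[summary['r_hat'] > 1.01])    # flag convergence problems
print(summary[summary['ess_bulk'] < 400])  # flag low effective sample size
print(int(idata.sample_stats.diverging.sum()), 'divergences')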
Symptom: idata.sample_stats.diverging.sum() > 0
Solutions:
- Increase target_accept=0.95 or 0.99
- Switch to a non-centered parameterization

Symptom: ESS < 400
Solutions:
- Increase draws=5000

Symptom: R-hat > 1.01
Solutions:
- Increase tune=2000, draws=5000

Symptom: sampling is slow
Solutions:
- Parallelize with cores=8, chains=8
- Initialize with ADVI (see variational inference above)

Best practices:
- Use named dimensions (dims) for clarity
- Start from target_accept=0.9 as a baseline (higher if needed)
- Sample with log_likelihood=True when model comparison is planned

This skill includes:
References (references/):
distributions.md: Comprehensive catalog of PyMC distributions organized by category (continuous, discrete, multivariate, mixture, time series). Use when selecting priors or likelihoods.
sampling_inference.md: Detailed guide to sampling algorithms (NUTS, Metropolis, SMC), variational inference (ADVI, SVGD), and handling sampling issues. Use when encountering convergence problems or choosing inference methods.
workflows.md: Complete workflow examples and code patterns for common model types, data preparation, prior selection, and model validation. Use as a cookbook for standard Bayesian analyses.
Scripts (scripts/):
model_diagnostics.py: Automated diagnostic checking and report generation. Functions: check_diagnostics() for quick checks, create_diagnostic_report() for comprehensive analysis with plots.
model_comparison.py: Model comparison utilities using LOO/WAIC. Functions: compare_models(), check_loo_reliability(), model_averaging().
Assets (assets/):
linear_regression_template.py: Complete template for Bayesian linear regression with full workflow (data prep, prior checks, fitting, diagnostics, predictions).
hierarchical_model_template.py: Complete template for hierarchical/multilevel models with non-centered parameterization and group-level analysis.
Define a model:
with pm.Model(coords={'var': names}) as model:
    # Priors
    param = pm.Normal('param', mu=0, sigma=1, dims='var')
    # Likelihood
    y = pm.Normal('y', mu=..., sigma=..., observed=data)

Sample:
idata = pm.sample(draws=2000, tune=1000, chains=4, target_accept=0.9)

Check diagnostics:
from scripts.model_diagnostics import check_diagnostics
check_diagnostics(idata)

Compare models:
from scripts.model_comparison import compare_models
compare_models({'m1': idata1, 'm2': idata2}, ic='loo')

Predict on new data:
with model:
    pm.set_data({'X': X_new})
    pred = pm.sample_posterior_predictive(idata.posterior)

Other tips:
- pm.model_to_graphviz(model) to visualize model structure
- idata.to_netcdf('results.nc') to save results; az.from_netcdf('results.nc') to reload them