# CmdStanPy

CmdStanPy is a lightweight, pure-Python interface to CmdStan. It gives Python users access to the Stan compiler and to the inference algorithms CmdStan provides: MCMC sampling, variational inference, and optimization. With few dependencies of its own, it acts as a thin bridge between Python data science workflows and the Stan probabilistic programming language.

## Package Information

- **Package Name**: cmdstanpy
- **Language**: Python
- **Installation**: `pip install cmdstanpy`
- **License**: BSD-3-Clause
- **Documentation**: https://mc-stan.org/cmdstanpy

## Core Imports

```python
import cmdstanpy
```

Common pattern for model-based inference:

```python
from cmdstanpy import CmdStanModel
```

For working with existing fits from CSV files:

```python
from cmdstanpy import from_csv
```

## Basic Usage

```python
import cmdstanpy as csp
from cmdstanpy import CmdStanModel
import numpy as np

# Set up CmdStan (one-time setup)
csp.install_cmdstan()  # Download and install CmdStan
# If CmdStan is already installed, point CmdStanPy at it instead:
# csp.set_cmdstan_path("/path/to/cmdstan")

# Create a simple Stan model
stan_code = """
data {
  int<lower=0> N;
  vector[N] x;
  vector[N] y;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  y ~ normal(alpha + beta * x, sigma);
}
"""

# Write Stan code to file and compile
with open("linear_regression.stan", "w") as f:
    f.write(stan_code)

model = CmdStanModel(stan_file="linear_regression.stan")

# Generate some sample data
N = 100
x = np.random.normal(0, 1, N)
y = 2 + 3 * x + np.random.normal(0, 0.5, N)

data = {"N": N, "x": x, "y": y}

# Run MCMC sampling
fit = model.sample(data=data, chains=4, iter_sampling=1000, iter_warmup=1000)

# Access results
print(fit.summary())
print("Alpha mean:", fit.stan_variable("alpha").mean())
print("Beta mean:", fit.stan_variable("beta").mean())

# Save results
fit.save_csvfiles(dir="./results")
```

## Architecture

CmdStanPy follows a clean separation between model compilation, inference execution, and results handling:

- **CmdStanModel**: Encapsulates Stan program compilation and provides methods for different inference algorithms
- **Fit Objects**: Specialized containers (CmdStanMCMC, CmdStanMLE, etc.) that provide access to inference results through multiple data formats
- **Utilities**: Installation, configuration, and data formatting functions that handle the interface to CmdStan
- **CSV Interoperability**: All results can be saved/loaded as Stan-format CSV files for reproducibility and external analysis

This design enables both interactive analysis and production workflows while maintaining compatibility with the broader Stan ecosystem.
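
The pairing between inference methods and fit objects can be sketched as a simple lookup. This is an illustration only: `FIT_CLASS_FOR_METHOD` and `result_class_name` are hypothetical names, not part of CmdStanPy. The method and class names themselves are the ones listed in this document.

```python
# Hypothetical lookup table: the result container associated with each
# CmdStanModel inference method. Not part of the CmdStanPy API.
FIT_CLASS_FOR_METHOD = {
    "sample": "CmdStanMCMC",
    "optimize": "CmdStanMLE",
    "variational": "CmdStanVB",
    "pathfinder": "CmdStanPathfinder",
    "laplace_sample": "CmdStanLaplace",
    "generate_quantities": "CmdStanGQ",
}


def result_class_name(method: str) -> str:
    """Return the name of the fit class produced by an inference method."""
    try:
        return FIT_CLASS_FOR_METHOD[method]
    except KeyError:
        raise ValueError(f"unknown inference method: {method!r}") from None


if __name__ == "__main__":
    for name in FIT_CLASS_FOR_METHOD:
        print(f"CmdStanModel.{name}() -> {result_class_name(name)}")
```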

## Capabilities

### Installation and Setup

Functions for installing, configuring, and managing the CmdStan installation that CmdStanPy depends on.

```python { .api }
def install_cmdstan(version=None, dir=None, overwrite=False, compiler=False, progress=False, verbose=False, cores=1, interactive=False): ...
def set_cmdstan_path(path): ...
def cmdstan_path(): ...
def cmdstan_version(): ...
def set_make_env(make_env): ...
def rebuild_cmdstan(verbose=False, progress=True, cores=1): ...
```

[Installation and Setup](./installation-setup.md)
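
A common setup pattern is "use the existing installation if there is one, otherwise install". The sketch below shows that logic with the two library calls passed in as arguments, so it can be exercised without CmdStanPy. `ensure_cmdstan` is a hypothetical helper, and it assumes the locator raises `ValueError` when no installation is found.

```python
from typing import Callable


def ensure_cmdstan(locate: Callable[[], str], install: Callable[[], bool]) -> str:
    """Return the CmdStan path, installing CmdStan first if none is found.

    `locate` and `install` are injected; in real use they would be
    `cmdstanpy.cmdstan_path` and `cmdstanpy.install_cmdstan`.
    """
    try:
        return locate()
    except ValueError:
        # Assumption: the locator signals "no installation" with ValueError.
        if not install():
            raise RuntimeError("CmdStan installation failed")
        return locate()
```

With CmdStanPy installed, the call would look like `ensure_cmdstan(cmdstanpy.cmdstan_path, cmdstanpy.install_cmdstan)`.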

### Model Compilation

Functions for compiling and formatting Stan programs, with support for custom compiler options and automatic dependency management.

```python { .api }
def compile_stan_file(src, force=False, stanc_options=None, cpp_options=None, user_header=None): ...
def format_stan_file(stan_file, overwrite_file=False, canonicalize=False, max_line_length=78, backup=True, stanc_options=None): ...
```

[Model Compilation](./model-compilation.md)
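
As a rough sketch of how the compilation function might be used on its own, the example below writes a Stan program to disk and then compiles it. `write_stan_program` and `compile_program` are hypothetical helper names; only `compile_stan_file` comes from the listing above, and it is assumed to return the path of the compiled executable.

```python
from pathlib import Path


def write_stan_program(code: str, directory: str, name: str) -> Path:
    """Write Stan source to `<directory>/<name>.stan` and return the path."""
    target = Path(directory) / f"{name}.stan"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(code, encoding="utf-8")
    return target


def compile_program(stan_path: Path) -> str:
    """Compile a Stan file and return the path of the executable."""
    # Imported here so the file-writing helper works without CmdStanPy.
    from cmdstanpy import compile_stan_file

    # force=False skips recompilation when the executable is up to date.
    return compile_stan_file(str(stan_path), force=False)
```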

### Model Interface

The core CmdStanModel class that handles Stan program compilation and provides methods for all inference algorithms.

```python { .api }
class CmdStanModel:
    def __init__(self, model_name=None, stan_file=None, exe_file=None, force_compile=False, stanc_options=None, cpp_options=None, user_header=None, compile=None): ...
    def sample(self, data=None, chains=4, parallel_chains=None, threads_per_chain=None, seed=None, inits=None, iter_warmup=1000, iter_sampling=1000, save_warmup=False, thin=1, max_treedepth=10, metric=None, step_size=None, adapt_engaged=True, adapt_delta=0.8, adapt_init_phase=15, adapt_metric_window=25, adapt_step_size=50, fixed_param=False, output_dir=None, sig_figs=None, validate_csv=True, show_console=False, refresh=None, time_fmt=None, timeout=None): ...
    def optimize(self, data=None, seed=None, inits=None, algorithm=None, iter=2000, jacobian=False, output_dir=None, sig_figs=None, show_console=False, refresh=None, time_fmt=None, timeout=None): ...
    def variational(self, data=None, seed=None, inits=None, algorithm=None, iter=10000, grad_samples=1, elbo_samples=100, eta=1.0, adapt_engaged=True, adapt_iter=50, tol_rel_obj=0.01, eval_elbo=100, draws=1000, output_dir=None, sig_figs=None, show_console=False, refresh=None, time_fmt=None, timeout=None): ...
    def pathfinder(self, data=None, seed=None, inits=None, num_paths=4, draws=1000, psis_resample=True, calculate_lp=True, max_lbfgs_iters=1000, num_draws=None, save_single_paths=False, output_dir=None, sig_figs=None, show_console=False, refresh=None, time_fmt=None, timeout=None): ...
    def laplace_sample(self, data=None, mode=None, draws=1000, jacobian=True, refresh=100, output_dir=None, sig_figs=None, show_console=False, time_fmt=None, timeout=None): ...
    def generate_quantities(self, data=None, previous_fit=None, seed=None, parallel_chains=None, output_dir=None, sig_figs=None, show_console=False, refresh=None, time_fmt=None, timeout=None): ...
    def log_prob(self, params, data=None, jacobian=True, sig_figs=None): ...
```

[Model Interface](./model-interface.md)
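
Because every inference method hangs off the same model object, one compiled model can be reused across algorithms. The sketch below runs a quick optimization followed by full MCMC. `fit_point_then_posterior` is a hypothetical helper, and `model` is only assumed to behave like a `CmdStanModel` with the `optimize` and `sample` methods listed above.

```python
from typing import Any, Mapping


def fit_point_then_posterior(model: Any, data: Mapping[str, Any], seed: int = 1234):
    """Run optimization, then full MCMC, on the same model and data.

    `model` is expected to behave like a CmdStanModel; only its
    `optimize` and `sample` methods are used.
    """
    # Fast point estimate first; it can expose data problems cheaply.
    mle = model.optimize(data=data, seed=seed)
    # Full posterior: 4 chains, 1000 warmup and 1000 sampling iterations each.
    mcmc = model.sample(
        data=data, chains=4, seed=seed, iter_warmup=1000, iter_sampling=1000
    )
    return mle, mcmc
```

Passing the same `seed` to both calls keeps the run reproducible; it does not make the two algorithms share random draws.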

### MCMC Sampling Results

Container for Markov Chain Monte Carlo sampling results with comprehensive diagnostics and multiple data access formats.

```python { .api }
class CmdStanMCMC:
    def draws(self, inc_warmup=False, concat_chains=False): ...
    def draws_pd(self, vars=None, inc_warmup=False): ...
    def draws_xr(self, vars=None, inc_warmup=False): ...
    def stan_variable(self, var, inc_warmup=False): ...
    def stan_variables(self): ...
    def method_variables(self): ...
    def summary(self, percentiles=None, sig_figs=None): ...
    def diagnose(self): ...
    def save_csvfiles(self, dir=None): ...
```

[MCMC Sampling Results](./mcmc-results.md)
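
Once the draws of a scalar parameter are extracted, posterior summaries are ordinary statistics over that sequence. The sketch below computes a mean and a 90% interval using only the standard library. `percentile` and `summarize_draws` are hypothetical helpers for illustration, not a reimplementation of `summary()`.

```python
from statistics import fmean
from typing import Iterable, Sequence, Tuple


def percentile(sorted_values: Sequence[float], q: float) -> float:
    """Linearly interpolated percentile (q in [0, 100]) of sorted data."""
    if not sorted_values:
        raise ValueError("no draws supplied")
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def summarize_draws(draws: Iterable[float]) -> Tuple[float, float, float]:
    """Return (mean, 5th percentile, 95th percentile) of scalar draws."""
    ordered = sorted(float(d) for d in draws)
    return fmean(ordered), percentile(ordered, 5), percentile(ordered, 95)
```

With a fit in hand this could be called as `summarize_draws(fit.stan_variable("alpha"))`, assuming `alpha` is a scalar parameter. `fit.summary()` additionally reports convergence diagnostics, which this sketch does not attempt.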

### Optimization Results

Container for maximum likelihood and maximum a posteriori estimation results.

```python { .api }
class CmdStanMLE:
    @property
    def optimized_params_np(self): ...
    @property
    def optimized_params_pd(self): ...
    @property
    def optimized_params_dict(self): ...
    @property
    def optimized_iterations_np(self): ...
    @property
    def optimized_iterations_pd(self): ...
    def stan_variable(self, var): ...
    def stan_variables(self): ...
    def save_csvfiles(self, dir=None): ...
```

[Optimization Results](./optimization-results.md)
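
A point estimate is often consumed as a plain dictionary. The sketch below pulls selected parameters out of an optimization result. `point_estimates` is a hypothetical helper, and it assumes `optimized_params_dict` is read as an attribute (a property, not a method call) that maps output column names to values.

```python
from typing import Any, Dict, Iterable


def point_estimates(mle_fit: Any, names: Iterable[str]) -> Dict[str, float]:
    """Pick selected entries out of an optimization result."""
    # Assumption: `optimized_params_dict` is a property (attribute access)
    # mapping column names to estimated values.
    params = mle_fit.optimized_params_dict
    wanted = list(names)
    missing = [n for n in wanted if n not in params]
    if missing:
        raise KeyError(f"not in optimization output: {missing}")
    return {n: float(params[n]) for n in wanted}
```

For the regression model in Basic Usage, a call might look like `point_estimates(model.optimize(data=data), ["alpha", "beta", "sigma"])`.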

### Variational Inference Results

Container for Automatic Differentiation Variational Inference (ADVI) results and approximate posterior samples.

```python { .api }
class CmdStanVB:
    @property
    def variational_params_np(self): ...
    @property
    def variational_params_pd(self): ...
    @property
    def variational_params_dict(self): ...
    @property
    def variational_sample(self): ...
    @property
    def variational_sample_pd(self): ...
    def stan_variable(self, var): ...
    def stan_variables(self): ...
    def save_csvfiles(self, dir=None): ...
```

[Variational Inference Results](./variational-results.md)

### Pathfinder and Laplace Results

Containers for results from the Pathfinder variational algorithm and from the Laplace approximation.

```python { .api }
class CmdStanPathfinder:
    def draws(self): ...
    def stan_variable(self, var): ...
    def stan_variables(self): ...
    def method_variables(self): ...
    def create_inits(self, seed=None, chains=4): ...
    def save_csvfiles(self, dir=None): ...

class CmdStanLaplace:
    def draws(self): ...
    def draws_pd(self, vars=None): ...
    def draws_xr(self, vars=None): ...
    def stan_variable(self, var): ...
    def stan_variables(self): ...
    def method_variables(self): ...
    def save_csvfiles(self, dir=None): ...
```

[Advanced Variational Methods](./advanced-variational.md)
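
One use of `create_inits` is to start MCMC chains from Pathfinder draws rather than from random values. The sketch below shows that hand-off. `sample_with_pathfinder_inits` is a hypothetical helper, `model` is assumed to behave like a `CmdStanModel`, and `create_inits` is assumed to return one set of initial values per chain in a form that `sample(inits=...)` accepts.

```python
from typing import Any, Mapping


def sample_with_pathfinder_inits(model: Any, data: Mapping[str, Any], chains: int = 4):
    """Run Pathfinder, then start each MCMC chain from one of its draws."""
    pathfinder_fit = model.pathfinder(data=data)
    # Assumption: create_inits yields one set of initial values per chain.
    inits = pathfinder_fit.create_inits(chains=chains)
    return model.sample(data=data, chains=chains, inits=inits)
```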

### Generated Quantities

Container for generated quantities computed from existing fit results, enabling post-processing and derived quantity calculation.

```python { .api }
class CmdStanGQ:
    def draws(self, concat_chains=True): ...
    def draws_pd(self, vars=None): ...
    def draws_xr(self, vars=None): ...
    def stan_variable(self, var): ...
    def stan_variables(self): ...
    def save_csvfiles(self, dir=None): ...
```

[Generated Quantities](./generated-quantities.md)
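
A typical post-processing step is to run a model's generated quantities block over the draws of an earlier fit and read back one derived variable. The sketch below chains the two calls. `generated_variable` is a hypothetical helper, and it assumes the Stan program passed as `model` defines the requested variable in its generated quantities block.

```python
from typing import Any, Mapping


def generated_variable(
    model: Any, data: Mapping[str, Any], previous_fit: Any, name: str
):
    """Run generated quantities over existing draws and return one variable."""
    # `previous_fit` supplies the parameter draws; nothing is re-sampled here.
    gq = model.generate_quantities(data=data, previous_fit=previous_fit)
    return gq.stan_variable(name)
```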

### Data and I/O Utilities

Functions for data formatting, CSV file handling, and interoperability with the Stan ecosystem.

```python { .api }
def write_stan_json(path, data): ...
def from_csv(path=None, method=None): ...
def show_versions(output=True): ...
```

[Data and I/O Utilities](./data-io-utilities.md)
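
To make the data-formatting step concrete, the sketch below writes a data dictionary as JSON with one top-level key per Stan variable, converting array-like values to lists first. It is a standard-library stand-in for illustration, not the library's implementation; `to_jsonable` and `dump_stan_data` are hypothetical names, and in practice `write_stan_json` is the function to use.

```python
import json
from typing import Any, Mapping


def to_jsonable(value: Any) -> Any:
    """Recursively convert array-like values into plain Python lists."""
    if hasattr(value, "tolist"):  # NumPy arrays and scalars
        return value.tolist()
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    return value


def dump_stan_data(path: str, data: Mapping[str, Any]) -> None:
    """Write a data dictionary as JSON, one top-level key per Stan variable."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump({k: to_jsonable(v) for k, v in data.items()}, fh)
```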