# Utilities and File I/O

Essential utilities for data processing, file operations, and integration with other tools. UltraNest provides support functions for nested sampling workflows, data manipulation, and compatibility with external analysis packages.

## Capabilities

### File Input/Output

Functions for reading nested sampling results written by a previous run.

```python { .api }
def read_file(
    log_dir: str,
    x_dim: int,
    num_bootstraps: int = 20,
    random: bool = True,
    verbose: bool = False,
    check_insertion_order: bool = True
):
    """
    Read UltraNest output files from a completed run.

    Parameters:
    - log_dir (str): Directory containing output files
    - x_dim (int): Dimensionality of parameter space
    - num_bootstraps (int): Number of bootstraps for estimating logZ
    - random (bool): Use randomization for volume estimation
    - verbose (bool): Show progress during reading
    - check_insertion_order (bool): Perform MWW insertion order test for convergence assessment

    Returns:
    tuple: (sequence, results)
    - sequence (dict): Per-iteration estimates such as 'logz', 'logzerr', 'logvol', 'samples_n', 'logwt', 'logl'
    - results (dict): Same structure as the return value of ReactiveNestedSampler.run(), including:
        - 'samples': Equally weighted posterior samples
        - 'weighted_samples': Weighted samples ('points', 'weights', 'logl', ...)
        - 'logz': Evidence estimate
        - 'logzerr': Evidence uncertainty
        - 'H': Information gain
        - 'posterior': Posterior statistics
    """
```

#### File Reading Usage

```python
from ultranest import read_file

# Read results from a previous run; read_file returns the per-iteration
# sequence and the final results dictionary
sequence, results = read_file(
    log_dir='logs/my_analysis/',
    x_dim=3,  # Three parameters
    num_bootstraps=20,
    verbose=True
)

print(f"Evidence: {results['logz']:.2f} ± {results['logzerr']:.2f}")
print(f"Information: {results['H']:.2f} nats")
```
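
Evidence estimates read back from two runs are often compared through the log Bayes factor. The sketch below is not part of UltraNest: `log_bayes_factor` is a hypothetical helper, the input numbers are made up for illustration, and it assumes the two evidence uncertainties are independent so that they can be added in quadrature.

```python
import math

def log_bayes_factor(logz_a, logzerr_a, logz_b, logzerr_b):
    """Return ln(Z_a / Z_b) and its uncertainty (hypothetical helper).

    Assumes the two evidence uncertainties are independent.
    """
    delta = logz_a - logz_b
    err = math.hypot(logzerr_a, logzerr_b)
    return delta, err

# Illustrative values only, standing in for the 'logz' and 'logzerr'
# entries of two results dictionaries
delta, err = log_bayes_factor(-12.0, 0.3, -15.0, 0.4)
print(f"ln B = {delta:.2f} ± {err:.2f}")
```

A positive value favours the first model; how large a difference counts as decisive depends on the conventions of the field.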

### Warm Starting

Initialize a new run from the posterior samples of a similar previous run to improve efficiency.

```python { .api }
def warmstart_from_similar_file(
    usample_filename: str,
    param_names: list,
    loglike: callable,
    transform: callable,
    vectorized: bool = False,
    min_num_samples: int = 50
):
    """
    Initialize sampling using a similar previous run as the starting point.

    Parameters:
    - usample_filename (str): Path to file containing weighted posterior samples
    - param_names (list): Names of parameters being sampled
    - loglike (callable): New log-likelihood function
    - transform (callable): New transform function
    - vectorized (bool): Whether functions accept multiple points
    - min_num_samples (int): Minimum number of samples required

    Returns:
    tuple: (aux_param_names, aux_loglike, aux_transform, vectorized)
        Components for auxiliary sampler initialization
    """
```

#### Warm Start Usage

```python
from ultranest import warmstart_from_similar_file, ReactiveNestedSampler

# param_names, new_loglike and new_transform are defined by the user
# for the new problem

# Warm start from previous run
aux_paramnames, aux_loglike, aux_transform, vectorized = warmstart_from_similar_file(
    'previous_run/chains/weighted_post_untransformed.txt',
    param_names,
    new_loglike,
    new_transform
)

# Create auxiliary sampler
aux_sampler = ReactiveNestedSampler(
    aux_paramnames,
    aux_loglike,
    transform=aux_transform,
    vectorized=vectorized
)

# Run auxiliary sampling for warm start
aux_results = aux_sampler.run()
```

### Function Vectorization

Utility for wrapping a single-point function so that it can be called on a batch of points, as required when a sampler is created with `vectorized=True`. The wrapper evaluates the function point by point, so it provides interface compatibility rather than a speed-up by itself.

```python { .api }
def vectorize(function: callable):
    """
    Wrap a user function so that it accepts multiple points at once.

    Parameters:
    - function (callable): Function to vectorize, should accept a single parameter array

    Returns:
    callable: Vectorized function that accepts multiple parameter arrays

    Usage:
    Original function: f(theta) -> scalar
    Vectorized function: f_vec(theta_array) -> array
    """
```

#### Vectorization Usage

```python
from ultranest import vectorize
import numpy as np

# Original likelihood function, defined for a single point
def loglike(theta):
    x, y, z = theta
    return -0.5 * (x**2 + y**2 + z**2)

# Wrap it so that it accepts a batch of points
vectorized_loglike = vectorize(loglike)

# Can now process multiple points in one call
theta_array = np.random.randn(100, 3)  # 100 parameter sets
loglike_values = vectorized_loglike(theta_array)  # 100 likelihood values
```
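
Conceptually, such a wrapper can be as small as a loop over the batch. The following pure-Python sketch illustrates the idea; the name `vectorize_sketch` is hypothetical, it returns a plain list, and it is not claimed to be UltraNest's implementation.

```python
def vectorize_sketch(function):
    """Wrap a single-point function so it maps over a sequence of points."""
    def vectorized(points):
        # Evaluate one point at a time and collect the results
        return [function(point) for point in points]
    vectorized.__name__ = function.__name__
    return vectorized

def loglike(theta):
    x, y, z = theta
    return -0.5 * (x**2 + y**2 + z**2)

batch_loglike = vectorize_sketch(loglike)
values = batch_loglike([(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0)])
print(values)
```

Because each point is still evaluated separately, a likelihood written directly with array operations will usually be faster than a wrapped single-point function.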

### Logging and Directory Management

Tools for organizing analysis runs and managing output files.

```python { .api }
def create_logger(
    module_name: str,
    log_dir: str = None,
    level=logging.INFO
):
    """
    Create logger for UltraNest analysis with appropriate formatting.

    Parameters:
    - module_name (str): Name of module for logger identification
    - log_dir (str, optional): Directory for log files
    - level: Logging level (DEBUG, INFO, WARNING, ERROR)

    Returns:
    logging.Logger: Configured logger instance
    """

def make_run_dir(
    log_dir: str,
    run_num: int = None,
    **kwargs
):
    """
    Create directory structure for analysis run with appropriate naming.

    Parameters:
    - log_dir (str): Base directory for runs
    - run_num (int, optional): Specific run number
    - **kwargs: Additional directory creation options

    Returns:
    dict: Paths of the created folders; 'run_dir' is the run directory itself,
        with further entries for its subfolders (such as 'info', 'results', 'chains', 'plots')
    """
```

#### Logging Usage

```python
import logging
from ultranest.utils import create_logger, make_run_dir

# Set up logging for analysis
logger = create_logger('my_analysis', log_dir='logs/', level=logging.INFO)

# Create organized run directory; the return value maps folder roles to paths
run_paths = make_run_dir('logs/', run_num=1)
logger.info(f"Created run directory: {run_paths['run_dir']}")

# Use logger throughout analysis
logger.info("Starting nested sampling analysis")
logger.warning("Convergence slow, consider increasing live points")
```
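
When no run number is given, a new run folder has to be picked automatically. The sketch below shows one way such auto-incrementing can work, using only the standard library. The helper name `next_run_dir_sketch` and the `runN` folder naming are assumptions made for illustration, not a description of UltraNest's internals.

```python
import os
import tempfile

def next_run_dir_sketch(base_dir, max_run_num=10000):
    """Create and return the first unused base_dir/runN folder."""
    os.makedirs(base_dir, exist_ok=True)
    for run_num in range(1, max_run_num):
        candidate = os.path.join(base_dir, f"run{run_num}")
        if not os.path.exists(candidate):
            os.makedirs(candidate)
            return candidate
    raise RuntimeError("no free run number found")

# Demonstrate inside a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as base:
    first = os.path.basename(next_run_dir_sketch(base))
    second = os.path.basename(next_run_dir_sketch(base))
    print(first, second)
```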

### Statistical Utilities

Functions for statistical analysis and data manipulation of nested sampling results.

```python { .api }
def resample_equal(samples, weights, rstate=None):
    """
    Resample weighted samples to create equal-weight posterior samples.

    Parameters:
    - samples (array): Weighted samples, shape (n_samples, n_params)
    - weights (array): Sample weights, shape (n_samples,)
    - rstate (RandomState, optional): Random number generator state

    Returns:
    array: Equal-weight samples
    """

def quantile(x, q, weights=None):
    """
    Compute weighted quantiles from sample array.

    Parameters:
    - x (array): Sample values
    - q (float or array): Quantile(s) to compute (0-1)
    - weights (array, optional): Sample weights

    Returns:
    float or array: Quantile value(s)
    """
```
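
To make the idea of equal-weight resampling concrete, here is a minimal pure-Python sketch using systematic resampling, one common scheme for this task. Whether UltraNest's `resample_equal` uses exactly this scheme is not asserted here; the name `resample_equal_sketch` and the toy inputs are hypothetical.

```python
import random

def resample_equal_sketch(samples, weights, rng=None):
    """Systematic resampling: return len(samples) equally weighted draws."""
    rng = rng or random.Random()
    n = len(samples)
    total = sum(weights)
    # One random offset, then n evenly spaced positions in [0, 1)
    offset = rng.random()
    positions = [(offset + i) / n for i in range(n)]
    resampled = []
    cumulative = 0.0
    j = 0
    last_used = samples[0]
    for sample, weight in zip(samples, weights):
        if weight > 0:
            last_used = sample
        cumulative += weight / total
        # Copy this sample once for every position its weight interval covers
        while j < n and positions[j] < cumulative:
            resampled.append(sample)
            j += 1
    # Guard against floating-point round-off in the final cumulative sum
    while len(resampled) < n:
        resampled.append(last_used)
    return resampled

samples = ['a', 'b', 'c', 'd']
weights = [0.5, 0.5, 0.0, 0.0]
equal = resample_equal_sketch(samples, weights, rng=random.Random(1))
print(equal)
```

Samples with zero weight never appear in the output, while heavily weighted samples are repeated in proportion to their weight.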

#### Statistical Analysis Usage

```python
import numpy as np
from ultranest.utils import resample_equal, quantile

# Extract weighted samples from results
# (results['samples'] holds the equally weighted posterior samples instead)
samples = results['weighted_samples']['points']
weights = results['weighted_samples']['weights']

# Create equal-weight samples for analysis
equal_samples = resample_equal(samples, weights)

# Compute statistics
for i, param_name in enumerate(['x', 'y', 'z']):
    param_samples = samples[:, i]

    # Weighted quantiles
    median = quantile(param_samples, 0.5, weights=weights)
    q16 = quantile(param_samples, 0.16, weights=weights)
    q84 = quantile(param_samples, 0.84, weights=weights)

    print(f"{param_name}: {median:.3f} +{q84-median:.3f} -{median-q16:.3f}")
```
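
A weighted quantile can be understood as the point where the cumulative normalized weight first reaches the requested level. The sketch below implements that definition without interpolation, so its results may differ slightly from those of UltraNest's `quantile`; the function name and the toy data are hypothetical.

```python
def weighted_quantile_sketch(values, q, weights):
    """Smallest value whose cumulative normalized weight reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight / total
        if cumulative >= q:
            return value
    # Only reached through round-off when q is very close to 1
    return pairs[-1][0]

values = [3.0, 1.0, 2.0, 4.0]
weights = [0.1, 0.1, 0.7, 0.1]
median = weighted_quantile_sketch(values, 0.5, weights)
print(median)
```

Here most of the weight sits on the value 2.0, so the weighted median lands there even though the unweighted median would lie between 2.0 and 3.0.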

### Mathematical Utilities

Mathematical functions for nested sampling analysis and geometric calculations.

```python { .api }
def vol_prefactor(n: int):
    """
    Volume prefactor for n-dimensional unit sphere.

    Parameters:
    - n (int): Dimensionality

    Returns:
    float: Volume prefactor V_n = π^(n/2) / Γ(n/2 + 1)
    """

def is_affine_transform(a, b):
    """
    Check if transformation from a to b is affine.

    Parameters:
    - a (array): Input points
    - b (array): Transformed points

    Returns:
    bool: True if transformation is affine
    """

def normalised_kendall_tau_distance(**kwargs):
    """
    Compute normalized Kendall tau distance for rank correlation analysis.

    Returns:
    float: Normalized Kendall tau distance
    """
```
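
The volume formula quoted for `vol_prefactor` can be checked with the standard library alone. This sketch evaluates V_n = π^(n/2) / Γ(n/2 + 1) directly; `unit_ball_volume` is a hypothetical stand-in and not the library function.

```python
import math

def unit_ball_volume(n):
    """Volume of the n-dimensional unit ball, pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

# Familiar low-dimensional cases: length 2, area pi, volume 4*pi/3
for n in (1, 2, 3):
    print(n, unit_ball_volume(n))
```

The prefactor shrinks rapidly in high dimensions, which is one reason ellipsoidal regions occupy a vanishing fraction of their bounding box as the dimensionality grows.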

### Validation and Testing

Tools for validating implementations and testing numerical accuracy.

```python { .api }
def verify_gradient(**kwargs):
    """
    Verify gradient implementations using finite differences.

    Useful for validating custom likelihood gradients for HMC sampling.

    Returns:
    dict: Validation results and error estimates
    """
```
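
The idea behind a finite-difference gradient check can be sketched in a few lines of plain Python. This is an illustration of the technique only: the helper names, the step size `eps`, and the test point are assumptions, and the code does not call `verify_gradient`.

```python
def finite_difference_gradient(func, x, eps=1e-6):
    """Central-difference estimate of the gradient of func at x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grad.append((func(up) - func(down)) / (2 * eps))
    return grad

def loglike(theta):
    return -0.5 * sum(t**2 for t in theta)

def loglike_gradient(theta):
    # Analytic gradient of the Gaussian log-likelihood above
    return [-t for t in theta]

point = [0.3, -1.2, 2.0]
numeric = finite_difference_gradient(loglike, point)
analytic = loglike_gradient(point)
max_error = max(abs(a - b) for a, b in zip(numeric, analytic))
print(f"max abs difference: {max_error:.2e}")
```

A large discrepancy at randomly chosen points usually indicates a sign error or a missing term in the analytic gradient.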

### Parallel Processing

Utilities for distributed computing and parallel processing workflows.

```python { .api }
def distributed_work_chunk_size(**kwargs):
    """
    Determine optimal work chunk size for distributed MPI processing.

    Returns:
    int: Recommended chunk size for current MPI configuration
    """

def listify(*args):
    """
    Convert arguments to lists for consistent processing.

    Parameters:
    - *args: Arguments to convert to list format

    Returns:
    list: Arguments converted to list format
    """
```
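
Splitting work across MPI processes typically means giving each rank an almost equal share of the tasks. The sketch below shows one such split without requiring MPI; the function name, its parameters, and the choice of which ranks receive the extra tasks are assumptions for illustration rather than UltraNest's actual rule.

```python
def chunk_size_sketch(num_total_tasks, rank, size):
    """Number of tasks for `rank` when tasks are split as evenly as possible."""
    base, remainder = divmod(num_total_tasks, size)
    # The first `remainder` ranks take one extra task
    return base + (1 if rank < remainder else 0)

# Ten tasks shared among four processes
sizes = [chunk_size_sketch(10, rank, 4) for rank in range(4)]
print(sizes)
```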

### Storage Backends

Classes for data persistence and file management during nested sampling runs.

```python { .api }
class NullPointStore:
    """
    No-op storage implementation for testing and benchmarking.

    Provides storage interface without actual file operations,
    useful for performance testing and dry runs.
    """

class TextPointStore:
    """
    Text-based storage using CSV/TSV formats.

    Human-readable storage format suitable for small analyses
    and debugging. Uses tab-separated or comma-separated values.
    """

class HDF5PointStore:
    """
    HDF5-based storage for high-performance applications.

    Recommended storage backend for production use. Provides
    efficient binary storage with compression and fast access.
    """
```

### Compatibility Layer

Functions for integration with other nested sampling and Bayesian analysis tools.

```python { .api }
def pymultinest_solve_compat(**kwargs):
    """
    Drop-in replacement for PyMultiNest's solve function.

    Provides compatibility interface for existing PyMultiNest workflows,
    allowing easy migration to UltraNest with minimal code changes.

    Parameters:
    Similar to PyMultiNest.solve() interface

    Returns:
    Results in PyMultiNest-compatible format
    """
```

#### Compatibility Usage

```python
from ultranest.solvecompat import pymultinest_solve_compat

# Drop-in replacement for existing PyMultiNest code
# pymultinest.solve(...) becomes:
result = pymultinest_solve_compat(
    LogLikelihood=loglike,
    Prior=prior_transform,
    n_dims=3,
    n_live_points=1000,
    outputfiles_basename='chains/analysis_'
)
```

## Advanced Utility Usage

### Custom Analysis Pipeline

```python
import logging
import pickle

import numpy as np
from ultranest import ReactiveNestedSampler
from ultranest.utils import (
    create_logger, make_run_dir, vectorize,
    resample_equal, quantile
)

# Set up analysis infrastructure
logger = create_logger('bayesian_analysis', level=logging.INFO)
run_dir = make_run_dir('analyses/', run_num=None)['run_dir']  # Auto-increment

# Illustrative prior: uniform on [-5, 5] per parameter (works on batches)
def prior_transform(cube):
    return cube * 10 - 5

# Wrap the single-point likelihood so it accepts a batch of points
@vectorize
def optimized_loglike(theta):
    """Likelihood of one point; the wrapper maps it over a batch"""
    # Your likelihood calculation
    return -0.5 * np.sum(theta**2, axis=-1)

# Run analysis
logger.info("Starting nested sampling")
sampler = ReactiveNestedSampler(
    param_names=['x', 'y', 'z'],
    loglike=optimized_loglike,
    transform=prior_transform,
    log_dir=run_dir,
    vectorized=True
)

results = sampler.run()
logger.info(f"Analysis complete. Evidence: {results['logz']:.2f}")

# Post-processing: weighted samples and their weights
samples = results['weighted_samples']['points']
weights = results['weighted_samples']['weights']

# Statistical analysis
equal_samples = resample_equal(samples, weights)
medians = [quantile(equal_samples[:, i], 0.5) for i in range(3)]
logger.info(f"Parameter medians: {medians}")

# Save processed results
with open(f"{run_dir}/processed_results.pkl", 'wb') as f:
    pickle.dump({
        'results': results,
        'equal_samples': equal_samples,
        'medians': medians
    }, f)
```

### Integration with External Tools

```python
# Convert to getdist format (for plotting with getdist)
try:
    # Import the plots submodule explicitly; `import getdist` alone
    # is not guaranteed to make getdist.plots available
    from getdist import MCSamples, plots

    # Create getdist samples object
    gd_samples = MCSamples(
        samples=equal_samples,
        names=['x', 'y', 'z'],
        labels=['X', 'Y', 'Z']
    )

    # Use getdist plotting
    g = plots.get_subplot_plotter()
    g.triangle_plot(gd_samples, filled=True)

except ImportError:
    logger.warning("getdist not available, using UltraNest plotting")
    from ultranest.plot import cornerplot
    cornerplot(results)
```

### Memory and Performance Optimization

```python
import numpy as np
from ultranest.utils import vol_prefactor

# Continues from the pipeline above (logger, results, equal_samples)

# Report the unit-sphere volume prefactor for a high-dimensional problem
n_dims = 50
logger.info(f"N-sphere volume prefactor: {vol_prefactor(n_dims)}")

# Use appropriate data types for memory efficiency
samples = results['samples']
samples_float32 = samples.astype(np.float32)
logger.info(f"Memory saved: {samples.nbytes - samples_float32.nbytes} bytes")

# Chunk processing for large datasets
chunk_size = 10000
n_samples = len(equal_samples)

for i in range(0, n_samples, chunk_size):
    chunk = equal_samples[i:i+chunk_size]
    # Process chunk
    logger.info(f"Processed chunk {i//chunk_size + 1}/{(n_samples-1)//chunk_size + 1}")
```

The utilities module provides essential infrastructure for robust nested sampling workflows, from basic file operations to advanced statistical analysis and integration with external tools.