Multi-dimensional data arrays with labeled dimensions for scientific computing
npx @tessl/cli install tessl/pypi-scipp@24.11.00
# Scipp

Scipp is a Python library for multi-dimensional arrays with labeled dimensions, built-in physical units, and uncertainty propagation. It serves as a foundation for neutron-scattering data analysis and for other scientific applications that need to manipulate multi-dimensional datasets with error propagation and metadata handling.

## Package Information

- **Package Name**: scipp
- **Language**: Python
- **Installation**: `pip install scipp`
- **Optional Dependencies**: `pip install scipp[all]` for complete functionality
- **Documentation**: https://scipp.github.io/
- **Source**: https://github.com/scipp/scipp

## Core Imports

```python
import scipp as sc
```

Alternative imports for specific modules:

```python
import scipp.units as units
import scipp.spatial as spatial
import scipp.data as data
import scipp.io as io
```

## Basic Usage

```python
import scipp as sc

# Create a Variable with physical units
data = sc.array(dims=['x'], values=[1, 2, 3, 4, 5], unit='m')
print(data)

# Create a DataArray with coordinates
x_coord = sc.linspace(dim='x', start=0.0, stop=10.0, num=5, unit='mm')
da = sc.DataArray(data=data, coords={'x': x_coord})
print(da)

# Arithmetic keeps the unit and the coordinates
result = da * 2  # Units are preserved
print(result)

# Create data with uncertainties (floating-point values paired with variances)
uncertain_data = sc.array(dims=['x'], values=[1.0, 2.0, 3.0], variances=[0.1, 0.2, 0.3], unit='counts')
print(uncertain_data)

# Reduction operations
total = sc.sum(uncertain_data)  # Uncertainty propagation is automatic
print(total)

# Binning operations for histogram creation
events = sc.data.table_xyz(1000)  # Generate sample event data
binned = events.bin(x=10)  # Bin into 10 x-bins
histogram = sc.hist(events, x=10)  # Create histogram directly
```

## Architecture

Scipp's architecture is built around labeled, multi-dimensional arrays with integrated metadata:

- **Variable**: Core data structure holding multi-dimensional arrays with units, variances, and dimensions
- **DataArray**: Variable with associated coordinate variables and masks
- **Dataset**: Dictionary-like container for multiple related DataArrays sharing coordinate systems
- **DataGroup**: Hierarchical container for complex, nested data structures
- **Units**: Physical unit system with automatic propagation through operations
- **Binning**: Binning and histogramming with support for irregular bins and event data

This design enables scientific workflows with automatic unit checking, uncertainty propagation, and metadata preservation.

## Capabilities

### Core Data Structures

Foundation classes for multi-dimensional labeled arrays with physical units, uncertainty propagation, and metadata handling.

```python { .api }
class Variable:
    """Multi-dimensional array with labeled dimensions, units, and optional variances"""

class DataArray:
    """Variable with coordinates and masks"""

class Dataset:
    """Dictionary-like container for multiple DataArrays"""

class DataGroup:
    """Hierarchical container for nested data structures"""

class Unit:
    """Physical unit with arithmetic operations"""

class DType:
    """Data type enumeration for scipp arrays"""
```

[Core Data Structures](./core-data-structures.md)

### Array Creation and Manipulation

Functions for creating arrays, scalars, vectors, and structured data, with shape manipulation and broadcasting.

```python { .api }
def array(dims, values, *, variances=None, unit=None, dtype=None): ...
def scalar(value, *, variance=None, unit=None, dtype=None): ...
def zeros(dims, shape, *, unit=None, dtype=None): ...
def ones(dims, shape, *, unit=None, dtype=None): ...
def linspace(dim, start, stop, num, *, unit=None, dtype=None): ...
def arange(dim, start, stop=None, step=None, *, unit=None, dtype=None): ...
```

[Array Creation](./array-creation.md)

### Mathematical Operations

Arithmetic, trigonometric, logarithmic, and specialized functions with automatic unit propagation and uncertainty handling.

```python { .api }
def add(x, y): ...
def multiply(x, y): ...
def sin(x): ...
def cos(x): ...
def exp(x): ...
def log(x): ...
def sqrt(x): ...
def abs(x): ...
```

[Mathematical Operations](./mathematical-operations.md)

### Reduction and Statistical Operations

Statistical reductions over one or all dimensions, with uncertainty propagation. NaN-skipping variants such as `nansum` and `nanmean` are also available.

```python { .api }
def sum(x, dim=None): ...
def mean(x, dim=None): ...
def std(x, dim=None): ...
def var(x, dim=None): ...
def min(x, dim=None): ...
def max(x, dim=None): ...
def median(x, dim=None): ...
```

[Reduction Operations](./reduction-operations.md)

### Shape Operations and Broadcasting

Functions for manipulating array shapes and dimensions and for broadcasting, with metadata preserved.

```python { .api }
def broadcast(x, *, dims=None, shape=None): ...
def transpose(x, dims=None): ...
def squeeze(x, dim=None): ...
def flatten(x, dims, to): ...
def fold(x, dim, dims, shape): ...
def concat(x, dim): ...
```

[Shape Operations](./shape-operations.md)

### Binning and Histogramming

Binning operations for event data, histogram creation, and data grouping, with support for irregular bins and multi-dimensional binning.

```python { .api }
def bin(x, /, **edges): ...
def hist(x, /, **edges): ...
def rebin(x, **edges): ...
def group(x, /, **groups): ...
def groupby(x, group, *, dim=None): ...
```

[Binning and Histogramming](./binning-histogramming.md)

### Physical Units System

Physical unit system with predefined units, unit conversion, and alias management.

```python { .api }
class Unit:
    def __init__(self, unit_string): ...

def to_unit(x, unit): ...

# Predefined units
dimensionless: Unit
m: Unit  # meter
kg: Unit  # kilogram
s: Unit  # second
K: Unit  # kelvin
rad: Unit  # radian
deg: Unit  # degree
```

[Units System](./units-system.md)

### Spatial Transformations

Vector operations, coordinate transformations, rotations, translations, and spatial geometry functions.

```python { .api }
def vector(value, *, unit=None): ...
def vectors(dims, values, *, unit=None): ...
def rotation(*, value): ...
def translation(*, value, unit=None): ...
def linear_transform(*, value, unit=None): ...
def as_vectors(x, y, z): ...
```

[Spatial Operations](./spatial-operations.md)

### Data Input/Output

File I/O operations supporting HDF5 and CSV formats with metadata preservation.

```python { .api }
def load_hdf5(filename): ...
def save_hdf5(data, filename): ...
def load_csv(filename, **kwargs): ...
```

[Input/Output](./input-output.md)

### Visualization and Display

Data visualization functions for interactive plotting, HTML representation, and table display.

```python { .api }
def plot(data, **kwargs): ...
def show(data): ...
def table(data): ...
def make_html(data): ...
def make_svg(data): ...
```

[Visualization](./visualization.md)

### SciPy Integration

Wrappers for SciPy functionality including optimization, interpolation, signal processing, and image processing.

```python { .api }
# scipy.optimize
def curve_fit(f, data, **kwargs): ...

# scipy.interpolate
def interp1d(data, dim, **kwargs): ...

# scipy.ndimage
def gaussian_filter(x, *, sigma, **kwargs): ...
```

[SciPy Integration](./scipy-integration.md)

### Coordinate Systems

Coordinate transformation functions and graph-based coordinate system management.

```python { .api }
def transform_coords(x, targets, **kwargs): ...
def show_graph(coords): ...
```

[Coordinate Systems](./coordinate-systems.md)

### Testing and Assertions

Testing utilities for scientific data validation and comparison.

```python { .api }
def assert_identical(a, b): ...
def assert_allclose(a, b, **kwargs): ...
```

[Testing Utilities](./testing-utilities.md)

## Types

```python { .api }
from typing import Union, Optional, Dict, List, Any, Sequence, Mapping
from numpy.typing import ArrayLike

# Core types
VariableLike = Union[Variable, DataArray]
DimMapping = Dict[str, Union[int, slice, Variable]]
Dims = Union[str, Sequence[str]]
Shape = Union[int, Sequence[int]]
Values = Union[ArrayLike, Sequence]

# Exception types
class BinEdgeError(Exception): ...
class BinnedDataError(Exception): ...
class CoordError(Exception): ...
class DataArrayError(Exception): ...
class DatasetError(Exception): ...
class DimensionError(Exception): ...
class DTypeError(Exception): ...
class UnitError(Exception): ...
class VariableError(Exception): ...
class VariancesError(Exception): ...
```