# Utilities

HDMF provides utilities for parameter validation, argument handling, type checking, and data manipulation. They underpin HDMF's type system and are used when developing extensions and working with HDMF data structures.

## Capabilities

### Documentation and Validation Decorator

The `docval` decorator and its helper functions validate and document method parameters throughout HDMF.

```python { .api }
def docval(*args, **kwargs):
    """
    Decorator for documenting and enforcing method parameter types and constraints.

    Args:
        *args: Parameter specifications as dictionaries
        **kwargs: Additional validation options

    Parameter specification format:
        {
            'name': str,          # Parameter name
            'type': type/tuple,   # Expected type(s)
            'doc': str,           # Documentation string
            'default': any,       # Default value (optional)
            'shape': tuple,       # Expected array shape (optional)
            'allow_none': bool    # Allow None values (optional)
        }

    Returns:
        Decorated function with validation
    """

def getargs(*argnames):
    """
    Retrieve specified arguments from a dictionary.

    Call as getargs(*argnames, argdict): the last positional argument is the
    dictionary and the preceding arguments are the names to retrieve.

    Args:
        *argnames: One or more argument names, followed by the dictionary

    Returns:
        Single value if one name is given, otherwise a list of values
        in the order requested

    Raises:
        ValueError: If a requested name is missing from the dictionary
    """

def popargs(*argnames):
    """
    Retrieve and remove specified arguments from a dictionary.

    Call as popargs(*argnames, argdict), with the same calling convention
    as getargs. The retrieved keys are removed from the dictionary.

    Args:
        *argnames: One or more argument names, followed by the dictionary

    Returns:
        Single value if one name is given, otherwise a list of values
        in the order requested
    """

def popargs_to_dict(arg_names: list, kwargs: dict) -> dict:
    """
    Extract multiple arguments to a new dictionary.

    Args:
        arg_names: List of argument names to extract
        kwargs: Source dictionary

    Returns:
        Dictionary containing extracted arguments
    """

def get_docval(func) -> tuple:
    """
    Get docval arguments for a function.

    Args:
        func: Function to inspect

    Returns:
        Tuple of docval argument specifications
    """
```
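
The validation flow described above can be illustrated with a minimal sketch of a docval-style decorator. This is not HDMF's implementation: `simple_docval` and `Scaler` are hypothetical names invented for this example, and the real `docval` also handles cases such as shapes, positional arguments, and type macros that are omitted here.

```python
from functools import wraps


def simple_docval(*specs):
    """Hypothetical, simplified stand-in for a docval-style decorator."""
    def decorator(func):
        @wraps(func)
        def wrapper(self, **kwargs):
            validated = {}
            for spec in specs:
                name = spec['name']
                if name in kwargs:
                    value = kwargs[name]
                elif 'default' in spec:
                    value = spec['default']
                else:
                    raise TypeError(f"missing argument '{name}'")
                # None is accepted only when the spec opts in via allow_none
                if value is None:
                    if not spec.get('allow_none', False):
                        raise TypeError(f"'{name}' must not be None")
                elif not isinstance(value, spec['type']):
                    raise TypeError(f"incorrect type for '{name}'")
                validated[name] = value
            # Reject keyword arguments that no specification describes
            unexpected = set(kwargs) - set(validated)
            if unexpected:
                raise TypeError(f"unrecognized arguments: {sorted(unexpected)}")
            return func(self, **validated)
        return wrapper
    return decorator


class Scaler:
    @simple_docval(
        {'name': 'value', 'type': (int, float), 'doc': 'Number to scale'},
        {'name': 'factor', 'type': int, 'doc': 'Multiplier', 'default': 2},
    )
    def scale(self, **kwargs):
        return kwargs['value'] * kwargs['factor']
```

In this sketch the wrapped method always receives a complete, validated dictionary, so the method body can read its arguments without further checks.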

### Type Checking and Validation

Utilities for checking value types, along with supporting metaclass, dictionary, and dataset wrapper classes.

```python { .api }
def check_type(value, type_, name: str = None) -> bool:
    """
    Check if value matches expected type with detailed error reporting.

    Args:
        value: Value to check
        type_: Expected type or tuple of types
        name: Name for error messages

    Returns:
        True if type matches

    Raises:
        TypeError: If type doesn't match
    """

class ExtenderMeta(type):
    """
    Metaclass for extending base class initialization with additional functionality.

    Enables automatic method extension and initialization customization.
    """

    def __new__(cls, name, bases, namespace, **kwargs):
        """Create new class with extended functionality."""

class LabelledDict(dict):
    """
    Dictionary wrapper with attribute-based querying and labeling capabilities.

    Provides enhanced dictionary functionality with label-based access patterns.
    """

    def __init__(self, label: str, key_class=None, **kwargs):
        """
        Initialize labelled dictionary.

        Args:
            label: Label for the dictionary
            key_class: Expected class for dictionary keys
        """

    def __getattribute__(self, item):
        """Enable attribute-based access to dictionary values."""

class StrDataset:
    """
    String dataset wrapper for HDF5 compatibility.

    Handles string encoding/decoding for HDF5 storage backends.
    """

    def __init__(self, data, **kwargs):
        """
        Initialize string dataset.

        Args:
            data: String data to wrap
        """
```
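
As an illustration of the "type or tuple of types" check with a named error message described for `check_type`, here is a minimal sketch. `check_type_sketch` is a hypothetical helper written for this example and makes no claim about how HDMF implements its own check.

```python
def check_type_sketch(value, type_, name=None):
    """Return True if value is an instance of type_, else raise TypeError.

    type_ may be a single type or a tuple of types, as with isinstance.
    """
    if isinstance(value, type_):
        return True
    # Normalize to a tuple so the error message can list every accepted type
    expected = type_ if isinstance(type_, tuple) else (type_,)
    expected_names = ', '.join(t.__name__ for t in expected)
    label = name or 'value'
    raise TypeError(
        f"{label}: expected {expected_names}, got {type(value).__name__}"
    )
```

Including the parameter name in the message lets a caller see which argument failed without inspecting a traceback.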

### Data Utilities

Utilities for working with array data, shapes, and data type conversion.

```python { .api }
import numpy as np  # needed for the np.ndarray annotation below


def get_data_shape(data) -> tuple:
    """
    Determine shape of array-like data including ragged arrays.

    Args:
        data: Array-like data object

    Returns:
        Tuple representing data shape
    """

def pystr(s) -> str:
    """
    Convert bytes to Python string with proper encoding handling.

    Args:
        s: String or bytes to convert

    Returns:
        Python string
    """

def to_uint_array(data) -> np.ndarray:
    """
    Convert array to unsigned integers with validation.

    Args:
        data: Array-like data to convert

    Returns:
        NumPy array with unsigned integer dtype

    Raises:
        ValueError: If conversion would result in data loss
    """

def is_ragged(data) -> bool:
    """
    Test if array-like data is ragged (has inconsistent dimensions).

    Args:
        data: Array-like data to test

    Returns:
        True if data is ragged, False otherwise
    """

def get_basic_array_info(data) -> dict:
    """
    Get basic information about array-like data.

    Args:
        data: Array-like data to analyze

    Returns:
        Dictionary with keys: 'shape', 'dtype', 'size', 'is_ragged'
    """

def generate_array_html_repr(data, max_elements: int = 1000) -> str:
    """
    Generate HTML representation of arrays for Jupyter notebooks.

    Args:
        data: Array data to represent
        max_elements: Maximum elements to display

    Returns:
        HTML string representation
    """
```
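
To make the notion of a ragged array concrete, the sketch below computes the shape of nested Python lists and reports raggedness when sibling elements disagree in shape. `shape_or_none` and `is_ragged_sketch` are hypothetical helpers for illustration only; they handle plain lists and tuples and are not HDMF's implementation.

```python
def shape_or_none(data):
    """Return the shape of nested lists/tuples, or None if they are ragged."""
    if not isinstance(data, (list, tuple)):
        return ()  # a scalar (including str) contributes no dimensions
    child_shapes = [shape_or_none(el) for el in data]
    # A ragged child, or siblings with different shapes, makes the whole ragged
    if any(shape is None for shape in child_shapes):
        return None
    if len(set(child_shapes)) > 1:
        return None
    inner = child_shapes[0] if child_shapes else ()
    return (len(data),) + inner


def is_ragged_sketch(data):
    """True if nested lists/tuples have inconsistent dimensions."""
    return shape_or_none(data) is None
```

Comparing complete child shapes, rather than only child lengths, also catches inconsistencies that first appear several levels down.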

### Version Utilities

Utilities for version comparison and compatibility checking.

```python { .api }
def is_newer_version(version1: str, version2: str) -> bool:
    """
    Compare version strings to determine if first is newer than second.

    Args:
        version1: First version string (e.g., '1.2.3')
        version2: Second version string (e.g., '1.2.0')

    Returns:
        True if version1 is newer than version2

    Examples:
        >>> is_newer_version('1.2.3', '1.2.0')
        True
        >>> is_newer_version('2.0.0', '1.9.9')
        True
        >>> is_newer_version('1.0.0', '1.0.0')
        False
    """
```
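
Version strings need to be compared numerically, component by component, because plain string comparison would rank `'1.10.0'` below `'1.9.0'`. The following sketch shows one way to do that with the standard library; `parse_version` and `is_newer_version_sketch` are hypothetical names, and the handling of suffixes and missing components is an assumption, not a description of HDMF's behavior.

```python
import re


def parse_version(version):
    """Turn '1.2.3' into (1, 2, 3) using the leading digits of each part."""
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        # Assumption: a part with no leading digits counts as 0
        parts.append(int(match.group()) if match else 0)
    return tuple(parts)


def is_newer_version_sketch(version1, version2):
    """Return True if version1 is strictly newer than version2."""
    v1, v2 = parse_version(version1), parse_version(version2)
    # Pad the shorter tuple with zeros so '1.2' compares equal to '1.2.0'
    width = max(len(v1), len(v2))
    v1 += (0,) * (width - len(v1))
    v2 += (0,) * (width - len(v2))
    return v1 > v2
```

Tuples of integers compare element by element, which gives the expected ordering without any custom comparison logic.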

### Constants and Enums

Constants and enumerations used by HDMF for validation and configuration.

```python { .api }
from enum import Enum


class AllowPositional(Enum):
    """
    Enumeration for controlling positional argument handling in docval.

    Values:
    - ALLOWED: Positional arguments are accepted
    - WARNING: Positional arguments are accepted but trigger a warning
    - ERROR: Positional arguments raise an error
    """
    ALLOWED = 1
    WARNING = 2
    ERROR = 3

# Type macros for docval validation
array_data = 'array_data'    # Macro for array-like data types
scalar_data = 'scalar_data'  # Macro for scalar data types
data = 'data'                # Generic data macro
```

## Usage Examples

### Using docval for Parameter Validation

```python
import numpy as np
from hdmf.utils import docval, getargs


class DataProcessor:

    @docval(
        {'name': 'data', 'type': ('array_data', list), 'doc': 'Input data to process'},
        {'name': 'method', 'type': str, 'doc': 'Processing method', 'default': 'mean'},
        {'name': 'axis', 'type': int, 'doc': 'Axis for processing', 'default': 0, 'allow_none': True}
    )
    def process(self, **kwargs):
        # getargs takes the argument names followed by the kwargs dictionary
        data, method, axis = getargs('data', 'method', 'axis', kwargs)

        if method == 'mean':
            return np.mean(data, axis=axis)
        elif method == 'sum':
            return np.sum(data, axis=axis)
        else:
            raise ValueError(f"Unknown method: {method}")

# Usage
processor = DataProcessor()
result = processor.process(data=[[1, 2], [3, 4]], method='mean', axis=0)
```

### Type Checking and Validation

```python
from hdmf.utils import check_type, is_ragged, get_data_shape
import numpy as np

# Type checking
data = np.array([1, 2, 3])
check_type(data, np.ndarray, 'data')  # Passes

# Check for ragged arrays
regular_array = [[1, 2], [3, 4]]
ragged_array = [[1, 2, 3], [4, 5]]

print(is_ragged(regular_array))  # False
print(is_ragged(ragged_array))   # True

# Get shape information
shape = get_data_shape(ragged_array)
print(shape)  # (2, None) - indicates ragged structure
```

### Working with LabelledDict

```python
from hdmf.utils import LabelledDict

# Create labelled dictionary
config = LabelledDict(label='experiment_config', key_class=str)
config['sampling_rate'] = 30000
config['duration'] = 3600
config['channels'] = ['ch1', 'ch2', 'ch3']

# Access via attributes (if keys are valid identifiers)
print(config.sampling_rate)  # 30000
print(config.duration)       # 3600
```
342
343
### Version Comparison
344
345
```python
from hdmf.utils import is_newer_version

# Check software compatibility
current_version = '4.1.0'
required_version = '4.0.0'

# is_newer_version is strict, so an exact match is checked separately
if current_version == required_version or is_newer_version(current_version, required_version):
    print("Version is compatible")
else:
    print("Version upgrade required")
```

### Data Shape Analysis

```python
from hdmf.utils import get_basic_array_info
import numpy as np

# Analyze different data types
regular_data = np.random.randn(100, 50)
ragged_data = [[1, 2], [3, 4, 5], [6]]

info1 = get_basic_array_info(regular_data)
info2 = get_basic_array_info(ragged_data)

print("Regular array info:", info1)
# {'shape': (100, 50), 'dtype': 'float64', 'size': 5000, 'is_ragged': False}

print("Ragged array info:", info2)
# {'shape': (3, None), 'dtype': 'object', 'size': 3, 'is_ragged': True}
```