# I/O Backends

HDMF provides a pluggable I/O system supporting multiple storage backends, including HDF5 and Zarr. The I/O system handles reading and writing hierarchical data structures, with support for compression, chunking, and efficient data access patterns.
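
Because every backend implements the common `HDMFIO` interface described below, application code can be written once against that interface and reused with any backend, for example `HDF5IO` from HDMF itself or the `ZarrIO` backend provided by the separate hdmf-zarr package. The helper below is an illustrative sketch rather than part of the HDMF API; it relies only on the `read()` method and context-manager support that the `HDMFIO` interface defines.

```python
def summarize_file(io):
    """Print the name of the root container read from any HDMFIO backend.

    `io` is assumed to be an already-constructed backend instance,
    e.g. HDF5IO('experiment.h5', mode='r').
    """
    with io:                   # all backends support the context-manager protocol
        container = io.read()  # returns the root container
        print(f"Loaded container: {container.name}")
```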

## Capabilities

### Base I/O Interface

Abstract base class defining the interface for all HDMF I/O backends.

```python { .api }
class HDMFIO:
    """
    Abstract base class for HDMF I/O operations.

    Provides the interface contract for all storage backend implementations.
    """

    def __init__(self, path: str, mode: str = 'r', **kwargs):
        """
        Initialize I/O backend.

        Args:
            path: Path to the file or storage location
            mode: File access mode ('r', 'w', 'a', 'r+')
        """

    def write(self, container, **kwargs):
        """
        Write container to storage backend.

        Args:
            container: Container object to write
        """

    def read(self, **kwargs):
        """
        Read data from storage backend.

        Returns:
            Container object with loaded data
        """

    def close(self):
        """Close the I/O backend and release resources."""

    def __enter__(self):
        """Context manager entry."""

    def __exit__(self, exc_type, exc_val, exc_tb):
        """Context manager exit with cleanup."""
```
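
The sketch below illustrates how a custom backend could be shaped around this interface. It is a toy example against the simplified signatures shown above, not a real HDMF backend; the `DictIO` name and its in-memory storage are hypothetical, and an actual backend must also implement HDMF's builder-level read/write machinery.

```python
class DictIO(HDMFIO):
    """Toy backend that keeps written containers in an in-memory dict."""

    def __init__(self, path: str, mode: str = 'r', **kwargs):
        super().__init__(path, mode, **kwargs)
        self._store = {}

    def write(self, container, **kwargs):
        # Store the container under its name
        self._store[container.name] = container

    def read(self, **kwargs):
        # Return the most recently written container, if any
        return list(self._store.values())[-1] if self._store else None

    def close(self):
        # Release the in-memory storage
        self._store.clear()
```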
52
53
### HDF5 I/O Backend
54
55
Primary I/O backend for reading and writing HDF5 files with full HDMF feature support.
56
57
```python { .api }
58
class HDF5IO(HDMFIO):
59
"""
60
HDF5 I/O backend for reading and writing HDMF data to HDF5 files.
61
62
Supports all HDMF features including hierarchical containers, metadata,
63
compression, chunking, and cross-platform compatibility.
64
"""
65
66
def __init__(self, path: str, mode: str = 'r', manager=None, **kwargs):
67
"""
68
Initialize HDF5 I/O.
69
70
Args:
71
path: Path to HDF5 file
72
mode: File access mode ('r', 'w', 'a', 'r+')
73
manager: Build manager for container conversion
74
**kwargs: Additional HDF5 file options
75
"""
76
77
def write(self, container, **kwargs):
78
"""
79
Write container to HDF5 file.
80
81
Args:
82
container: Container object to write
83
**kwargs: Write options including:
84
- cache_spec: Whether to cache specification (default: True)
85
- exhaust_dci: Whether to exhaust data chunk iterators
86
- link_data: Whether to link external data
87
"""
88
89
def read(self, **kwargs):
90
"""
91
Read container from HDF5 file.
92
93
Args:
94
**kwargs: Read options
95
96
Returns:
97
Container object loaded from file
98
"""
99
100
def export(self, src_io, container, **kwargs):
101
"""
102
Export container from another I/O source to this HDF5 file.
103
104
Args:
105
src_io: Source I/O object
106
container: Container to export
107
"""
108
109
def close(self):
110
"""Close HDF5 file and release resources."""
111
112
@property
113
def file(self):
114
"""Access to underlying h5py File object."""
115
```
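
The `export` method supports copying a container read from one file into a new file, for example to produce a trimmed or re-packaged copy without modifying the original. A minimal sketch, assuming placeholder file paths and that a build manager appropriate for the stored types is configured:

```python
from hdmf.backends.hdf5 import HDF5IO

# Copy the contents of source.h5 into exported.h5 (paths are placeholders)
with HDF5IO('source.h5', mode='r') as src_io:
    container = src_io.read()
    with HDF5IO('exported.h5', mode='w') as dst_io:
        dst_io.export(src_io=src_io, container=container)
```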

### HDF5 Data I/O Configuration

Configuration wrapper for customizing how data is written to HDF5 files.

```python { .api }
class H5DataIO:
    """
    HDF5 data I/O configuration wrapper for controlling storage options.

    Provides fine-grained control over compression, chunking, filtering,
    and other HDF5 dataset creation properties.
    """

    def __init__(self, data, **kwargs):
        """
        Initialize H5DataIO wrapper.

        Args:
            data: Data to be written
            **kwargs: HDF5 dataset creation options:
                - compression: Compression filter ('gzip', 'lzf', 'szip')
                - compression_opts: Compression level (0-9 for gzip)
                - shuffle: Enable shuffle filter for better compression
                - fletcher32: Enable Fletcher32 checksum filter
                - chunks: Chunk shape for datasets
                - maxshape: Maximum shape for resizable datasets
                - fillvalue: Fill value for uninitialized data
                - track_times: Track dataset creation/modification times
        """

    @property
    def data(self):
        """Access to wrapped data."""

    @property
    def io_settings(self) -> dict:
        """Dictionary of I/O settings for this data."""
```
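
The wrapped data and the resolved storage options stay inspectable before anything is written, which is useful for verifying a configuration. A short sketch using the properties declared above:

```python
from hdmf.backends.hdf5 import H5DataIO
import numpy as np

wrapped = H5DataIO(
    data=np.zeros((100, 10)),
    compression='gzip',
    compression_opts=4,
    chunks=(10, 10),
)

print(wrapped.data.shape)   # the original array remains accessible
print(wrapped.io_settings)  # e.g. includes the compression and chunking settings
```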

### HDF5 Specification I/O

Specialized classes for reading and writing HDMF specifications to HDF5 files.

```python { .api }
class H5SpecWriter:
    """
    Writer for HDMF specifications in HDF5 format.

    Handles storage of namespace and specification information within HDF5 files.
    """

    def __init__(self, io: HDF5IO):
        """
        Initialize specification writer.

        Args:
            io: HDF5IO object for file access
        """

    def write_spec(self, spec_catalog, spec_namespace):
        """
        Write specification catalog and namespace to HDF5 file.

        Args:
            spec_catalog: Specification catalog to write
            spec_namespace: Namespace information
        """


class H5SpecReader:
    """
    Reader for HDMF specifications from HDF5 format.

    Loads namespace and specification information from HDF5 files.
    """

    def __init__(self, io: HDF5IO):
        """
        Initialize specification reader.

        Args:
            io: HDF5IO object for file access
        """

    def read_spec(self) -> tuple:
        """
        Read specification from HDF5 file.

        Returns:
            Tuple of (spec_catalog, spec_namespace)
        """
```
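
These classes are normally used indirectly: `HDF5IO.write(..., cache_spec=True)` stores the specifications inside the file, and cached namespaces can be loaded back before reading. A sketch of reading cached namespaces, assuming a placeholder file path and that the `HDF5IO.load_namespaces` classmethod is available in the installed HDMF version:

```python
from hdmf.backends.hdf5 import HDF5IO
from hdmf.spec import NamespaceCatalog

# Load namespaces that were cached in experiment.h5 (path is a placeholder)
catalog = NamespaceCatalog()
loaded = HDF5IO.load_namespaces(catalog, 'experiment.h5')
print(loaded)  # e.g. the names of the namespaces read from the file
```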

### HDF5 Utilities and Tools

Utility functions and tools for working with HDF5 files and datasets.

```python { .api }
class H5Dataset:
    """
    Wrapper for HDF5 datasets providing enhanced functionality.

    Adds HDMF-specific features to h5py dataset objects including
    lazy loading, data transformation, and metadata handling.
    """

    def __init__(self, dataset, io: HDF5IO, **kwargs):
        """
        Initialize H5Dataset wrapper.

        Args:
            dataset: h5py dataset object
            io: Parent HDF5IO object
        """

    def __getitem__(self, key):
        """Get data slice from dataset."""

    def __setitem__(self, key, value):
        """Set data slice in dataset."""

    @property
    def shape(self) -> tuple:
        """Shape of the dataset."""

    @property
    def dtype(self):
        """Data type of the dataset."""

    @property
    def size(self) -> int:
        """Total number of elements in dataset."""


# HDF5 utility functions

def get_h5_version() -> str:
    """
    Get HDF5 library version.

    Returns:
        HDF5 version string
    """

def check_h5_version(min_version: str = None) -> bool:
    """
    Check if HDF5 version meets minimum requirements.

    Args:
        min_version: Minimum required version

    Returns:
        True if version is sufficient
    """
```
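
For reference, the version information that utilities like these report is also available directly from h5py; the following is a sketch that uses h5py itself rather than an HDMF API:

```python
import h5py

# HDF5 library and h5py package versions as reported by h5py
print(h5py.version.hdf5_version)  # e.g. '1.14.3'
print(h5py.version.version)       # h5py package version
```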

## Usage Examples

### Basic HDF5 I/O Operations

```python
from hdmf.backends.hdf5 import HDF5IO
from hdmf import Container, Data
import numpy as np

# Create sample data
data_array = np.random.randn(1000, 100)
data_container = Data(name='neural_data', data=data_array)

# Attach the data as a child of a parent container
container = Container(name='experiment')
data_container.parent = container

# Write to an HDF5 file
# (writing custom container types assumes a suitable build manager is configured)
with HDF5IO('experiment.h5', mode='w') as io:
    io.write(container)

# Read from the HDF5 file
with HDF5IO('experiment.h5', mode='r') as io:
    read_container = io.read()
    print(f"Container: {read_container.name}")
    print(f"Data shape: {read_container.neural_data.data.shape}")
```

### Advanced HDF5 Data Configuration

```python
from hdmf.backends.hdf5 import HDF5IO, H5DataIO
from hdmf import Data
import numpy as np

# Create a large dataset to be written with compression
large_data = np.random.randn(10000, 1000)

# Configure compression and chunking
compressed_data = H5DataIO(
    data=large_data,
    compression='gzip',
    compression_opts=9,     # Maximum compression
    shuffle=True,           # Better compression for numeric data
    fletcher32=True,        # Checksums for data integrity
    chunks=(1000, 100),     # Chunk shape for efficient access
    maxshape=(None, 1000)   # Allow resizing along the first dimension
)

data_container = Data(name='compressed_data', data=compressed_data)

# Write with advanced options
with HDF5IO('compressed_experiment.h5', mode='w') as io:
    io.write(data_container, cache_spec=True, exhaust_dci=False)
```

### Working with External Data Links

```python
import h5py
from hdmf.backends.hdf5 import HDF5IO, H5DataIO
from hdmf import Data

# Open the external file and reference one of its datasets
# (link_data links to an existing h5py dataset instead of copying it)
external_file = h5py.File('path/to/external/data.h5', mode='r')
external_data = H5DataIO(
    data=external_file['neural_data'],  # placeholder dataset name
    link_data=True                      # Link instead of copying
)

data_container = Data(name='external_data', data=external_data)

# Write with external links
with HDF5IO('main_file.h5', mode='w') as io:
    io.write(data_container, link_data=True)
```

### Reading Subsets of Large Datasets

```python
from hdmf.backends.hdf5 import HDF5IO

# Open the file in read mode
with HDF5IO('large_experiment.h5', mode='r') as io:
    container = io.read()

    # Access the dataset lazily, without loading all of the data
    dataset = container.neural_data.data

    # Read specific slices
    first_100_samples = dataset[:100, :]
    specific_channels = dataset[:, [0, 5, 10]]
    time_window = dataset[1000:2000, :]

    print(f"Dataset shape: {dataset.shape}")
    print(f"Slice shape: {first_100_samples.shape}")
```
363
364
### Appending Data to Existing Files
365
366
```python
367
from hdmf.backends.hdf5 import HDF5IO, H5DataIO
368
import numpy as np
369
370
# Initial data with resizable configuration
371
initial_data = H5DataIO(
372
data=np.random.randn(100, 50),
373
maxshape=(None, 50), # Allow growth along first dimension
374
chunks=(10, 50)
375
)
376
377
data_container = Data(name='growing_data', data=initial_data)
378
379
# Write initial data
380
with HDF5IO('growing_experiment.h5', mode='w') as io:
381
io.write(container)
382
383
# Append new data
384
with HDF5IO('growing_experiment.h5', mode='a') as io:
385
container = io.read()
386
new_data = np.random.randn(50, 50)
387
388
# Append to existing dataset
389
container.growing_data.append(new_data)
390
391
# Write updated container
392
io.write(container)
393
```

### Cross-Platform File Operations

```python
import os

from hdmf.backends.hdf5 import HDF5IO
from hdmf import Container, Data

def process_hdmf_file(input_path: str, output_path: str):
    """Read an HDMF file, scale each child dataset, and write a new file."""
    processed = Container(name='processed_experiment')

    # Read from any platform
    with HDF5IO(input_path, mode='r') as src_io:
        container = src_io.read()

        # Build processed copies of each data child
        for child in container.children:
            if hasattr(child, 'data'):
                scaled = child.data[:] * 1.5  # load into memory and scale
                processed_child = Data(name=child.name, data=scaled)
                processed_child.parent = processed

        # Write to the new location while the source file is still open
        with HDF5IO(output_path, mode='w') as dst_io:
            dst_io.write(processed, cache_spec=True)

    print(f"Processed file written to: {output_path}")

# Cross-platform usage
if os.name == 'nt':  # Windows
    input_file = r'C:\data\experiment.h5'
    output_file = r'C:\processed\experiment_processed.h5'
else:  # Unix-like systems
    input_file = '/data/experiment.h5'
    output_file = '/processed/experiment_processed.h5'

process_hdmf_file(input_file, output_file)
```