# I/O Backends

HDMF provides a pluggable I/O system that supports multiple storage backends, including HDF5 (built in) and Zarr (via the separate hdmf-zarr package). The I/O layer handles reading and writing hierarchical data structures, with support for compression, chunking, and efficient data access patterns.
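
Because every backend implements the same read/write/close contract (see the `HDMFIO` interface below), writing code that is agnostic about the storage format mostly means choosing which I/O class to instantiate. A sketch of that pattern, assuming the optional hdmf-zarr package for the Zarr side:

```python
# Illustrative sketch of backend selection; core HDMF ships HDF5IO,
# while ZarrIO is assumed to come from the separate hdmf-zarr package.
from hdmf.backends.hdf5 import HDF5IO

def save_container(container, path, backend='hdf5'):
    """Write a container with whichever backend the caller selects."""
    if backend == 'hdf5':
        io_cls = HDF5IO
    elif backend == 'zarr':
        from hdmf_zarr import ZarrIO  # optional dependency (assumption)
        io_cls = ZarrIO
    else:
        raise ValueError(f"unknown backend: {backend}")

    # Every backend shares the HDMFIO context-manager and write() contract.
    with io_cls(path, mode='w') as io:
        io.write(container)
```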

## Capabilities

### Base I/O Interface

Abstract base class defining the interface for all HDMF I/O backends.

```python { .api }
class HDMFIO:
    """
    Abstract base class for HDMF I/O operations.

    Provides the interface contract for all storage backend implementations.
    """

    def __init__(self, path: str, mode: str = 'r', **kwargs):
        """
        Initialize I/O backend.

        Args:
            path: Path to the file or storage location
            mode: File access mode ('r', 'w', 'a', 'r+')
        """

    def write(self, container, **kwargs):
        """
        Write container to storage backend.

        Args:
            container: Container object to write
        """

    def read(self, **kwargs):
        """
        Read data from storage backend.

        Returns:
            Container object with loaded data
        """

    def close(self):
        """Close the I/O backend and release resources."""

    def __enter__(self):
        """Context manager entry."""

    def __exit__(self, exc_type, exc_val, exc_tb):
        """Context manager exit with cleanup."""
```
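
The `__enter__`/`__exit__` pair is what backs the `with` blocks used throughout the examples below; a minimal sketch showing that the context-manager form and an explicit `close()` are equivalent (using the HDF5 backend and an illustrative file name):

```python
from hdmf.backends.hdf5 import HDF5IO

# Context-manager form: close() is called automatically on exit.
with HDF5IO('experiment.h5', mode='r') as io:
    container = io.read()

# Equivalent explicit form using the close() method from the interface above.
io = HDF5IO('experiment.h5', mode='r')
try:
    container = io.read()
finally:
    io.close()
```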

### HDF5 I/O Backend

Primary I/O backend for reading and writing HDF5 files with full HDMF feature support.

```python { .api }
class HDF5IO(HDMFIO):
    """
    HDF5 I/O backend for reading and writing HDMF data to HDF5 files.

    Supports all HDMF features including hierarchical containers, metadata,
    compression, chunking, and cross-platform compatibility.
    """

    def __init__(self, path: str, mode: str = 'r', manager=None, **kwargs):
        """
        Initialize HDF5 I/O.

        Args:
            path: Path to HDF5 file
            mode: File access mode ('r', 'w', 'a', 'r+')
            manager: Build manager for container conversion
            **kwargs: Additional HDF5 file options
        """

    def write(self, container, **kwargs):
        """
        Write container to HDF5 file.

        Args:
            container: Container object to write
            **kwargs: Write options including:
                - cache_spec: Whether to cache specification (default: True)
                - exhaust_dci: Whether to exhaust data chunk iterators
                - link_data: Whether to link external data
        """

    def read(self, **kwargs):
        """
        Read container from HDF5 file.

        Args:
            **kwargs: Read options

        Returns:
            Container object loaded from file
        """

    def export(self, src_io, container, **kwargs):
        """
        Export container from another I/O source to this HDF5 file.

        Args:
            src_io: Source I/O object
            container: Container to export
        """

    def close(self):
        """Close HDF5 file and release resources."""

    @property
    def file(self):
        """Access to underlying h5py File object."""
```
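
`export` is the one method above that the usage examples below do not cover; a short sketch of copying a read container into a fresh HDF5 file, with illustrative file names:

```python
from hdmf.backends.hdf5 import HDF5IO

# Read from the source file and export its contents into a new HDF5 file.
with HDF5IO('source.h5', mode='r') as src_io:
    container = src_io.read()
    with HDF5IO('exported.h5', mode='w') as dst_io:
        dst_io.export(src_io=src_io, container=container)
```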

### HDF5 Data I/O Configuration

Configuration wrapper for customizing how data is written to HDF5 files.

```python { .api }
class H5DataIO:
    """
    HDF5 data I/O configuration wrapper for controlling storage options.

    Provides fine-grained control over compression, chunking, filtering,
    and other HDF5 dataset creation properties.
    """

    def __init__(self, data, **kwargs):
        """
        Initialize H5DataIO wrapper.

        Args:
            data: Data to be written
            **kwargs: HDF5 dataset creation options:
                - compression: Compression filter ('gzip', 'lzf', 'szip')
                - compression_opts: Compression level (0-9 for gzip)
                - shuffle: Enable shuffle filter for better compression
                - fletcher32: Enable Fletcher32 checksum filter
                - chunks: Chunk shape for datasets
                - maxshape: Maximum shape for resizable datasets
                - fillvalue: Fill value for uninitialized data
                - track_times: Track dataset creation/modification times
        """

    @property
    def data(self):
        """Access to wrapped data."""

    @property
    def io_settings(self) -> dict:
        """Dictionary of I/O settings for this data."""
```
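
Wrapping an array changes how it is stored, not how it is passed around; the wrapper keeps the original data reachable and collects the storage options into a plain dictionary. A small sketch with illustrative settings:

```python
import numpy as np
from hdmf.backends.hdf5 import H5DataIO

wrapped = H5DataIO(
    data=np.arange(1000).reshape(100, 10),
    compression='gzip',
    chunks=(10, 10),
)

# The wrapped array and the configured options remain inspectable
# before the data is ever written.
print(wrapped.data.shape)   # (100, 10)
print(wrapped.io_settings)  # e.g. {'compression': 'gzip', 'chunks': (10, 10)}
```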

### HDF5 Specification I/O

Specialized classes for reading and writing HDMF specifications to HDF5 files.

```python { .api }
class H5SpecWriter:
    """
    Writer for HDMF specifications in HDF5 format.

    Handles storage of namespace and specification information within HDF5 files.
    """

    def __init__(self, io: HDF5IO):
        """
        Initialize specification writer.

        Args:
            io: HDF5IO object for file access
        """

    def write_spec(self, spec_catalog, spec_namespace):
        """
        Write specification catalog and namespace to HDF5 file.

        Args:
            spec_catalog: Specification catalog to write
            spec_namespace: Namespace information
        """

class H5SpecReader:
    """
    Reader for HDMF specifications from HDF5 format.

    Loads namespace and specification information from HDF5 files.
    """

    def __init__(self, io: HDF5IO):
        """
        Initialize specification reader.

        Args:
            io: HDF5IO object for file access
        """

    def read_spec(self) -> tuple:
        """
        Read specification from HDF5 file.

        Returns:
            Tuple of (spec_catalog, spec_namespace)
        """
```
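
These classes are normally driven by `HDF5IO` itself (caching the specification is what `write(..., cache_spec=True)` triggers), but they can also be used directly. The sketch below follows the interface documented above; the top-level import path for `H5SpecWriter`/`H5SpecReader` is an assumption, and the catalog and namespace objects are supplied by the caller:

```python
from hdmf.backends.hdf5 import HDF5IO, H5SpecWriter, H5SpecReader

def cache_and_reload_spec(path, spec_catalog, spec_namespace):
    """Store a specification in an existing HDMF file, then read it back."""
    # Write the catalog and namespace into the file's cached-spec location.
    with HDF5IO(path, mode='a') as io:
        H5SpecWriter(io).write_spec(spec_catalog, spec_namespace)

    # Load the cached specification back out of the same file.
    with HDF5IO(path, mode='r') as io:
        return H5SpecReader(io).read_spec()
```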

### HDF5 Utilities and Tools

Utility functions and tools for working with HDF5 files and datasets.

```python { .api }
class H5Dataset:
    """
    Wrapper for HDF5 datasets providing enhanced functionality.

    Adds HDMF-specific features to h5py dataset objects including
    lazy loading, data transformation, and metadata handling.
    """

    def __init__(self, dataset, io: HDF5IO, **kwargs):
        """
        Initialize H5Dataset wrapper.

        Args:
            dataset: h5py dataset object
            io: Parent HDF5IO object
        """

    def __getitem__(self, key):
        """Get data slice from dataset."""

    def __setitem__(self, key, value):
        """Set data slice in dataset."""

    @property
    def shape(self) -> tuple:
        """Shape of the dataset."""

    @property
    def dtype(self):
        """Data type of the dataset."""

    @property
    def size(self) -> int:
        """Total number of elements in dataset."""

# HDF5 utility functions
def get_h5_version() -> str:
    """
    Get HDF5 library version.

    Returns:
        HDF5 version string
    """

def check_h5_version(min_version: str = None) -> bool:
    """
    Check if HDF5 version meets minimum requirements.

    Args:
        min_version: Minimum required version

    Returns:
        True if version is sufficient
    """
```
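
The version helpers are mainly useful as a guard before relying on behaviour of newer HDF5 releases; a short sketch, assuming the functions behave as documented above and are importable alongside the other HDF5 helpers (the threshold version is illustrative):

```python
from hdmf.backends.hdf5 import get_h5_version, check_h5_version

print(f"HDF5 library version: {get_h5_version()}")

# Only enable options that need a newer HDF5 than may be installed.
if not check_h5_version('1.10.0'):
    print("HDF5 >= 1.10.0 not available; using conservative storage options")
```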

## Usage Examples

### Basic HDF5 I/O Operations

```python
from hdmf.backends.hdf5 import HDF5IO, H5DataIO
from hdmf import Container, Data
import numpy as np

# Create sample data
data_array = np.random.randn(1000, 100)
data_container = Data(name='neural_data', data=data_array)

container = Container(name='experiment')
container.add_child(data_container)

# Write to HDF5 file
with HDF5IO('experiment.h5', mode='w') as io:
    io.write(container)

# Read from HDF5 file
with HDF5IO('experiment.h5', mode='r') as io:
    read_container = io.read()
    print(f"Container: {read_container.name}")
    print(f"Data shape: {read_container.neural_data.shape}")
```

### Advanced HDF5 Data Configuration

```python
from hdmf.backends.hdf5 import HDF5IO, H5DataIO
from hdmf import Container, Data
import numpy as np

# Create large dataset with compression
large_data = np.random.randn(10000, 1000)

# Configure compression and chunking
compressed_data = H5DataIO(
    data=large_data,
    compression='gzip',
    compression_opts=9,     # Maximum compression
    shuffle=True,           # Better compression for numeric data
    fletcher32=True,        # Checksums for data integrity
    chunks=(1000, 100),     # Chunk size for efficient access
    maxshape=(None, 1000)   # Allow resizing along first dimension
)

data_container = Data(name='compressed_data', data=compressed_data)
container = Container(name='experiment')
container.add_child(data_container)

# Write with advanced options
with HDF5IO('compressed_experiment.h5', mode='w') as io:
    io.write(container, cache_spec=True, exhaust_dci=False)
```

### Working with External Data Links

```python
from hdmf.backends.hdf5 import HDF5IO, H5DataIO
from hdmf import Container, Data

# Create external data reference
external_data = H5DataIO(
    data='path/to/external/data.h5',
    link_data=True  # Link instead of copying
)

data_container = Data(name='external_data', data=external_data)
container = Container(name='experiment')
container.add_child(data_container)

# Write with external links
with HDF5IO('main_file.h5', mode='w') as io:
    io.write(container, link_data=True)
```

### Reading Subsets of Large Datasets

```python
from hdmf.backends.hdf5 import HDF5IO

# Open file in read mode
with HDF5IO('large_experiment.h5', mode='r') as io:
    container = io.read()

    # Access dataset without loading all data
    dataset = container.neural_data.data

    # Read specific slices
    first_100_samples = dataset[:100, :]
    specific_channels = dataset[:, [0, 5, 10]]
    time_window = dataset[1000:2000, :]

    print(f"Dataset shape: {dataset.shape}")
    print(f"Slice shape: {first_100_samples.shape}")
```

### Appending Data to Existing Files

```python
from hdmf.backends.hdf5 import HDF5IO, H5DataIO
from hdmf import Container, Data
import numpy as np

# Initial data with resizable configuration
initial_data = H5DataIO(
    data=np.random.randn(100, 50),
    maxshape=(None, 50),  # Allow growth along first dimension
    chunks=(10, 50)
)

data_container = Data(name='growing_data', data=initial_data)
container = Container(name='experiment')
container.add_child(data_container)

# Write initial data
with HDF5IO('growing_experiment.h5', mode='w') as io:
    io.write(container)

# Append new data
with HDF5IO('growing_experiment.h5', mode='a') as io:
    container = io.read()
    new_data = np.random.randn(50, 50)

    # Append to existing dataset
    container.growing_data.append(new_data)

    # Write updated container
    io.write(container)
```

### Cross-Platform File Operations

```python
from hdmf.backends.hdf5 import HDF5IO
import os

def process_hdmf_file(input_path: str, output_path: str):
    """Process an HDMF file across different platforms."""

    # Read from any platform
    with HDF5IO(input_path, mode='r') as src_io:
        container = src_io.read()

        # Process data while the source file is still open
        for child in container.children:
            if hasattr(child, 'data'):
                # Apply processing to data
                processed_data = child.data * 1.5
                child.data = processed_data

        # Write to new location
        with HDF5IO(output_path, mode='w') as dst_io:
            dst_io.write(container, cache_spec=True)

    print(f"Processed file written to: {output_path}")

# Cross-platform usage
if os.name == 'nt':  # Windows
    input_file = r'C:\data\experiment.h5'
    output_file = r'C:\processed\experiment_processed.h5'
else:  # Unix-like systems
    input_file = '/data/experiment.h5'
    output_file = '/processed/experiment_processed.h5'

process_hdmf_file(input_file, output_file)
```