# File Operations

Core file management functionality for PyTables, providing file opening, creation, copying, and validation, with configuration options for optimization and data integrity.

## Capabilities

### Opening and Creating Files

Opens existing PyTables/HDF5 files or creates new ones with specified access modes, compression settings, and configuration options.

```python { .api }
def open_file(filename, mode="r", title="", root_uep="/", filters=None, **kwargs):
    """
    Open a PyTables (HDF5) file.

    Parameters:
    - filename (str): Path to the file
    - mode (str): File access mode - "r" (read-only), "w" (write; truncates an
      existing file), "a" (append; creates the file if missing), "r+" (read/write;
      the file must already exist)
    - title (str): User-defined title for the root group
    - root_uep (str): Root user entry point path (default "/")
    - filters (Filters): Default compression filters for new nodes
    - **kwargs: Additional parameters (driver, libver, swmr, etc.)

    Returns:
    File: PyTables File object
    """
```

### File Copying and Optimization

Copies PyTables files with optional filtering, optimization, and format conversion.

```python { .api }
def copy_file(srcfilename, dstfilename, overwrite=False, **kwargs):
    """
    Copy a PyTables file, possibly converting between different formats.

    Parameters:
    - srcfilename (str): Source file path
    - dstfilename (str): Destination file path
    - overwrite (bool): Whether to overwrite an existing destination file
    - **kwargs: Additional options (filters, upgrade, etc.)

    Returns:
    None
    """
```
47
48
### File Validation
49
50
Tests whether files are valid HDF5 or PyTables files.
51
52
```python { .api }
53
def is_hdf5_file(filename):
54
"""
55
Test if a file is a valid HDF5 file.
56
57
Parameters:
58
- filename (str): Path to file to test
59
60
Returns:
61
bool: True if file is valid HDF5, False otherwise
62
"""
63
64
def is_pytables_file(filename):
65
"""
66
Test if a file is a valid PyTables file.
67
68
Parameters:
69
- filename (str): Path to file to test
70
71
Returns:
72
bool: True if file is valid PyTables, False otherwise
73
"""
74
```
75
76
### Library Version Information
77
78
Retrieves version information for underlying libraries.
79
80
```python { .api }
81
def which_lib_version(name):
82
"""
83
Get version information for libraries used by PyTables.
84
85
Parameters:
86
- name (str): Library name ("hdf5", "blosc", "blosc2", etc.)
87
88
Returns:
89
str: Version string for specified library
90
"""
91
```
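
A brief sketch of how the validation and version helpers above might be used together; the exact shape of `which_lib_version`'s return value can vary across PyTables versions, so the example only tests availability and prints whatever is returned:

```python
import tables as tb

# Report the HDF5 library PyTables was built against.
# which_lib_version returns version info for the named library,
# or None when that library is not available in this build.
info = tb.which_lib_version("hdf5")
if info is not None:
    print("HDF5:", info)

# Optional compressors can be probed the same way before relying on them.
print("blosc available:", tb.which_lib_version("blosc") is not None)
```

HDF5 is mandatory for PyTables, so `which_lib_version("hdf5")` should never be None; compressors like blosc are the ones worth probing at runtime.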

## File Object Methods

### Core File Operations

```python { .api }
class File:
    def close(self):
        """Close the file and flush all pending data."""

    def flush(self):
        """Flush all pending data to disk."""

    def __enter__(self):
        """Context manager entry."""

    def __exit__(self, *args):
        """Context manager exit with automatic cleanup."""
```

### Node Creation Methods

```python { .api }
class File:
    def create_group(self, where, name, title="", filters=None, createparents=False):
        """
        Create a new group in the hierarchy.

        Parameters:
        - where (str or Group): Parent location
        - name (str): Name for the new group
        - title (str): Descriptive title
        - filters (Filters): Default filters for child nodes
        - createparents (bool): Create intermediate groups if needed

        Returns:
        Group: The created group object
        """

    def create_table(self, where, name, description, title="", filters=None, expectedrows=10000, createparents=False, **kwargs):
        """
        Create a new table for structured data.

        Parameters:
        - where (str or Group): Parent location
        - name (str): Table name
        - description (IsDescription subclass or dict): Table structure definition
        - title (str): Descriptive title
        - filters (Filters): Compression and filtering options
        - expectedrows (int): Expected number of rows, used to optimize chunking
        - createparents (bool): Create intermediate groups if needed

        Returns:
        Table: The created table object
        """

    def create_array(self, where, name, object, title="", byteorder=None, createparents=False):
        """
        Create a new array for homogeneous data.

        Parameters:
        - where (str or Group): Parent location
        - name (str): Array name
        - object (array-like): Initial data
        - title (str): Descriptive title
        - byteorder (str): Byte order specification
        - createparents (bool): Create intermediate groups if needed

        Returns:
        Array: The created array object
        """

    def create_carray(self, where, name, atom, shape, title="", filters=None, chunkshape=None, byteorder=None, createparents=False, **kwargs):
        """
        Create a chunked array for large datasets.

        Parameters:
        - where (str or Group): Parent location
        - name (str): Array name
        - atom (Atom): Data type specification
        - shape (tuple): Array dimensions
        - title (str): Descriptive title
        - filters (Filters): Compression options
        - chunkshape (tuple): Chunk dimensions for I/O optimization
        - byteorder (str): Byte order specification
        - createparents (bool): Create intermediate groups if needed

        Returns:
        CArray: The created chunked array object
        """

    def create_earray(self, where, name, atom, shape, title="", filters=None, expectedrows=1000, chunkshape=None, byteorder=None, createparents=False):
        """
        Create an enlargeable array.

        Parameters:
        - where (str or Group): Parent location
        - name (str): Array name
        - atom (Atom): Data type specification
        - shape (tuple): Initial shape (exactly one dimension must be 0; that
          dimension is the one the array grows along)
        - title (str): Descriptive title
        - filters (Filters): Compression options
        - expectedrows (int): Expected final number of rows, used to optimize chunking
        - chunkshape (tuple): Chunk dimensions
        - byteorder (str): Byte order specification
        - createparents (bool): Create intermediate groups if needed

        Returns:
        EArray: The created enlargeable array object
        """

    def create_vlarray(self, where, name, atom, title="", filters=None, expectedrows=1000, chunkshape=None, byteorder=None, createparents=False):
        """
        Create a variable-length array.

        Parameters:
        - where (str or Group): Parent location
        - name (str): Array name
        - atom (Atom): Data type for array elements
        - title (str): Descriptive title
        - filters (Filters): Compression options
        - expectedrows (int): Expected number of rows
        - chunkshape (int): Chunk size
        - byteorder (str): Byte order specification
        - createparents (bool): Create intermediate groups if needed

        Returns:
        VLArray: The created variable-length array object
        """
```
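
As a sketch of the enlargeable-array workflow described above (the filename and node name here are illustrative): an `EArray` is created with one zero-length dimension and then grown with `append`, which extends exactly that dimension.

```python
import numpy as np
import tables as tb

with tb.open_file("grow.h5", mode="w") as h5:
    # shape=(0, 3): the 0-length first dimension is the one append() extends;
    # every appended block must match the remaining dimensions (here, 3 columns).
    ea = h5.create_earray("/", "samples", atom=tb.Float64Atom(),
                          shape=(0, 3), expectedrows=1000)
    ea.append(np.zeros((5, 3)))
    ea.append(np.ones((2, 3)))
    print(ea.nrows)  # 7
```

Passing a realistic `expectedrows` lets PyTables pick a chunk shape suited to the final size, which matters for both speed and file size on large arrays.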

### Node Access and Management

```python { .api }
class File:
    def get_node(self, where, name=None, classname=None):
        """
        Retrieve a node from the hierarchy.

        Parameters:
        - where (str): Path to the node, or its parent location
        - name (str): Node name (if where is the parent)
        - classname (str): Expected node class name, for validation

        Returns:
        Node: The retrieved node object
        """

    def remove_node(self, where, name=None, recursive=False):
        """
        Remove a node from the hierarchy.

        Parameters:
        - where (str): Path to the node, or its parent location
        - name (str): Node name (if where is the parent)
        - recursive (bool): For Groups, remove children recursively
        """

    def move_node(self, where, newparent=None, newname=None, name=None, overwrite=False, createparents=False):
        """
        Move a node to a different location in the hierarchy.

        Parameters:
        - where (str): Current path to the node, or its parent location
        - newparent (str): New parent location
        - newname (str): New node name
        - name (str): Node name (if where is the parent)
        - overwrite (bool): Overwrite an existing node at the destination
        - createparents (bool): Create intermediate groups if needed
        """

    def copy_node(self, where, newparent=None, newname=None, name=None, overwrite=False, recursive=False, createparents=False, **kwargs):
        """
        Copy a node to a different location.

        Parameters:
        - where (str): Current path to the node, or its parent location
        - newparent (str): New parent location
        - newname (str): New node name
        - name (str): Node name (if where is the parent)
        - overwrite (bool): Overwrite an existing node at the destination
        - recursive (bool): For Groups, copy children recursively
        - createparents (bool): Create intermediate groups if needed
        - **kwargs: Additional copy options (filters, etc.)

        Returns:
        Node: The copied node object
        """
```
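
A minimal sketch tying the node-management methods together (filename and node names are illustrative): retrieve a node two equivalent ways, move it under a new parent that is created on the fly, copy a group recursively, and remove the original.

```python
import numpy as np
import tables as tb

with tb.open_file("nodes.h5", mode="w") as h5:
    h5.create_group("/", "raw")
    h5.create_array("/raw", "a", np.arange(10))

    # Retrieve either by full path or by parent + name
    node = h5.get_node("/raw/a")
    same = h5.get_node("/raw", name="a")

    # Move the array under a new group, creating the parent as needed
    h5.move_node("/raw/a", newparent="/processed", createparents=True)

    # Copy the whole group recursively, then drop the now-empty original
    h5.copy_node("/processed", newparent="/", newname="backup", recursive=True)
    h5.remove_node("/raw", recursive=True)
```

After this runs, the file contains `/processed/a` and `/backup/a`, and `/raw` is gone.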

### Tree Traversal

```python { .api }
class File:
    def walk_nodes(self, where="/", classname=None):
        """
        Iterate over all nodes under a location in the hierarchy.

        Parameters:
        - where (str): Starting location for the traversal
        - classname (str): Yield only nodes of this class (e.g. "Table")

        Yields:
        Node: Each node in traversal order
        """

    def walk_groups(self, where="/"):
        """
        Iterate over all groups under a location in the hierarchy.

        Parameters:
        - where (str): Starting location for the traversal

        Yields:
        Group: Each group in traversal order
        """
```
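
A short sketch of the traversal methods above (filename and node names are illustrative); `_v_pathname` is the node attribute holding a node's full path:

```python
import numpy as np
import tables as tb

with tb.open_file("walk.h5", mode="w") as h5:
    g = h5.create_group("/", "results")
    h5.create_array(g, "a", np.arange(3))
    h5.create_array("/", "b", np.arange(4))

    # All Array nodes anywhere in the file, regardless of depth
    arrays = [node._v_pathname for node in h5.walk_nodes("/", classname="Array")]

    # Every group, including the starting group itself
    groups = [grp._v_pathname for grp in h5.walk_groups("/")]

    print(sorted(arrays))  # ['/b', '/results/a']
    print(sorted(groups))  # ['/', '/results']
```

Note that `walk_groups` yields the starting group first, so the root group `/` appears in its own traversal.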

## Usage Examples

### Basic File Operations

```python
import tables as tb
import numpy as np

# A simple table description: one float and one int column
class MyDescription(tb.IsDescription):
    timestamp = tb.Float64Col()
    value = tb.Int32Col()

# Create a new file
with tb.open_file("data.h5", mode="w", title="Research Data") as h5file:
    # Create a hierarchical structure
    group = h5file.create_group("/", "experiment1", "First Experiment")

    # Create different kinds of data storage
    table = h5file.create_table(group, "measurements", MyDescription)
    array = h5file.create_array(group, "raw_data", np.random.random((100, 100)))

    # The file is closed automatically when the context exits

# Copy the file, applying compression
tb.copy_file("data.h5", "compressed_data.h5",
             filters=tb.Filters(complevel=6, complib="blosc"))

# Validate files
if tb.is_pytables_file("data.h5"):
    print("Valid PyTables file")
```