# PyTables

A comprehensive Python library for managing hierarchical datasets, designed to cope efficiently with extremely large amounts of data. PyTables is built on top of the HDF5 library and NumPy, combining an object-oriented interface with Cython-generated C extensions for performance-critical operations. It provides fast interactive data storage and retrieval with advanced compression, indexing, and querying features optimized for scientific computing and data-analysis workflows.

## Package Information

- **Package Name**: tables
- **Language**: Python
- **Installation**: `pip install tables`

## Core Imports

```python
import tables
```

Most examples use the shorter alias:

```python
import tables as tb
```

For specific functionality:

```python
from tables import open_file, File, Group, Table, Array
from tables import StringCol, IntCol, FloatCol  # Column types
from tables import Filters                      # Compression
```

## Basic Usage

```python
import tables as tb
import numpy as np

# Open/create an HDF5 file
h5file = tb.open_file("example.h5", mode="w", title="Example File")

# Create a group for organization
group = h5file.create_group("/", "detector", "Detector Information")

# Describe a table with structured data
class Particle(tb.IsDescription):
    name = tb.StringCol(16)      # 16-character string
    idnumber = tb.Int64Col()     # Signed 64-bit integer
    ADCcount = tb.UInt16Col()    # Unsigned 16-bit integer
    TDCcount = tb.UInt8Col()     # Unsigned 8-bit integer
    energy = tb.Float32Col()     # 32-bit floating point
    timestamp = tb.Time64Col()   # Timestamp

table = h5file.create_table(group, 'readout', Particle, "Readout example")

# Add data to the table
particle = table.row
for i in range(10):
    particle['name'] = f'Particle: {i:6d}'
    particle['TDCcount'] = i % 256
    particle['ADCcount'] = np.random.randint(0, 65535)
    particle['energy'] = np.random.random()
    particle['timestamp'] = i * 1.0
    particle.append()
table.flush()

# Create an array for homogeneous data
array_c = h5file.create_array(group, 'array_c', np.arange(100), "Array C")

# Query data (copy each match; the Row object is reused by the iterator)
results = [row.fetch_all_fields() for row in table.where('TDCcount > 5')]

# Close the file
h5file.close()
```

## Architecture

PyTables implements a hierarchical tree structure similar to a filesystem:

- **File**: Top-level container managing the entire HDF5 file and providing undo/redo support
- **Groups**: Directory-like containers that organize nodes in a hierarchical namespace
- **Leaves**: Data-containing nodes, including Tables (structured data), Arrays (homogeneous data), and VLArrays (variable-length data)
- **Attributes**: Metadata attached to any node for storing small auxiliary information
- **Indexes**: B-tree and other indexing structures for fast data retrieval and querying

The design emphasizes memory efficiency, disk optimization, and seamless integration with NumPy arrays, while the undo/redo mechanism provides transaction-like safeguards for changes to the object tree.
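
To make the tree model concrete, here is a minimal sketch of groups, leaves, attributes, and natural-naming navigation. The file path and node names (`run1`, `counts`, `units`) are invented for the example:

```python
import os
import tempfile

import numpy as np
import tables as tb

# A scratch file in a temporary directory (path is arbitrary)
path = os.path.join(tempfile.mkdtemp(), "tree.h5")
h5 = tb.open_file(path, mode="w", title="Tree demo")

run1 = h5.create_group("/", "run1", "First run")        # a Group node
counts = h5.create_array(run1, "counts", np.arange(5))  # a Leaf node
counts.attrs.units = "photons"                          # attribute metadata on a node

# Natural naming: the hierarchy is navigable via ordinary attribute access
values = h5.root.run1.counts.read().tolist()
units = h5.root.run1.counts.attrs.units
h5.close()
```

Natural naming keeps interactive exploration terse; `h5.get_node("/run1/counts")` is the string-path equivalent.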

## Capabilities

### File Operations

Core file management including opening, creating, copying, and validating PyTables/HDF5 files, with comprehensive mode control and optimization options.

```python { .api }
def open_file(filename, mode="r", title="", root_uep="/", filters=None, **kwargs): ...
def copy_file(srcfilename, dstfilename, overwrite=False, **kwargs): ...
def is_hdf5_file(filename): ...
def is_pytables_file(filename): ...
```

[File Operations](./file-operations.md)
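
A short sketch of how these helpers fit together; the file names are invented for the example:

```python
import os
import tempfile

import tables as tb

# Create an empty PyTables file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "demo.h5")
h5 = tb.open_file(path, mode="w", title="Demo")
h5.close()

ok_hdf5 = tb.is_hdf5_file(path)    # True for any valid HDF5 file
ok_pt = tb.is_pytables_file(path)  # truthy (format version) for PyTables files

# copy_file rewrites the whole file, optionally changing filters along the way
copy_path = os.path.join(os.path.dirname(path), "copy.h5")
tb.copy_file(path, copy_path, overwrite=True)
```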

### Hierarchical Organization

Group-based hierarchical organization for structuring datasets in tree-like namespaces, with directory-style navigation and node management.

```python { .api }
class Group:
    def _f_walknodes(self, classname=None): ...
    def _f_list_nodes(self, classname=None): ...
    def __contains__(self, name): ...
    def __getitem__(self, name): ...
```

[Groups and Navigation](./groups-navigation.md)
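
A minimal navigation sketch, assuming invented node names (`data`, `a`, `b`); it uses `Group._f_list_nodes`, `Group.__contains__`, and the string-path lookup `File.get_node`:

```python
import os
import tempfile

import numpy as np
import tables as tb

path = os.path.join(tempfile.mkdtemp(), "nav.h5")
h5 = tb.open_file(path, mode="w")
g = h5.create_group("/", "data")
h5.create_array(g, "a", np.arange(3))
h5.create_array(g, "b", np.arange(4))

# List direct children of a group
names = sorted(n.name for n in h5.root.data._f_list_nodes())

has_a = "a" in h5.root.data                 # Group.__contains__
node_b_name = h5.get_node("/data/b").name   # path-based lookup
h5.close()
```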

### Structured Data Storage

Table-based structured data storage with column-oriented access, conditional querying, indexing, and modification capabilities for record-based datasets.

```python { .api }
class Table:
    def read(self, start=None, stop=None, step=None, field=None, out=None): ...
    def read_where(self, condition, condvars=None, **kwargs): ...
    def where(self, condition, condvars=None, start=None, stop=None): ...
    def append(self, rows): ...
    def modify_column(self, start=None, stop=None, step=None, column=None, value=None): ...
```

[Tables and Structured Data](./tables-structured-data.md)
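
A minimal round trip through `append` and `read_where`; the table name and columns (`sensor`, `value`) are invented for the example:

```python
import os
import tempfile

import tables as tb

path = os.path.join(tempfile.mkdtemp(), "tbl.h5")
h5 = tb.open_file(path, mode="w")

class Reading(tb.IsDescription):
    sensor = tb.StringCol(8, pos=0)
    value = tb.Float64Col(pos=1)

t = h5.create_table("/", "readings", Reading)
t.append([("s1", 1.5), ("s2", 2.5), ("s1", 3.5)])  # append a sequence of rows
t.flush()

# read_where evaluates the condition string and returns the matching records
hot = t.read_where("value > 2.0")
n_hot = len(hot)
h5.close()
```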

### Array Data Storage

Array-based homogeneous data storage, including standard arrays, chunked arrays, enlargeable arrays, and variable-length arrays with NumPy integration.

```python { .api }
class Array:
    def read(self, start=None, stop=None, step=None, out=None): ...
    def __getitem__(self, key): ...
    def __setitem__(self, key, value): ...

class EArray:
    def append(self, sequence): ...
    def read(self, start=None, stop=None, step=None, out=None): ...
```

[Arrays and Homogeneous Data](./arrays-homogeneous-data.md)
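
A sketch of an enlargeable array: a `0` in the shape marks the dimension that `append` grows along. File and node names are invented:

```python
import os
import tempfile

import numpy as np
import tables as tb

path = os.path.join(tempfile.mkdtemp(), "arr.h5")
h5 = tb.open_file(path, mode="w")

# The first dimension (0) is enlargeable; rows of width 3 can be appended
ea = h5.create_earray("/", "samples", atom=tb.Float64Atom(), shape=(0, 3))
ea.append(np.zeros((2, 3)))
ea.append(np.ones((4, 3)))

shape = ea.shape            # grows to (6, 3)
first_row = ea[0].tolist()  # NumPy-style slicing via __getitem__
h5.close()
```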

### Type System and Descriptions

Comprehensive type system with Atom types for individual data elements and Column types for table structure definitions, supporting all NumPy data types plus specialized types.

```python { .api }
class IsDescription: ...

# Atom types
class StringAtom: ...
class IntAtom: ...
class FloatAtom: ...
class TimeAtom: ...

# Column types
class StringCol: ...
class IntCol: ...
class FloatCol: ...
class TimeCol: ...
```

[Type System and Descriptions](./type-system-descriptions.md)
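
A small sketch of the distinction: Atoms describe array elements, while Cols additionally carry table-description metadata such as a default (`dflt`) and an explicit column position (`pos`). The dict-based description shown is an alternative to subclassing `IsDescription`:

```python
import tables as tb

atom = tb.Float64Atom()                # element type for arrays
col = tb.Float64Col(dflt=0.0, pos=1)   # same type, but usable in a description

# A table description can also be a plain dict of name -> Col
desc = {
    "name": tb.StringCol(16, pos=0),
    "energy": tb.Float64Col(pos=1),
}

atom_kind = atom.kind    # type family, e.g. 'float'
col_size = col.itemsize  # bytes per element
```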

### Compression and Filtering

Advanced compression and filtering system supporting multiple algorithms (zlib, blosc, blosc2, bzip2, lzo) with configurable parameters for optimal storage and I/O performance.

```python { .api }
class Filters:
    def __init__(self, complevel=0, complib="zlib", shuffle=True, bitshuffle=False, fletcher32=False): ...

def set_blosc_max_threads(nthreads): ...
def set_blosc2_max_threads(nthreads): ...
```

[Compression and Filtering](./compression-filtering.md)
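
A compression sketch using a chunked array. It sticks to zlib, which is always available (blosc/lzo availability depends on the build); names are invented:

```python
import os
import tempfile

import numpy as np
import tables as tb

# complevel 0 disables compression; 1-9 trade speed for ratio
filters = tb.Filters(complevel=5, complib="zlib", shuffle=True)

path = os.path.join(tempfile.mkdtemp(), "comp.h5")
h5 = tb.open_file(path, mode="w")
ca = h5.create_carray("/", "zeros", atom=tb.Int64Atom(), shape=(1000,),
                      filters=filters)
ca[:] = np.zeros(1000, dtype=np.int64)

stored_complib = ca.filters.complib  # filters travel with the node
h5.close()
```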

### Querying and Indexing

Expression-based querying system with compiled expressions, B-tree indexing, and conditional iteration for efficient data retrieval from large datasets.

```python { .api }
class Expr:
    def eval(self): ...
    def append(self, expr): ...

# Index management (Column methods, accessed via table.cols.<name>)
def create_index(self, **kwargs): ...
def remove_index(self): ...
def reindex(self): ...
```

[Querying and Indexing](./querying-indexing.md)
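
A sketch of indexed querying: the index is built on a single column via `table.cols.<name>.create_index()`, and subsequent `where()` conditions on that column can use it. Table and column names are invented:

```python
import os
import tempfile

import tables as tb

path = os.path.join(tempfile.mkdtemp(), "idx.h5")
h5 = tb.open_file(path, mode="w")

class Point(tb.IsDescription):
    x = tb.Int64Col()

t = h5.create_table("/", "points", Point)
t.append([(i,) for i in range(100)])
t.flush()

t.cols.x.create_index()  # build the index on column 'x'

# Conditional iteration; matching rows come back in row order
matches = [r["x"] for r in t.where("x > 95")]
h5.close()
```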

### Transaction System

Undo/redo system with named marks and rollback capabilities, providing database-like protection against mistaken operations on the object tree.

```python { .api }
class File:
    def enable_undo(self, filters=None): ...
    def disable_undo(self): ...
    def mark(self, name=None): ...
    def undo(self, mark=None): ...
    def redo(self, mark=None): ...
```

[Transactions and Undo/Redo](./transactions-undo-redo.md)
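
A sketch of marks and rollback. Undo/redo tracks operations on the object tree (creating, moving, renaming nodes) between marks; the mark names and node names here are invented:

```python
import os
import tempfile

import numpy as np
import tables as tb

path = os.path.join(tempfile.mkdtemp(), "undo.h5")
h5 = tb.open_file(path, mode="w")
h5.enable_undo()

h5.mark("empty")
h5.create_array("/", "scratch", np.arange(3))
h5.mark("created")
existed = "/scratch" in h5   # File.__contains__ checks node paths

h5.undo("empty")             # roll back to the named mark
gone = "/scratch" not in h5

h5.redo("created")           # replay forward to a later mark
back = "/scratch" in h5

h5.disable_undo()
h5.close()
```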

## Types

```python { .api }
class File:
    """Main PyTables file interface."""
    def __init__(self, filename, mode="r", title="", root_uep="/", filters=None, **kwargs): ...
    def close(self): ...
    def flush(self): ...
    def create_group(self, where, name, title="", filters=None, createparents=False): ...
    def create_table(self, where, name, description, title="", filters=None, expectedrows=10000, createparents=False, byteorder=None, **kwargs): ...
    def create_array(self, where, name, obj, title="", byteorder=None, createparents=False): ...

class Node:
    """Base class for all PyTables nodes."""
    def _f_close(self): ...
    def _f_copy(self, newparent=None, newname=None, overwrite=False, recursive=False, createparents=False, **kwargs): ...
    def _f_move(self, newparent=None, newname=None, overwrite=False, createparents=False): ...
    def _f_remove(self): ...
    def _f_rename(self, newname): ...

class IsDescription:
    """Base class for table descriptions."""

class UnImplemented(Leaf):
    """
    Represents datasets not supported by PyTables in generic HDF5 files.

    Used when PyTables encounters HDF5 datasets with unsupported datatype
    or dataspace combinations. Allows access to metadata and attributes
    but not the actual data.
    """

class Unknown(Leaf):
    """
    Represents unknown node types in HDF5 files.

    Used as a fallback for HDF5 nodes that cannot be classified
    into any supported PyTables category.
    """

FilterProperties = dict[str, Any]
"""Dictionary containing filter and compression properties."""

__version__: str
"""PyTables version string."""

hdf5_version: str
"""Underlying HDF5 library version string."""

class Enum:
    """
    Enumerated type for defining named value sets.

    Used to create enumerated types where variables can take one of a
    predefined set of named values. Each value has a name and a concrete value.
    """
    def __init__(self, enum_values):
        """
        Create an enumeration from a sequence or mapping.

        Parameters:
        - enum_values: Sequence of names or mapping of names to values
        """
```
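
A short `Enum` sketch: built from a sequence, names are auto-numbered from 0 in order; built from a mapping, the concrete values are explicit. Calling the enum performs the reverse (value-to-name) lookup:

```python
import tables as tb

colors = tb.Enum(["red", "green", "blue"])  # auto-numbered: red=0, green=1, blue=2
value = colors.red       # name -> concrete value
name = colors(value)     # concrete value -> name

state = tb.Enum({"off": 0, "on": 1})  # explicit mapping of names to values
on_value = state.on
```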

## Exceptions

```python { .api }
# Core exceptions
class HDF5ExtError(Exception):
    """Errors from the HDF5 library."""

class ClosedNodeError(ValueError):
    """Operations on closed nodes."""

class ClosedFileError(ValueError):
    """Operations on closed files."""

class FileModeError(ValueError):
    """Invalid file mode operations."""

class NodeError(AttributeError):
    """General node-related errors."""

class NoSuchNodeError(LookupError):
    """Access to non-existent nodes."""

# Specialized exceptions
class UndoRedoError(Exception):
    """Undo/redo system errors."""

class FlavorError(TypeError):
    """Data flavor conversion errors."""

class ChunkError(ValueError):
    """Chunking-related errors."""

class NotChunkedError(ChunkError):
    """Operations requiring a chunked layout."""

# Warning classes
class NaturalNameWarning(UserWarning):
    """Natural naming convention warnings."""

class PerformanceWarning(UserWarning):
    """Performance-related warnings."""

class DataTypeWarning(UserWarning):
    """Data type compatibility warnings."""
```
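
A sketch of two common failure modes, using invented paths: a missing node path raises `NoSuchNodeError`, and touching a file after `close()` raises `ClosedFileError`:

```python
import os
import tempfile

import tables as tb

path = os.path.join(tempfile.mkdtemp(), "err.h5")
h5 = tb.open_file(path, mode="w")

missing = False
try:
    h5.get_node("/does/not/exist")   # no such path in the file
except tb.NoSuchNodeError:
    missing = True

h5.close()
closed = False
try:
    h5.create_group("/", "late")     # file is already closed
except tb.ClosedFileError:
    closed = True
```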

## Utility Functions

```python { .api }
def test():
    """Run the PyTables test suite."""

def print_versions():
    """Print version information for PyTables and its dependencies."""

def silence_hdf5_messages():
    """Suppress HDF5 diagnostic messages."""

def restrict_flavors(keep=None):
    """
    Restrict available NumPy data flavors.

    Parameters:
    - keep (list): List of flavors to keep available
    """

def get_pytables_version():
    """
    Get the PyTables version string.

    Returns:
        str: PyTables version

    Note: deprecated; use tables.__version__ instead.
    """

def get_hdf5_version():
    """
    Get the HDF5 library version string.

    Returns:
        str: HDF5 version

    Note: deprecated; use tables.hdf5_version instead.
    """
```

## Command-Line Tools

PyTables provides several command-line utilities for file management and inspection:

- **ptdump**: Dumps PyTables file contents in a human-readable format
- **ptrepack**: Repacks PyTables files, with optimization and format conversion
- **pt2to3**: Migrates Python code from the PyTables 2.x API names to the 3.x API
- **pttree**: Displays a PyTables file's tree structure

These tools are installed alongside PyTables and can be run directly from the command line.