# pyxdf

A Python library for importing XDF (Extensible Data Format) files, which are commonly used in neuroscience and biosignal research. PyXDF provides a simple interface for loading multi-stream time-series data recorded with Lab Streaming Layer (LSL) systems. It supports multiple data formats and processing features such as clock synchronization and jitter removal.

## Package Information

- **Package Name**: pyxdf
- **Language**: Python
- **Installation**: `pip install pyxdf`
- **Requirements**: Python 3.9+, numpy>=2.0.2

## Core Imports

```python
import pyxdf
```

Direct function imports:

```python
from pyxdf import load_xdf, resolve_streams, match_streaminfos
```

Advanced imports (for low-level operations):

```python
from pyxdf.pyxdf import open_xdf, parse_xdf, parse_chunks
```

## Basic Usage

```python
import pyxdf
import matplotlib.pyplot as plt
import numpy as np

# Load an XDF file
streams, header = pyxdf.load_xdf("recording.xdf")

# Process each stream
for stream in streams:
    y = stream["time_series"]

    if isinstance(y, list):
        # String markers - draw vertical lines
        for timestamp, marker in zip(stream["time_stamps"], y):
            plt.axvline(x=timestamp)
            print(f'Marker "{marker[0]}" @ {timestamp:.2f}s')
    elif isinstance(y, np.ndarray):
        # Numeric data - plot as lines
        plt.plot(stream["time_stamps"], y)
    else:
        raise RuntimeError("Unknown stream format")

plt.show()
```

## Architecture

PyXDF implements the XDF (Extensible Data Format) specification and processes multi-stream recordings in terms of:

- **Chunks**: Atomic file units containing headers, samples, clock offsets, or footers
- **Streams**: Individual data sources with metadata, timestamps, and time-series data
- **Clock Synchronization**: Robust timestamp alignment across streams using ClockOffset chunks
- **Jitter Removal**: Regularization of sampling intervals for improved data quality

The library tolerates corrupted files, reads gzip-compressed files (`.xdfz`, `.xdf.gz`), and exposes processing options for research applications that require high temporal precision.

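To make the clock synchronization idea concrete, here is a minimal sketch. It is not PyXDF's implementation: it fits an ordinary least-squares line to the clock offset measurements and adds the predicted offset to each timestamp, whereas PyXDF uses a robust fit (see `winsor_threshold`). The names `fit_clock_offset` and `synchronize` are hypothetical, and the sample values are made up for illustration.

```python
def fit_clock_offset(clock_times, clock_values):
    """Fit offset(t) = intercept + slope * t by ordinary least squares."""
    n = len(clock_times)
    mean_t = sum(clock_times) / n
    mean_v = sum(clock_values) / n
    var_t = sum((t - mean_t) ** 2 for t in clock_times)
    if var_t == 0:
        # A single measurement (or identical times): constant offset
        return mean_v, 0.0
    cov = sum(
        (t - mean_t) * (v - mean_v) for t, v in zip(clock_times, clock_values)
    )
    slope = cov / var_t
    intercept = mean_v - slope * mean_t
    return intercept, slope


def synchronize(time_stamps, clock_times, clock_values):
    """Shift each timestamp by the offset predicted at that time."""
    intercept, slope = fit_clock_offset(clock_times, clock_values)
    return [t + intercept + slope * t for t in time_stamps]


# Illustrative values: an offset of about 5 s drifting by 0.1 ms per second
clock_times = [0.0, 10.0, 20.0]
clock_values = [5.0, 5.001, 5.002]
synced = synchronize([0.0, 10.0], clock_times, clock_values)
```

A linear model captures a constant offset plus a steady drift between two clocks; it does not handle clock resets, which is why `load_xdf` exposes separate `clock_reset_threshold_*` options.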
## Capabilities

### XDF File Loading

Core functionality for importing XDF files, with stream selection, clock synchronization, and timestamp correction.

```python { .api }
def load_xdf(
    filename,
    select_streams=None,
    *,
    on_chunk=None,
    synchronize_clocks=True,
    handle_clock_resets=True,
    dejitter_timestamps=True,
    jitter_break_threshold_seconds=1,
    jitter_break_threshold_samples=500,
    clock_reset_threshold_seconds=5,
    clock_reset_threshold_stds=5,
    clock_reset_threshold_offset_seconds=1,
    clock_reset_threshold_offset_stds=10,
    winsor_threshold=0.0001,
    verbose=None,
):
    """
    Import an XDF file with optional stream selection and processing.

    Args:
        filename (str | pathlib.Path): Path to XDF file (*.xdf or *.xdfz)
        select_streams (int | list[int] | list[dict] | None): Stream selection criteria
        on_chunk (callable, optional): Callback function for chunk processing
        synchronize_clocks (bool): Enable clock synchronization (default: True)
        handle_clock_resets (bool): Handle computer restarts during recording (default: True)
        dejitter_timestamps (bool): Perform jitter removal for regular streams (default: True)
        jitter_break_threshold_seconds (float): Break detection threshold in seconds (default: 1)
        jitter_break_threshold_samples (int): Break detection threshold in samples (default: 500)
        clock_reset_threshold_seconds (float): Clock reset detection threshold (default: 5)
        clock_reset_threshold_stds (float): Reset detection in standard deviations (default: 5)
        clock_reset_threshold_offset_seconds (float): Offset threshold for resets (default: 1)
        clock_reset_threshold_offset_stds (float): Offset threshold in stds (default: 10)
        winsor_threshold (float): Robust fitting threshold (default: 0.0001)
        verbose (bool | None): True sets the log level to DEBUG, False to WARNING;
            None leaves the log level unchanged (default: None)

    Returns:
        tuple[list[dict], dict]: (streams, fileheader)
            - streams: List of stream dictionaries
            - fileheader: File header metadata
    """
```

#### Stream Data Structure

Each stream in the returned list contains:

```python { .api }
# Stream dictionary structure
stream = {
    "time_series": Union[np.ndarray, list],  # Samples x channels array, or list of string markers
    "time_stamps": np.ndarray,               # Sample timestamps (synchronized)
    "info": {                                # Stream metadata
        "name": list[str],                   # Stream name
        "type": list[str],                   # Content type (EEG, Events, etc.)
        "channel_count": list[str],          # Number of channels
        "channel_format": list[str],         # Data format (int8, float32, etc.)
        "nominal_srate": list[str],          # Declared sampling rate
        "effective_srate": float,            # Measured sampling rate
        "stream_id": int,                    # Unique stream identifier
        "segments": list[tuple[int, int]],   # Data break segments (start, end)
        "desc": dict,                        # Domain-specific metadata
    },
    "clock_times": list[float],              # Clock measurement times
    "clock_values": list[float],             # Clock offset values
}
```

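Most `info` fields come from the stream's XML header and are therefore wrapped in single-element lists of strings, while fields computed by the loader (such as `effective_srate`) are plain values. The following sketch shows one way to unwrap and convert them; the helper `info_value` is hypothetical and the dictionary is a hand-written stand-in for a real `stream["info"]`.

```python
def info_value(info, key, cast=str):
    """Return info[key] as a scalar, unwrapping a single-element list if present."""
    raw = info[key]
    if isinstance(raw, list):
        raw = raw[0]
    return cast(raw)


# Hand-written stand-in for stream["info"] (values are illustrative)
example_info = {
    "name": ["ActiChamp"],
    "type": ["EEG"],
    "channel_count": ["32"],
    "nominal_srate": ["500.0"],
    "effective_srate": 499.98,
}

n_channels = info_value(example_info, "channel_count", int)
srate = info_value(example_info, "nominal_srate", float)
stream_type = info_value(example_info, "type")
```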
#### Supported Data Formats

- **Numeric**: `int8`, `int16`, `int32`, `int64`, `float32`, `double64`
- **String**: Marker and event data as string arrays
- **File Formats**: Uncompressed (`.xdf`) and gzip-compressed (`.xdfz`, `.xdf.gz`)

### Stream Discovery and Selection

Utilities for discovering the streams in an XDF file and selecting them by criteria.

```python { .api }
def resolve_streams(fname):
    """
    Resolve streams in given XDF file without loading data.

    Args:
        fname (str): Path to XDF file

    Returns:
        list[dict]: Stream information dictionaries with metadata
    """

def match_streaminfos(stream_infos, parameters):
    """
    Find stream IDs matching specified criteria.

    Args:
        stream_infos (list[dict]): Stream information from resolve_streams
        parameters (list[dict]): Matching criteria as key-value pairs

    Returns:
        list[int]: IDs of streams that match at least one criteria dict
            (all key-value pairs within a dict must match)

    Examples:
        # Match streams by name
        match_streaminfos(infos, [{"name": "EEG"}])

        # Match by type and name (AND logic within one dict)
        match_streaminfos(infos, [{"type": "EEG", "name": "ActiChamp"}])

        # Match multiple criteria (OR logic across dicts)
        match_streaminfos(infos, [{"type": "EEG"}, {"name": "Markers"}])
    """
```

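The AND-within-a-dict, OR-across-dicts semantics shown in the examples above can be illustrated with a small pure-Python sketch. This is not PyXDF's code: `match_ids` is a hypothetical stand-in that assumes exact equality on each key, and the `infos` list is invented for the example.

```python
def match_ids(stream_infos, parameters):
    """Return IDs of streams matching any criteria dict (all pairs in a dict must match)."""
    matches = []
    for info in stream_infos:
        for criteria in parameters:
            if all(info.get(key) == value for key, value in criteria.items()):
                matches.append(info["stream_id"])
                break  # one matching dict is enough (OR across dicts)
    return matches


# Invented stream infos in the shape returned by resolve_streams
infos = [
    {"stream_id": 1, "name": "ActiChamp", "type": "EEG"},
    {"stream_id": 2, "name": "Markers", "type": "Events"},
    {"stream_id": 3, "name": "BrainAmp", "type": "EEG"},
]

eeg_ids = match_ids(infos, [{"type": "EEG"}])
actichamp_ids = match_ids(infos, [{"type": "EEG", "name": "ActiChamp"}])
eeg_or_marker_ids = match_ids(infos, [{"type": "EEG"}, {"name": "Markers"}])
```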
### Low-Level File Operations

Advanced utilities for direct XDF file handling and chunk-level processing.

```python { .api }
def open_xdf(file):
    """
    Open XDF file for reading with format validation.

    Args:
        file (str | pathlib.Path | io.RawIOBase): File path or opened binary file handle

    Returns:
        io.BufferedReader | gzip.GzipFile: Opened file handle positioned after magic bytes

    Raises:
        IOError: If file is not a valid XDF file (missing XDF: magic bytes)
        ValueError: If file handle is opened in text mode
        Exception: If file does not exist
    """

def parse_xdf(fname):
    """
    Parse and return all chunks from an XDF file without processing.

    Args:
        fname (str): Path to XDF file

    Returns:
        list[dict]: Raw chunks containing headers, samples, and metadata
    """

def parse_chunks(chunks):
    """
    Extract stream information from parsed XDF chunks.

    Args:
        chunks (list[dict]): Raw chunks from parse_xdf

    Returns:
        list[dict]: Stream metadata dictionaries in the format returned by resolve_streams
    """
```

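The validation that `open_xdf` is documented to perform (reject text-mode handles, then check for the `XDF:` magic bytes) can be sketched against an in-memory file. This is an illustration of the documented behavior, not the library's source; `check_xdf_magic` is a hypothetical name.

```python
import io

MAGIC = b"XDF:"


def check_xdf_magic(handle):
    """Validate a binary handle and leave it positioned just after the magic bytes."""
    if isinstance(handle, io.TextIOBase):
        raise ValueError("file has to be opened in binary mode")
    if handle.read(len(MAGIC)) != MAGIC:
        raise IOError("Invalid XDF file")
    return handle


# A valid in-memory "file": the handle ends up positioned after the magic bytes
valid = check_xdf_magic(io.BytesIO(b"XDF:rest-of-file"))
remaining = valid.read()
```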
#### Stream Selection Examples

```python
# Load specific stream by ID
streams, _ = pyxdf.load_xdf("file.xdf", select_streams=5)

# Load multiple streams by ID
streams, _ = pyxdf.load_xdf("file.xdf", select_streams=[1, 3, 5])

# Load streams by criteria
streams, _ = pyxdf.load_xdf("file.xdf", select_streams=[{"type": "EEG"}])

# Load streams matching name and type
criteria = [{"type": "EEG", "name": "BrainAmp"}]
streams, _ = pyxdf.load_xdf("file.xdf", select_streams=criteria)
```

### Command Line Tools

Python modules providing command-line utilities for XDF file inspection and playback.

#### Metadata Inspection

```python { .api }
# python -m pyxdf.cli.print_metadata -f=/path/to/file.xdf
```

Prints stream metadata including:
- Stream count and basic information
- Channel counts and data shapes
- Sampling rates (nominal and effective)
- Stream durations and segment information
- Unique identifiers and stream types

#### LSL Playback (requires pylsl)

```python { .api }
# python -m pyxdf.cli.playback_lsl filename [options]
```

Replays XDF data over Lab Streaming Layer (LSL) streams in real time, with configurable options:

**Parameters:**
- `filename` (str): Path to the XDF file to play back (required)
- `--playback_speed` (float): Playback speed multiplier (default: 1.0)
- `--loop`: Loop playback of the file continuously (flag, default: False)
- `--wait_for_consumer`: Wait for LSL consumer before starting playback (flag, default: False)

**Features:**
- **Real-time playback**: Maintains original timing relationships between streams
- **Loop mode**: Continuous playback for prototyping and testing
- **Rate control**: Adjustable playback speed for faster or slower replay
- **Consumer waiting**: Optional wait for LSL consumers to connect before starting
- **Multi-stream support**: Handles all streams in the XDF file simultaneously

```bash
# Basic playback
python -m pyxdf.cli.playback_lsl recording.xdf

# Loop mode with 2x speed
python -m pyxdf.cli.playback_lsl recording.xdf --playback_speed 2.0 --loop

# Wait for consumers before starting
python -m pyxdf.cli.playback_lsl recording.xdf --wait_for_consumer

# Slow motion playback at half speed
python -m pyxdf.cli.playback_lsl recording.xdf --playback_speed 0.5
```

## Advanced Usage Examples

### Custom Stream Processing

```python
import pyxdf


def apply_notch_filter(data, freq_hz):
    """Placeholder (hypothetical helper): substitute a real notch filter here."""
    return data


def process_chunk(data, timestamps, info, stream_id):
    """Custom chunk processing callback."""
    # Apply per-chunk filtering, downsampling, etc. while the file is read
    if info["type"][0] == "EEG":
        # Apply notch filter to EEG data (remove 60 Hz line noise)
        filtered_data = apply_notch_filter(data, 60.0)
        return filtered_data, timestamps, info
    return data, timestamps, info


# Load with custom processing
streams, _ = pyxdf.load_xdf("recording.xdf", on_chunk=process_chunk)
```

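As a second illustration of the callback contract (four arguments in, three values out), the sketch below builds an `on_chunk` function that keeps every n-th sample. It assumes samples lie along the first axis of `data`, applies no anti-aliasing filter, and the factory name `make_decimator` is hypothetical. Plain lists stand in for chunk arrays so the example runs without a recording.

```python
def make_decimator(factor):
    """Build an on_chunk-style callback that keeps every `factor`-th sample."""

    def on_chunk(data, timestamps, info, stream_id):
        # Slice data and timestamps identically so they stay aligned
        return data[::factor], timestamps[::factor], info

    return on_chunk


# Stand-in chunk: four single-channel samples with their timestamps
decimate = make_decimator(2)
data, stamps, info = decimate(
    [[1], [2], [3], [4]], [0.0, 0.1, 0.2, 0.3], {"type": ["EEG"]}, 1
)
```

Such a callback would be passed as `on_chunk=make_decimator(2)` to `load_xdf`; note that decimating per chunk restarts the stride at each chunk boundary, so the resulting sample spacing is only uniform when chunk lengths are multiples of the factor.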
### Handling Data Breaks

```python
import pyxdf


def process_segment(data, times):
    """Placeholder (hypothetical helper) for user-defined per-segment analysis."""
    print(f"Segment with {len(times)} samples")


# Load with custom break detection
streams, _ = pyxdf.load_xdf(
    "recording.xdf",
    jitter_break_threshold_seconds=0.5,  # Detect 500 ms breaks
    jitter_break_threshold_samples=100,  # Or 100-sample breaks
)

# Process segments separately
for stream in streams:
    for start_idx, end_idx in stream["info"]["segments"]:
        # Segment end indices are inclusive, hence the +1
        segment_data = stream["time_series"][start_idx:end_idx + 1]
        segment_times = stream["time_stamps"][start_idx:end_idx + 1]
        # Process each continuous segment
        process_segment(segment_data, segment_times)
```

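To show how gap thresholds translate into `(start, end)` segments with inclusive end indices, here is a simplified sketch. The rule it uses is an assumption, not a statement of PyXDF's exact algorithm: a break is any gap between consecutive timestamps that exceeds the larger of `threshold_seconds` and `threshold_samples` nominal sample periods. The function name `find_segments` and the timestamps are made up.

```python
def find_segments(time_stamps, nominal_srate, threshold_seconds=1.0, threshold_samples=500):
    """Split timestamps into (start, end) index pairs, end inclusive, at large gaps."""
    if not time_stamps:
        return []
    # Assumed rule: a gap must exceed both thresholds to count as a break
    gap_limit = max(threshold_seconds, threshold_samples / nominal_srate)
    segments = []
    start = 0
    for i in range(1, len(time_stamps)):
        if time_stamps[i] - time_stamps[i - 1] > gap_limit:
            segments.append((start, i - 1))
            start = i
    segments.append((start, len(time_stamps) - 1))
    return segments


# 100 Hz stream with a gap of almost 5 s between the third and fourth sample
stamps = [0.0, 0.01, 0.02, 5.0, 5.01]
strict = find_segments(stamps, 100.0, threshold_seconds=0.5, threshold_samples=10)
default = find_segments(stamps, 100.0)
```

With the tighter thresholds the gap splits the stream into two segments; with the defaults (1 s and 500 samples, i.e. 5 s at 100 Hz) the same gap is tolerated and the stream stays in one segment.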
### Clock Synchronization Control

```python
import pyxdf


def apply_custom_sync(time_stamps, clock_times, clock_values):
    """Placeholder (hypothetical helper): substitute your own synchronization."""
    return time_stamps


# Disable automatic processing for manual control
streams, _ = pyxdf.load_xdf(
    "recording.xdf",
    synchronize_clocks=False,   # Skip automatic sync
    dejitter_timestamps=False,  # Skip jitter removal
    verbose=True,               # Enable debug logging
)

# Access raw clock information
for stream in streams:
    clock_times = stream["clock_times"]
    clock_values = stream["clock_values"]
    # Implement custom synchronization
    custom_sync_timestamps = apply_custom_sync(
        stream["time_stamps"], clock_times, clock_values
    )
```

## Error Handling

PyXDF raises standard exceptions for common failure modes:

```python
import struct

import pyxdf

try:
    streams, header = pyxdf.load_xdf("corrupted.xdf")
except FileNotFoundError as e:
    # Must come before IOError: FileNotFoundError is a subclass of OSError (alias IOError)
    print(f"File not found: {e}")
    # Raised when XDF file doesn't exist
except IOError as e:
    print(f"File error: {e}")
    # Raised for invalid XDF files (missing magic bytes) or file access issues
except ValueError as e:
    print(f"Invalid stream selection: {e}")
    # Raised for malformed select_streams parameter or no matching streams
except struct.error as e:
    print(f"Data corruption detected: {e}")
    # Raised for corrupted binary data, library attempts recovery
except Exception as e:
    print(f"Parsing error: {e}")
    # General parsing errors - library attempts to recover and load partial data
```

**Error Recovery Mechanisms:**

PyXDF automatically handles many failure scenarios:

- **File corruption**: When binary chunks are corrupted, scans forward to find valid boundary chunks
- **Missing streams**: Handles interrupted recordings gracefully, returns available data
- **Clock resets**: Detects and corrects for computer restarts during recording using statistical analysis
- **Malformed XML**: Skips corrupted metadata elements while preserving time-series data
- **Incomplete files**: Loads available data from truncated recordings caused by system failures
- **Memory issues**: Processes large files chunk-by-chunk to handle memory constraints
- **Data type mismatches**: Handles inconsistent data formats across chunks

**Specific Error Conditions:**

- `ValueError("No matching streams found.")` - When `select_streams` criteria match no streams
- `ValueError("Argument 'select_streams' must be...")` - Invalid `select_streams` parameter format
- `IOError("Invalid XDF file")` - File doesn't start with "XDF:" magic bytes
- `ValueError("file has to be opened in binary mode")` - Text mode file handle passed to `open_xdf`
- `Exception("file does not exist")` - File path doesn't exist when using `open_xdf`
- `EOFError` - Unexpected end of file, handled gracefully with partial data recovery

## Types

```python { .api }
# Type annotations for main function parameters
filename: Union[str, pathlib.Path]
select_streams: Union[None, int, list[int], list[dict]]
on_chunk: Union[None, Callable[[np.ndarray, np.ndarray, dict, int], tuple[np.ndarray, np.ndarray, dict]]]

# Stream selection criteria format
stream_criteria: dict[str, str]  # e.g., {"type": "EEG", "name": "BrainAmp"}

# Stream info structure from resolve_streams
StreamInfo = {
    "stream_id": int,
    "name": str,
    "type": str,
    "source_id": str,
    "created_at": str,
    "uid": str,
    "session_id": str,
    "hostname": str,
    "channel_count": int,
    "channel_format": str,
    "nominal_srate": float,
}
```