# pyxdf

A Python library for importing XDF (Extensible Data Format) files commonly used in neuroscience and biosignal research. PyXDF provides a simple interface to load multi-stream time-series data recorded from Lab Streaming Layer (LSL) systems, supporting various data formats and advanced processing features like clock synchronization and jitter removal.

## Package Information

- **Package Name**: pyxdf
- **Language**: Python
- **Installation**: `pip install pyxdf`
- **Requirements**: Python 3.9+, numpy>=2.0.2

## Core Imports

```python
import pyxdf
```

Direct function imports:

```python
from pyxdf import load_xdf, resolve_streams, match_streaminfos
```

Advanced imports (for low-level operations):

```python
from pyxdf.pyxdf import open_xdf, parse_xdf, parse_chunks
```

## Basic Usage

```python
import pyxdf
import matplotlib.pyplot as plt
import numpy as np

# Load an XDF file
streams, header = pyxdf.load_xdf("recording.xdf")

# Process each stream
for stream in streams:
    y = stream["time_series"]

    if isinstance(y, list):
        # String markers - draw vertical lines
        for timestamp, marker in zip(stream["time_stamps"], y):
            plt.axvline(x=timestamp)
            print(f'Marker "{marker[0]}" @ {timestamp:.2f}s')
    elif isinstance(y, np.ndarray):
        # Numeric data - plot as lines
        plt.plot(stream["time_stamps"], y)
    else:
        raise RuntimeError("Unknown stream format")

plt.show()
```

## Architecture

PyXDF operates on the XDF (Extensible Data Format) specification, processing multi-stream recordings with:

- **Chunks**: Atomic file units containing headers, samples, clock offsets, or footers
- **Streams**: Individual data sources with metadata, timestamps, and time-series data
- **Clock Synchronization**: Robust timestamp alignment across streams using ClockOffset chunks
- **Jitter Removal**: Regularization of sampling intervals for improved data quality

The library handles file corruption gracefully, supports compressed files (.xdfz), and provides advanced processing options for research applications requiring high temporal precision.

## Capabilities

### XDF File Loading

Core functionality for importing XDF files with comprehensive data processing, stream selection, and timing corrections.

```python { .api }
def load_xdf(
    filename,
    select_streams=None,
    *,
    on_chunk=None,
    synchronize_clocks=True,
    handle_clock_resets=True,
    dejitter_timestamps=True,
    jitter_break_threshold_seconds=1,
    jitter_break_threshold_samples=500,
    clock_reset_threshold_seconds=5,
    clock_reset_threshold_stds=5,
    clock_reset_threshold_offset_seconds=1,
    clock_reset_threshold_offset_stds=10,
    winsor_threshold=0.0001,
    verbose=None,
):
    """
    Import an XDF file with optional stream selection and processing.

    Args:
        filename (str): Path to XDF file (*.xdf or *.xdfz)
        select_streams (int | list[int] | list[dict] | None): Stream selection criteria
        on_chunk (callable, optional): Callback function for chunk processing
        synchronize_clocks (bool): Enable clock synchronization (default: True)
        handle_clock_resets (bool): Handle computer restarts during recording (default: True)
        dejitter_timestamps (bool): Perform jitter removal for regular streams (default: True)
        jitter_break_threshold_seconds (float): Break detection threshold in seconds (default: 1)
        jitter_break_threshold_samples (int): Break detection threshold in samples (default: 500)
        clock_reset_threshold_seconds (float): Clock reset detection threshold (default: 5)
        clock_reset_threshold_stds (float): Reset detection in standard deviations (default: 5)
        clock_reset_threshold_offset_seconds (float): Offset threshold for resets (default: 1)
        clock_reset_threshold_offset_stds (float): Offset threshold in stds (default: 10)
        winsor_threshold (float): Robust fitting threshold (default: 0.0001)
        verbose (bool | None): Logging level control

    Returns:
        tuple[list[dict], dict]: (streams, fileheader)
            - streams: List of stream dictionaries
            - fileheader: File header metadata
    """
```

#### Stream Data Structure

Each stream in the returned list contains:

```python { .api }
# Stream dictionary structure
stream = {
    "time_series": Union[np.ndarray, list],  # Channel x Sample data or string markers
    "time_stamps": np.ndarray,               # Sample timestamps (synchronized)
    "info": {                                # Stream metadata
        "name": list[str],                   # Stream name
        "type": list[str],                   # Content type (EEG, Events, etc.)
        "channel_count": list[str],          # Number of channels
        "channel_format": list[str],         # Data format (int8, float32, etc.)
        "nominal_srate": list[str],          # Declared sampling rate
        "effective_srate": float,            # Measured sampling rate
        "stream_id": int,                    # Unique stream identifier
        "segments": list[tuple[int, int]],   # Data break segments (start, end)
        "desc": dict,                        # Domain-specific metadata
    },
    "clock_times": list[float],              # Clock measurement times
    "clock_values": list[float],             # Clock offset values
}
```
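Because XML-derived metadata fields arrive as single-element lists of strings, they usually need indexing and conversion before use. A small illustrative helper (the name `summarize_stream` is ours, not part of pyxdf) can make that explicit:

```python
import numpy as np

def summarize_stream(stream):
    """Return a short summary for one pyxdf stream dictionary.

    Illustrative helper, not a pyxdf API: note that XML-derived
    metadata fields are single-element lists of strings.
    """
    info = stream["info"]
    return {
        "name": info["name"][0],
        "type": info["type"][0],
        "channels": int(info["channel_count"][0]),
        "nominal_srate": float(info["nominal_srate"][0]),
        "n_samples": len(stream["time_stamps"]),
    }

# Synthetic stream dictionary shaped like the structure above
example = {
    "time_series": np.zeros((100, 8)),
    "time_stamps": np.linspace(0.0, 0.99, 100),
    "info": {
        "name": ["ExampleEEG"],
        "type": ["EEG"],
        "channel_count": ["8"],
        "nominal_srate": ["100"],
    },
}
print(summarize_stream(example))
```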

#### Supported Data Formats

- **Numeric**: `int8`, `int16`, `int32`, `int64`, `float32`, `double64`
- **String**: Marker and event data as string arrays
- **File Formats**: Uncompressed (`.xdf`) and gzip-compressed (`.xdfz`, `.xdf.gz`)
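The numeric format names correspond to NumPy dtypes; in particular, XDF's `double64` is NumPy's `float64`. The mapping below is a sketch inferred from the format names, not a table exported by pyxdf:

```python
import numpy as np

# Assumed correspondence between XDF channel_format names and NumPy dtypes
# (inferred from the names; not a pyxdf API).
XDF_FORMAT_TO_DTYPE = {
    "int8": np.int8,
    "int16": np.int16,
    "int32": np.int32,
    "int64": np.int64,
    "float32": np.float32,
    "double64": np.float64,  # XDF "double64" == NumPy float64
}

def dtype_for(channel_format):
    """Return the NumPy dtype for an XDF channel_format string,
    or None for string/marker streams."""
    return XDF_FORMAT_TO_DTYPE.get(channel_format)

print(dtype_for("double64"))
print(dtype_for("string"))  # → None (markers are handled as lists of strings)
```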

### Stream Discovery and Selection

Utilities for discovering streams in XDF files and selecting streams based on criteria.

```python { .api }
def resolve_streams(fname):
    """
    Resolve streams in given XDF file without loading data.

    Args:
        fname (str): Path to XDF file

    Returns:
        list[dict]: Stream information dictionaries with metadata
    """

def match_streaminfos(stream_infos, parameters):
    """
    Find stream IDs matching specified criteria.

    Args:
        stream_infos (list[dict]): Stream information from resolve_streams
        parameters (list[dict]): Matching criteria as key-value pairs

    Returns:
        list[int]: Stream IDs matching all criteria

    Examples:
        # Match streams by name
        match_streaminfos(infos, [{"name": "EEG"}])

        # Match by type and name
        match_streaminfos(infos, [{"type": "EEG", "name": "ActiChamp"}])

        # Match multiple criteria (OR logic)
        match_streaminfos(infos, [{"type": "EEG"}, {"name": "Markers"}])
    """
```
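The selection semantics (AND within one criteria dict, OR across dicts) can be sketched without an XDF file. This re-implementation is illustrative only, not pyxdf's internal code:

```python
def match_ids(stream_infos, parameters):
    """Return stream IDs where any criteria dict matches all of its key/value pairs.

    Illustrative re-implementation of the AND-within / OR-across semantics
    described above; not pyxdf's actual implementation.
    """
    matches = []
    for info in stream_infos:
        for criteria in parameters:
            if all(info.get(key) == value for key, value in criteria.items()):
                matches.append(info["stream_id"])
                break  # one matching criteria dict is enough
    return matches

infos = [
    {"stream_id": 1, "name": "ActiChamp", "type": "EEG"},
    {"stream_id": 2, "name": "Markers", "type": "Events"},
    {"stream_id": 3, "name": "Audio", "type": "Audio"},
]
print(match_ids(infos, [{"type": "EEG"}, {"name": "Markers"}]))  # → [1, 2]
```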

### Low-Level File Operations

Advanced utilities for direct XDF file handling and chunk-level processing.

```python { .api }
def open_xdf(file):
    """
    Open XDF file for reading with format validation.

    Args:
        file (str | pathlib.Path | io.RawIOBase): File path or opened binary file handle

    Returns:
        io.BufferedReader | gzip.GzipFile: Opened file handle positioned after magic bytes

    Raises:
        IOError: If file is not a valid XDF file (missing XDF: magic bytes)
        ValueError: If file handle is opened in text mode
        Exception: If file does not exist
    """

def parse_xdf(fname):
    """
    Parse and return all chunks from an XDF file without processing.

    Args:
        fname (str): Path to XDF file

    Returns:
        list[dict]: Raw chunks containing headers, samples, and metadata
    """

def parse_chunks(chunks):
    """
    Extract stream information from parsed XDF chunks.

    Args:
        chunks (list[dict]): Raw chunks from parse_xdf

    Returns:
        list[dict]: Stream metadata dictionaries suitable for resolve_streams
    """
```
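The format validation that `open_xdf` performs can be illustrated with plain file I/O: every valid XDF file begins with the four magic bytes `XDF:`. This is a sketch of the check described above, not pyxdf's code:

```python
import io

def looks_like_xdf(fileobj):
    """Check the 4-byte "XDF:" magic that opens every valid XDF file.

    Illustrative sketch of the validation open_xdf performs;
    accepts any binary file object.
    """
    return fileobj.read(4) == b"XDF:"

print(looks_like_xdf(io.BytesIO(b"XDF:rest-of-file")))  # → True
print(looks_like_xdf(io.BytesIO(b"not an xdf file")))   # → False
```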

#### Stream Selection Examples

```python
# Load specific stream by ID
streams, _ = pyxdf.load_xdf("file.xdf", select_streams=5)

# Load multiple streams by ID
streams, _ = pyxdf.load_xdf("file.xdf", select_streams=[1, 3, 5])

# Load streams by criteria
streams, _ = pyxdf.load_xdf("file.xdf", select_streams=[{"type": "EEG"}])

# Load streams matching name and type
criteria = [{"type": "EEG", "name": "BrainAmp"}]
streams, _ = pyxdf.load_xdf("file.xdf", select_streams=criteria)
```

### Command Line Tools

Python modules providing command-line utilities for XDF file inspection and playback.

#### Metadata Inspection

```python { .api }
# python -m pyxdf.cli.print_metadata -f=/path/to/file.xdf
```

Prints stream metadata including:

- Stream count and basic information
- Channel counts and data shapes
- Sampling rates (nominal and effective)
- Stream durations and segment information
- Unique identifiers and stream types

#### LSL Playback (requires pylsl)

```python { .api }
# python -m pyxdf.cli.playback_lsl filename [options]
```

Replays XDF data over Lab Streaming Layer (LSL) streams in real time with configurable options:

**Parameters:**

- `filename` (str): Path to the XDF file to play back (required)
- `--playback_speed` (float): Playback speed multiplier (default: 1.0)
- `--loop`: Loop playback of the file continuously (flag, default: False)
- `--wait_for_consumer`: Wait for an LSL consumer before starting playback (flag, default: False)

**Features:**

- **Real-time playback**: Maintains original timing relationships between streams
- **Loop mode**: Continuous playback for prototyping and testing
- **Rate control**: Adjustable playback speed for faster or slower replay
- **Consumer waiting**: Optional wait for LSL consumers to connect before starting
- **Multi-stream support**: Handles all streams in the XDF file simultaneously

```bash
# Basic playback
python -m pyxdf.cli.playback_lsl recording.xdf

# Loop mode at 2x speed
python -m pyxdf.cli.playback_lsl recording.xdf --playback_speed 2.0 --loop

# Wait for consumers before starting
python -m pyxdf.cli.playback_lsl recording.xdf --wait_for_consumer

# Slow-motion playback at half speed
python -m pyxdf.cli.playback_lsl recording.xdf --playback_speed 0.5
```

## Advanced Usage Examples

### Custom Stream Processing

```python
def process_chunk(data, timestamps, info, stream_id):
    """Custom chunk processing callback."""
    # Apply real-time filtering, downsampling, etc.
    if info["type"][0] == "EEG":
        # Apply a notch filter to EEG data; apply_notch_filter is a
        # user-supplied function, not part of pyxdf
        filtered_data = apply_notch_filter(data, 60.0)  # Remove 60 Hz noise
        return filtered_data, timestamps, info
    return data, timestamps, info

# Load with custom processing
streams, _ = pyxdf.load_xdf("recording.xdf", on_chunk=process_chunk)
```

### Handling Data Breaks

```python
# Load with custom break detection
streams, _ = pyxdf.load_xdf(
    "recording.xdf",
    jitter_break_threshold_seconds=0.5,  # Detect 500 ms breaks
    jitter_break_threshold_samples=100,  # Or 100-sample breaks
)

# Process segments separately
for stream in streams:
    for start_idx, end_idx in stream["info"]["segments"]:
        segment_data = stream["time_series"][start_idx:end_idx + 1]
        segment_times = stream["time_stamps"][start_idx:end_idx + 1]
        # Process each continuous segment
        process_segment(segment_data, segment_times)
```
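The break-detection idea behind these thresholds can be sketched on synthetic timestamps: a new segment starts wherever the gap between consecutive samples exceeds the threshold. This standalone function is illustrative, not pyxdf's implementation:

```python
import numpy as np

def find_segments(time_stamps, break_threshold_seconds=1.0):
    """Return inclusive (start, end) index pairs of continuous segments.

    Illustrative sketch of break detection: a segment boundary lies
    wherever consecutive timestamps differ by more than the threshold.
    """
    time_stamps = np.asarray(time_stamps, dtype=float)
    if time_stamps.size == 0:
        return []
    breaks = np.flatnonzero(np.diff(time_stamps) > break_threshold_seconds)
    starts = np.concatenate(([0], breaks + 1))
    ends = np.concatenate((breaks, [time_stamps.size - 1]))
    return list(zip(starts.tolist(), ends.tolist()))

# 100 Hz samples with a 2 s gap in the middle
ts = [0.00, 0.01, 0.02, 2.02, 2.03]
print(find_segments(ts, break_threshold_seconds=1.0))  # → [(0, 2), (3, 4)]
```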

### Clock Synchronization Control

```python
# Disable automatic processing for manual control
streams, _ = pyxdf.load_xdf(
    "recording.xdf",
    synchronize_clocks=False,   # Skip automatic sync
    dejitter_timestamps=False,  # Skip jitter removal
    verbose=True,               # Enable debug logging
)

# Access raw clock information
for stream in streams:
    clock_times = stream["clock_times"]
    clock_values = stream["clock_values"]
    # Implement custom synchronization; apply_custom_sync is a
    # user-supplied function, not part of pyxdf
    custom_sync_timestamps = apply_custom_sync(
        stream["time_stamps"], clock_times, clock_values
    )
```
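One simple way to implement such a custom synchronization is a linear fit of the clock offsets over time. This is a hedged least-squares sketch; pyxdf's own synchronization uses a more robust (winsorized) fit, as the `winsor_threshold` parameter suggests:

```python
import numpy as np

def linear_clock_sync(time_stamps, clock_times, clock_values):
    """Shift time_stamps by a clock offset modeled as a linear function of time.

    Simple least-squares sketch; pyxdf itself uses a robust fit.
    """
    time_stamps = np.asarray(time_stamps, dtype=float)
    if len(clock_times) < 2:
        # Not enough measurements for a trend; apply the mean offset if any
        offset = np.mean(clock_values) if len(clock_values) else 0.0
        return time_stamps + offset
    slope, intercept = np.polyfit(clock_times, clock_values, 1)
    return time_stamps + (slope * time_stamps + intercept)

# Synthetic example: a constant 5 s offset between recorder and source clocks
stamps = np.array([10.0, 11.0, 12.0])
synced = linear_clock_sync(stamps, [0.0, 100.0], [5.0, 5.0])
print(synced)  # → [15. 16. 17.]
```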

## Error Handling

PyXDF includes robust error handling for common issues:

```python
import struct

import pyxdf

try:
    streams, header = pyxdf.load_xdf("corrupted.xdf")
except FileNotFoundError as e:
    print(f"File not found: {e}")
    # Raised when the XDF file doesn't exist
    # (must precede IOError, of which it is a subclass)
except IOError as e:
    print(f"File error: {e}")
    # Raised for invalid XDF files (missing magic bytes) or file access issues
except ValueError as e:
    print(f"Invalid stream selection: {e}")
    # Raised for a malformed select_streams parameter or no matching streams
except struct.error as e:
    print(f"Data corruption detected: {e}")
    # Raised for corrupted binary data; the library attempts recovery
except Exception as e:
    print(f"Parsing error: {e}")
    # General parsing errors - the library attempts to recover partial data
```

**Error Recovery Mechanisms:**

PyXDF automatically handles many failure scenarios:

- **File corruption**: When binary chunks are corrupted, scans forward to find valid chunk boundaries
- **Missing streams**: Handles interrupted recordings gracefully, returns available data
- **Clock resets**: Detects and corrects for computer restarts during recording using statistical analysis
- **Malformed XML**: Skips corrupted metadata elements while preserving time-series data
- **Incomplete files**: Loads available data from truncated recordings caused by system failures
- **Memory issues**: Processes large files chunk by chunk to handle memory constraints
- **Data type mismatches**: Handles inconsistent data formats across chunks

**Specific Error Conditions:**

- `ValueError("No matching streams found.")` - When `select_streams` criteria match no streams
- `ValueError("Argument 'select_streams' must be...")` - Invalid `select_streams` parameter format
- `IOError("Invalid XDF file")` - File doesn't start with the "XDF:" magic bytes
- `ValueError("file has to be opened in binary mode")` - Text-mode file handle passed to `open_xdf`
- `Exception("file does not exist")` - File path doesn't exist when using `open_xdf`
- `EOFError` - Unexpected end of file, handled gracefully with partial data recovery

## Types

```python { .api }
# Type annotations for main function parameters
filename: Union[str, pathlib.Path]
select_streams: Union[None, int, list[int], list[dict]]
on_chunk: Union[None, Callable[[np.ndarray, np.ndarray, dict, int], tuple[np.ndarray, np.ndarray, dict]]]

# Stream selection criteria format
stream_criteria: dict[str, str]  # e.g., {"type": "EEG", "name": "BrainAmp"}

# Stream info structure from resolve_streams
StreamInfo = {
    "stream_id": int,
    "name": str,
    "type": str,
    "source_id": str,
    "created_at": str,
    "uid": str,
    "session_id": str,
    "hostname": str,
    "channel_count": int,
    "channel_format": str,
    "nominal_srate": float,
}
```