
# Sample Datasets

Built-in access to standard neuroimaging datasets for testing, tutorials, and benchmarking. MNE-Python provides easy access to over 20 datasets covering a wide range of experimental paradigms and recording modalities.

## Capabilities

### Core Datasets

Standard datasets used in tutorials and examples throughout the MNE documentation.

```python { .api }
def data_path(path: Optional[str] = None, force_update: bool = False, update_path: bool = True,
              download: bool = True, accept: bool = False,
              verbose: Optional[Union[bool, str, int]] = None) -> str:
    """
    Generic dataset path function (pattern shared by all datasets).

    Parameters:
    - path: Custom download path
    - force_update: Force re-download of the data
    - update_path: Update the MNE config with this path
    - download: Download the dataset if it is missing
    - accept: Accept the dataset's license terms
    - verbose: Verbosity level

    Returns:
    Path to the dataset directory
    """

# Sample Dataset - Auditory/Visual Paradigm
sample.data_path: Callable[..., str]      # Download sample dataset
sample.get_version: Callable[[], str]     # Get dataset version

# Somatosensory Dataset
somato.data_path: Callable[..., str]      # Somatosensory MEG data
somato.get_version: Callable[[], str]

# Multimodal Dataset
multimodal.data_path: Callable[..., str]  # Multimodal face dataset
multimodal.get_version: Callable[[], str]

# SPM Face Dataset
spm_face.data_path: Callable[..., str]    # SPM face processing dataset
spm_face.get_version: Callable[[], str]
```
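At its core, the `data_path` pattern is download-and-cache: return the local directory if it exists, otherwise fetch into it. A minimal pure-Python sketch of that logic (hypothetical helper, not the real MNE implementation; the `fetch` callback and `~/mne_data` default are assumptions for illustration):

```python
from pathlib import Path
from typing import Callable, Optional

def data_path_sketch(name: str, fetch: Callable[[Path], None],
                     path: Optional[str] = None, force_update: bool = False,
                     download: bool = True) -> Path:
    """Sketch of the download-and-cache behavior behind data_path (illustrative only)."""
    root = Path(path) if path is not None else Path.home() / "mne_data"
    target = root / name
    if target.exists() and not force_update:
        return target          # already cached: reuse without downloading
    if not download:
        return target          # caller opted out of downloading
    target.mkdir(parents=True, exist_ok=True)
    fetch(target)              # populate the cache directory
    return target

# Demo with a stand-in "fetch" that just drops a marker file
import tempfile
with tempfile.TemporaryDirectory() as tmp:
    p = data_path_sketch("sample", lambda d: (d / "ok.txt").write_text("hi"), path=tmp)
    print(p.name, (p / "ok.txt").read_text())  # prints "sample hi"
```

A second call with the same `path` short-circuits on the cache check, which is why repeated `data_path()` calls in the real library are cheap.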

### Motor Imagery and BCI Datasets

Datasets for brain-computer interface research and motor imagery classification.

```python { .api }
# EEG Motor Movement/Imagery Dataset
eegbci.data_path: Callable[..., str]
eegbci.get_version: Callable[[], str]

def load_data(subject: int, runs: Union[int, List[int]], path: Optional[str] = None,
              force_update: bool = False, update_path: bool = True,
              base_url: str = 'https://physionet.org/files/eegmmidb/',
              verbose: Optional[Union[bool, str, int]] = None) -> List[str]:
    """
    Load EEGBCI dataset files.

    Parameters:
    - subject: Subject number (1-109)
    - runs: Run number(s) to load
    - path: Download path
    - force_update: Force re-download
    - update_path: Update the MNE config
    - base_url: Base download URL
    - verbose: Verbosity level

    Returns:
    List of paths to downloaded files
    """

# SSVEP Dataset
ssvep.data_path: Callable[..., str]  # Steady-state visual evoked potentials
ssvep.get_version: Callable[[], str]

def load_data(path: Optional[str] = None, force_update: bool = False,
              update_path: bool = True,
              verbose: Optional[Union[bool, str, int]] = None) -> Dict:
    """
    Load SSVEP dataset.

    Returns:
    Dictionary with loaded epochs and metadata
    """
```
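EEGBCI files on PhysioNet follow a fixed naming scheme: subjects `S001`–`S109`, runs `R01`–`R14`, e.g. `S001R06.edf`. A small helper like the following (hypothetical, not part of MNE) shows the file names that `load_data` ends up resolving against `base_url`:

```python
from typing import List, Union

def eegbci_filenames(subject: int, runs: Union[int, List[int]]) -> List[str]:
    """Build PhysioNet eegmmidb file names, e.g. S001R06.edf (illustrative helper)."""
    if not 1 <= subject <= 109:
        raise ValueError("subject must be in 1-109")
    run_list = [runs] if isinstance(runs, int) else list(runs)
    return [f"S{subject:03d}R{run:02d}.edf" for run in run_list]

print(eegbci_filenames(1, [6, 10, 14]))  # ['S001R06.edf', 'S001R10.edf', 'S001R14.edf']
```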

### Sleep and Physiology Datasets

Datasets for sleep research and physiological signal analysis.

```python { .api }
# Sleep Physiology Dataset
sleep_physionet.data_path: Callable[..., str]
sleep_physionet.get_version: Callable[[], str]

def age_group_averages(path: Optional[str] = None,
                       verbose: Optional[Union[bool, str, int]] = None) -> List[str]:
    """
    Load age group average data.

    Parameters:
    - path: Dataset path
    - verbose: Verbosity level

    Returns:
    List of paths to age group files
    """

def temazepam_effects(path: Optional[str] = None,
                      verbose: Optional[Union[bool, str, int]] = None) -> List[str]:
    """
    Load temazepam effects data.

    Returns:
    List of paths to temazepam study files
    """
```
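Sleep-EDF style recordings annotate sleep stages as text labels, and analyses typically map those to integer event IDs before epoching. A hedged sketch of that mapping (the label strings below are the common Sleep-EDF ones, and merging stages 3 and 4 into N3 is a convention, not something this dataset enforces; verify against the files you actually load):

```python
from typing import Iterable, List

# Common Sleep-EDF annotation labels mapped to integer stage codes
# (stages 3 and 4 merged, as in modern AASM scoring -- an assumption here).
STAGE_IDS = {
    "Sleep stage W": 0,
    "Sleep stage 1": 1,
    "Sleep stage 2": 2,
    "Sleep stage 3": 3,
    "Sleep stage 4": 3,   # merged with stage 3
    "Sleep stage R": 4,
}

def stages_to_ids(labels: Iterable[str]) -> List[int]:
    """Convert a sequence of annotation labels to integer stage codes."""
    return [STAGE_IDS[lab] for lab in labels]

print(stages_to_ids(["Sleep stage W", "Sleep stage 2", "Sleep stage R"]))  # [0, 2, 4]
```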

### Specialized Neuroimaging Datasets

Datasets for specific analysis methods and experimental paradigms.

```python { .api }
# High-Frequency SEF Dataset
hf_sef.data_path: Callable[..., str]  # High-frequency somatosensory evoked fields
hf_sef.get_version: Callable[[], str]

# Epilepsy ECoG Dataset
epilepsy_ecog.data_path: Callable[..., str]  # Intracranial EEG epilepsy data
epilepsy_ecog.get_version: Callable[[], str]

# fNIRS Motor Task Dataset
fnirs_motor.data_path: Callable[..., str]  # Functional near-infrared spectroscopy
fnirs_motor.get_version: Callable[[], str]

# OPM Dataset
opm.data_path: Callable[..., str]  # Optically pumped magnetometer data
opm.get_version: Callable[[], str]

# Visual Categorization Dataset
visual_92_categories.data_path: Callable[..., str]  # Visual object categorization
visual_92_categories.get_version: Callable[[], str]

def load_data(path: Optional[str] = None,
              verbose: Optional[Union[bool, str, int]] = None) -> Tuple[ArrayLike, ArrayLike]:
    """
    Load visual categorization data.

    Returns:
    Tuple of (data_array, labels)
    """

# Kiloword Dataset
kiloword.data_path: Callable[..., str]  # Lexical decision task
kiloword.get_version: Callable[[], str]

def load_data(path: Optional[str] = None,
              verbose: Optional[Union[bool, str, int]] = None) -> Dict:
    """
    Load kiloword dataset.

    Returns:
    Dictionary with epochs and metadata
    """
```
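A `(data_array, labels)` pair like the one visual_92_categories returns is the typical input to decoding and representational-similarity analyses. A toy illustration of that downstream use on synthetic data (the shapes and the 4-category grouping are made up for the example; the real dataset has its own structure):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(92, 40))       # 92 conditions x 40 features (synthetic)
labels = np.repeat(np.arange(4), 23)   # 4 coarse categories of 23 items each

# Mean pattern per category, then a 4x4 dissimilarity (1 - correlation) matrix
means = np.stack([data[labels == k].mean(axis=0) for k in range(4)])
rdm = 1 - np.corrcoef(means)
print(rdm.shape, np.allclose(np.diag(rdm), 0))  # (4, 4) True
```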

### Connectivity and Network Datasets

Datasets for studying brain connectivity and network analysis.

```python { .api }
# FieldTrip CMC Dataset
fieldtrip_cmc.data_path: Callable[..., str]  # Cortico-muscular coherence
fieldtrip_cmc.get_version: Callable[[], str]

# mTRF Dataset
mtrf.data_path: Callable[..., str]  # Multivariate temporal response functions
mtrf.get_version: Callable[[], str]

def load_speech_envelope(path: Optional[str] = None,
                         verbose: Optional[Union[bool, str, int]] = None) -> Tuple[ArrayLike, float]:
    """
    Load speech envelope stimulus.

    Returns:
    Tuple of (envelope_data, sampling_rate)
    """
```
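The mTRF workflow regresses neural responses against a stimulus envelope like the one `load_speech_envelope` returns. If you ever need to recompute an envelope from raw audio, one simple recipe (illustrative, not MNE's method) is full-wave rectification followed by a moving-average smoother:

```python
import numpy as np

def speech_envelope(signal: np.ndarray, sfreq: float, win_s: float = 0.01) -> np.ndarray:
    """Crude amplitude envelope: full-wave rectify, then smooth with a boxcar window."""
    rectified = np.abs(signal)
    n = max(1, int(round(win_s * sfreq)))
    kernel = np.ones(n) / n                       # boxcar averaging kernel
    return np.convolve(rectified, kernel, mode="same")

# A 10 Hz tone sampled at 1 kHz: same length out as in
t = np.arange(0, 1.0, 1 / 1000.0)
env = speech_envelope(np.sin(2 * np.pi * 10 * t), sfreq=1000.0)
print(env.shape)  # (1000,)
```

A Hilbert-transform envelope (via `scipy.signal.hilbert`) is the more standard choice; the boxcar version just keeps the sketch dependency-free.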

### Phantom and Calibration Datasets

Datasets with known ground truth for method validation and calibration.

```python { .api }
# 4D BTi Phantom Dataset
phantom_4dbti.data_path: Callable[..., str]  # 4D Neuroimaging phantom
phantom_4dbti.get_version: Callable[[], str]

# KIT Phantom Dataset
phantom_kit.data_path: Callable[..., str]  # KIT/Yokogawa phantom data
phantom_kit.get_version: Callable[[], str]

# Kernel Phantom Dataset
phantom_kernel.data_path: Callable[..., str]  # Kernel flow phantom
phantom_kernel.get_version: Callable[[], str]

def load_data(subject: str = 'phantom', session: str = '20220927_114934',
              path: Optional[str] = None,
              verbose: Optional[Union[bool, str, int]] = None) -> Raw:
    """
    Load phantom data directly as a Raw object.

    Parameters:
    - subject: Subject identifier
    - session: Session identifier
    - path: Dataset path
    - verbose: Verbosity level

    Returns:
    Raw object with phantom data
    """
```
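Because phantom dipole positions are known, validation usually reduces to measuring the distance between fitted and true positions. A minimal numpy sketch of that comparison (the positions below are invented for the example; real phantom geometries come with the datasets):

```python
import numpy as np

def localization_error_mm(true_pos: np.ndarray, fitted_pos: np.ndarray) -> np.ndarray:
    """Euclidean distance in mm between matched true/fitted dipole positions (given in m)."""
    return 1e3 * np.linalg.norm(true_pos - fitted_pos, axis=1)

true_pos = np.array([[0.00, 0.00, 0.05],    # meters, made-up ground truth
                     [0.02, 0.00, 0.06]])
fitted = np.array([[0.001, 0.00, 0.05],     # made-up fitted positions
                   [0.02, 0.003, 0.06]])
err = localization_error_mm(true_pos, fitted)
print(np.round(err, 1))  # roughly 1 mm and 3 mm
```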

### Standard Brain Templates and Atlases

Access to standard brain templates and parcellations.

```python { .api }
def fetch_fsaverage(subjects_dir: Optional[str] = None,
                    verbose: Optional[Union[bool, str, int]] = None) -> str:
    """
    Fetch the FreeSurfer average brain template.

    Parameters:
    - subjects_dir: FreeSurfer subjects directory
    - verbose: Verbosity level

    Returns:
    Path to the fsaverage directory
    """

def fetch_infant_template(age: str, subjects_dir: Optional[str] = None,
                          verbose: Optional[Union[bool, str, int]] = None) -> str:
    """
    Fetch an infant brain template.

    Parameters:
    - age: Age group ('6mo', '12mo', etc.)
    - subjects_dir: FreeSurfer subjects directory
    - verbose: Verbosity level

    Returns:
    Path to the infant template
    """

def fetch_hcp_mmp_parcellation(subjects_dir: Optional[str] = None,
                               verbose: Optional[Union[bool, str, int]] = None) -> List[str]:
    """
    Fetch the HCP multi-modal parcellation.

    Parameters:
    - subjects_dir: FreeSurfer subjects directory
    - verbose: Verbosity level

    Returns:
    List of paths to parcellation files
    """

def fetch_aparc_sub_parcellation(subjects_dir: Optional[str] = None,
                                 verbose: Optional[Union[bool, str, int]] = None) -> List[str]:
    """
    Fetch the aparc sub-parcellation.

    Parameters:
    - subjects_dir: FreeSurfer subjects directory
    - verbose: Verbosity level

    Returns:
    List of paths to sub-parcellation files
    """
```
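`fetch_fsaverage` returns a directory laid out like a FreeSurfer subject. A small helper for sanity-checking such a layout before source-space code touches it (the subdirectory names are the standard FreeSurfer ones, but treat this helper as illustrative, not part of MNE):

```python
from pathlib import Path

def looks_like_freesurfer_subject(subject_dir) -> bool:
    """Check for the standard FreeSurfer subdirectories (surf, mri, label, bem)."""
    subject_dir = Path(subject_dir)
    return all((subject_dir / sub).is_dir() for sub in ("surf", "mri", "label", "bem"))

# Demo against a temporary fake subject directory
import tempfile
with tempfile.TemporaryDirectory() as tmp:
    for sub in ("surf", "mri", "label", "bem"):
        (Path(tmp) / sub).mkdir()
    print(looks_like_freesurfer_subject(tmp))  # True
```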

### Dataset Utilities

Utility functions for dataset management and discovery.

```python { .api }
def has_dataset(name: str, path: Optional[str] = None) -> bool:
    """
    Check whether a dataset is available locally.

    Parameters:
    - name: Dataset name
    - path: Custom path to check

    Returns:
    True if the dataset is available
    """

def get_version(name: str) -> str:
    """
    Get the version of a specific dataset.

    Parameters:
    - name: Dataset name

    Returns:
    Version string
    """

def _download_all_example_data(path: Optional[str] = None,
                               verbose: Optional[Union[bool, str, int]] = None) -> None:
    """
    Download all example datasets (for CI/testing).

    Parameters:
    - path: Download path
    - verbose: Verbosity level
    """
```

## Usage Examples

### Loading Sample Dataset

```python
import mne

# Download the sample dataset (if not already present)
sample_data_folder = mne.datasets.sample.data_path()
print(f"Sample data location: {sample_data_folder}")

# Build paths to the sample data files
sample_data_raw_file = sample_data_folder / 'MEG' / 'sample' / 'sample_audvis_filt-0-40_raw.fif'
sample_data_cov_file = sample_data_folder / 'MEG' / 'sample' / 'sample_audvis-cov.fif'
sample_data_trans_file = sample_data_folder / 'MEG' / 'sample' / 'sample_audvis_raw-trans.fif'

# Load the actual data
raw = mne.io.read_raw_fif(sample_data_raw_file, preload=True)
cov = mne.read_cov(sample_data_cov_file)

print(f"Raw data: {raw}")
print(f"Covariance: {cov}")
```

### Motor Imagery Classification Data

```python
import mne
from mne.datasets import eegbci

# Load EEGBCI motor imagery data
eegbci_path = eegbci.data_path()
print(f"EEGBCI data location: {eegbci_path}")

# Load specific subject and runs
subject = 1
runs = [6, 10, 14]  # Motor imagery runs
raw_fnames = eegbci.load_data(subject, runs)

# Load and concatenate runs
raws = [mne.io.read_raw_edf(f, preload=True) for f in raw_fnames]
raw = mne.concatenate_raws(raws)

# Standardize channel names to the 10-20 system
mne.datasets.eegbci.standardize(raw)

# Set montage
montage = mne.channels.make_standard_montage('standard_1005')
raw.set_montage(montage)

print(f"Motor imagery data: {raw}")
```

### Using Phantom Data for Validation

```python
import mne
from mne.datasets import phantom_kit

# Load the phantom dataset
phantom_path = phantom_kit.data_path()
print(f"Phantom data location: {phantom_path}")

# Phantom data has known dipole locations - useful for validation
phantom_raw_file = phantom_path / 'phantom_100hz_20_sec_raw.fif'
phantom_raw = mne.io.read_raw_fif(phantom_raw_file, preload=True)

# Load dipole information
phantom_dipoles_file = phantom_path / 'phantom_dipoles.txt'
# dipoles = load_phantom_dipoles(phantom_dipoles_file)  # Custom function

print(f"Phantom raw data: {phantom_raw}")
```

### Sleep Dataset Analysis

```python
import mne
from mne.datasets import sleep_physionet

# Load the sleep dataset
sleep_path = sleep_physionet.data_path()
print(f"Sleep data location: {sleep_path}")

# Get the per-subject files for the age-group study
subjects = sleep_physionet.age_group_averages()
print(f"Available subjects: {len(subjects)}")

# Example: load one subject's recording
# sleep_raw = mne.io.read_raw_edf(subjects[0], preload=True)
# print(f"Sleep recording: {sleep_raw}")
```

### Visual Categorization Dataset

```python
import numpy as np
import mne
from mne.datasets import visual_92_categories

# Load visual categorization data
visual_path = visual_92_categories.data_path()
print(f"Visual data location: {visual_path}")

# Load preprocessed data
data, labels = visual_92_categories.load_data()
print(f"Data shape: {data.shape}")
print(f"Labels shape: {labels.shape}")
print(f"Unique categories: {len(np.unique(labels))}")
```

### FreeSurfer Template

```python
import mne

# Fetch the FreeSurfer average brain
subjects_dir = mne.datasets.fetch_fsaverage(verbose=True)
print(f"fsaverage template: {subjects_dir}")

# Fetch the HCP multi-modal parcellation
hcp_parcellation = mne.datasets.fetch_hcp_mmp_parcellation(subjects_dir=subjects_dir)
print(f"HCP parcellation files: {len(hcp_parcellation)}")

# Check whether a dataset is available
has_sample = mne.datasets.has_dataset('sample')
print(f"Sample dataset available: {has_sample}")
```

### Checking Dataset Availability

```python
import mne

# Datasets to check
datasets = [
    'sample', 'somato', 'spm_face', 'eegbci', 'hf_sef',
    'multimodal', 'opm', 'phantom_4dbti', 'visual_92_categories'
]

for dataset in datasets:
    available = mne.datasets.has_dataset(dataset)
    if hasattr(mne.datasets, dataset):
        version = getattr(mne.datasets, dataset).get_version()
        print(f"{dataset}: {'✓' if available else '✗'} (v{version})")
    else:
        print(f"{dataset}: {'✓' if available else '✗'}")
```

## Dataset Categories

### By Recording Modality

- **MEG**: sample, somato, multimodal, hf_sef, opm
- **EEG**: eegbci, spm_face, visual_92_categories, kiloword
- **ECoG**: epilepsy_ecog
- **fNIRS**: fnirs_motor
- **Sleep**: sleep_physionet

### By Experimental Paradigm

- **Sensory**: sample (auditory/visual), somato (somatosensory), hf_sef (tactile)
- **Motor**: eegbci (motor imagery), somato (motor responses)
- **Cognitive**: spm_face (face processing), visual_92_categories (object recognition)
- **Language**: kiloword (lexical decision)
- **Clinical**: epilepsy_ecog (seizure data), sleep_physionet (sleep disorders)

### By Use Case

- **Tutorials**: sample, somato, spm_face
- **Method validation**: phantom_4dbti, phantom_kit, phantom_kernel
- **BCI research**: eegbci, ssvep
- **Connectivity**: fieldtrip_cmc, mtrf
- **Templates**: fsaverage, infant templates, HCP parcellation

## Types

```python { .api }
from typing import Union, Optional, List, Dict, Tuple, Callable, Any
import numpy as np

ArrayLike = Union[np.ndarray, List, Tuple]
```