# Medical Image Datasets

Pre-built datasets for common medical imaging research, including brain atlases, public medical imaging challenges, and synthetic datasets for testing and development. These datasets provide standardized access to well-known medical imaging data.

## Capabilities

### Brain Atlases and Templates

Standard brain templates and atlases commonly used in neuroimaging research for registration, normalization, and analysis.

```python { .api }
class Colin27(Subject):
    """
    Colin27 brain template.

    Single-subject high-resolution T1-weighted brain template created from
    27 scans of the same individual. Commonly used as reference for spatial
    normalization and analysis.
    """
    def __init__(self): ...

class ICBM2009CNonlinearSymmetric(Subject):
    """
    ICBM 2009c nonlinear symmetric brain template.

    Population-averaged brain template created from 152 T1-weighted images
    of healthy adults. Part of the ICBM (International Consortium for Brain Mapping) initiative.
    """
    def __init__(self): ...

class Pediatric(Subject):
    """
    Pediatric brain template.

    Age-appropriate brain template for pediatric neuroimaging studies,
    representing typical brain anatomy in children.
    """
    def __init__(self): ...

class Sheep(Subject):
    """
    Sheep brain template.

    Brain template for ovine (sheep) neuroimaging research,
    useful for animal model studies and comparative neuroanatomy.
    """
    def __init__(self): ...
```

Usage example:

```python
import torchio as tio

# Load brain templates
colin27 = tio.datasets.Colin27()
icbm_template = tio.datasets.ICBM2009CNonlinearSymmetric()

# Resample other images onto the template's voxel grid.
# This changes the sampling grid only; it does not estimate a registration.
resample_to_template = tio.Resample(colin27['t1'])

# Apply to other subjects
subject = tio.Subject(t1=tio.ScalarImage('patient_t1.nii.gz'))
resampled = resample_to_template(subject)
```
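
Placing a scan on a template's grid depends on each image's affine matrix, which maps voxel indices to world coordinates in millimetres. The following is a minimal plain-Python sketch of that mapping; `voxel_to_world` and the example affine are illustrative assumptions, not part of TorchIO.

```python
def voxel_to_world(affine, index):
    """Map a voxel index (i, j, k) to world coordinates with a 4x4 affine."""
    homogeneous = (*index, 1)  # append 1 so the translation column is applied
    return tuple(
        sum(row[col] * homogeneous[col] for col in range(4))
        for row in affine[:3]
    )

# Hypothetical affine: 2 mm isotropic voxels, first voxel at (-90, -126, -72) mm
example_affine = [
    [2.0, 0.0, 0.0, -90.0],
    [0.0, 2.0, 0.0, -126.0],
    [0.0, 0.0, 2.0, -72.0],
    [0.0, 0.0, 0.0, 1.0],
]

origin = voxel_to_world(example_affine, (0, 0, 0))
point = voxel_to_world(example_affine, (45, 63, 36))
```

In TorchIO, the corresponding matrix is exposed as the `affine` attribute of an image.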

### Public Medical Imaging Datasets

Large-scale public datasets from medical imaging challenges and research initiatives, providing access to real clinical data for development and validation.

```python { .api }
class IXI(SubjectsDataset):
    """
    IXI dataset - brain MR images from healthy subjects.

    Collection of brain MRI scans from nearly 600 healthy subjects,
    including T1, T2, PD-weighted, MRA, and DTI images.

    Parameters:
    - root: Root directory for dataset storage
    - download: Whether to download if not present
    """
    def __init__(self, root: TypePath, download: bool = True): ...

class IXITiny(SubjectsDataset):
    """
    Tiny version of IXI dataset for testing and development.

    Small subset of IXI dataset with reduced data size,
    ideal for quick testing and development workflows.

    Parameters:
    - root: Root directory for dataset storage
    - download: Whether to download if not present
    """
    def __init__(self, root: TypePath, download: bool = True): ...

class RSNAMICCAI(SubjectsDataset):
    """
    RSNA-MICCAI Brain Tumor Radiogenomic Classification dataset.

    Multiparametric brain MRI scans (FLAIR, T1w, T1wCE, T2w) labeled with
    MGMT promoter methylation status, from the RSNA-ASNR-MICCAI challenge.

    Parameters:
    - root: Root directory for dataset storage
    - download: Whether to download if not present
    """
    def __init__(self, root: TypePath, download: bool = True): ...

class RSNACervicalSpineFracture(SubjectsDataset):
    """
    RSNA Cervical Spine Fracture dataset.

    CT images of the cervical spine with fracture annotations
    from the RSNA cervical spine fracture detection challenge.

    Parameters:
    - root: Root directory for dataset storage
    - download: Whether to download if not present
    """
    def __init__(self, root: TypePath, download: bool = True): ...

class EPISURG(SubjectsDataset):
    """
    EPISURG dataset of epilepsy surgery patients.

    T1-weighted MRI from epilepsy patients who underwent resective
    brain surgery, with manual resection cavity segmentations for
    a subset of the postoperative scans.

    Parameters:
    - root: Root directory for dataset storage
    - download: Whether to download if not present
    """
    def __init__(self, root: TypePath, download: bool = True): ...
```
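
A common pattern with the `root` and `download` parameters is to request a download only when no local copy is present. The helper below is a hypothetical sketch (it is not part of TorchIO) and assumes that a non-empty directory means the data is already there; it does not verify that the files are complete.

```python
import tempfile
from pathlib import Path

def needs_download(root) -> bool:
    """Return True when `root` is missing or contains no entries."""
    root = Path(root)
    return not root.is_dir() or not any(root.iterdir())

with tempfile.TemporaryDirectory() as tmp:
    dataset_root = Path(tmp) / 'ixi'
    missing = needs_download(dataset_root)        # directory does not exist yet
    dataset_root.mkdir()
    empty = needs_download(dataset_root)          # exists but holds nothing
    (dataset_root / 'placeholder.nii.gz').touch() # placeholder file, not real data
    populated = needs_download(dataset_root)
```

The result could then be passed as the `download` argument, for example `download=needs_download(data_dir)`.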

### ITK-SNAP Sample Datasets

Sample datasets distributed with ITK-SNAP software, commonly used for teaching and demonstration purposes.

```python { .api }
class BrainTumor(Subject):
    """
    Brain tumor segmentation example from ITK-SNAP.

    MRI brain scan with corresponding tumor segmentation,
    commonly used for teaching medical image segmentation.
    """
    def __init__(self): ...

class T1T2(Subject):
    """
    T1 and T2 weighted brain MRI pair from ITK-SNAP.

    Demonstrates multi-contrast brain imaging with
    T1-weighted and T2-weighted MRI sequences.
    """
    def __init__(self): ...

class AorticValve(Subject):
    """
    Cardiac aortic valve imaging example from ITK-SNAP.

    3D cardiac imaging data focused on aortic valve anatomy,
    useful for cardiac image analysis applications.
    """
    def __init__(self): ...
```

### MedMNIST 3D Datasets

3D datasets from the MedMNIST collection, providing standardized, lightweight benchmarks for 3D medical image classification.

```python { .api }
class OrganMNIST3D(SubjectsDataset):
    """
    3D organ classification dataset.

    Abdominal CT volumes, each cropped around one of 11 body organs
    and labeled with the organ class. Derived from the LiTS
    (Liver Tumor Segmentation) benchmark.

    Parameters:
    - root: Root directory for dataset storage
    - split: Dataset split ('train', 'val', 'test')
    - download: Whether to download if not present
    """
    def __init__(
        self,
        root: TypePath,
        split: str = 'train',
        download: bool = True
    ): ...

class NoduleMNIST3D(SubjectsDataset):
    """
    3D lung nodule classification dataset.

    Thoracic CT volumes of pulmonary nodules from LIDC-IDRI,
    labeled for binary malignancy classification.

    Parameters:
    - root: Root directory for dataset storage
    - split: Dataset split ('train', 'val', 'test')
    - download: Whether to download if not present
    """
    def __init__(
        self,
        root: TypePath,
        split: str = 'train',
        download: bool = True
    ): ...

class AdrenalMNIST3D(SubjectsDataset):
    """
    3D adrenal gland classification dataset.

    Adrenal gland shape masks derived from CT scans, labeled
    for binary classification (normal gland vs. adrenal mass).
    """
    def __init__(
        self,
        root: TypePath,
        split: str = 'train',
        download: bool = True
    ): ...

class FractureMNIST3D(SubjectsDataset):
    """
    3D rib fracture classification dataset.

    Chest CT volumes of rib fractures from the RibFrac dataset,
    labeled with one of three fracture types.
    """
    def __init__(
        self,
        root: TypePath,
        split: str = 'train',
        download: bool = True
    ): ...

class VesselMNIST3D(SubjectsDataset):
    """
    3D brain vessel classification dataset.

    Vessel segment shapes from the IntrA dataset, labeled
    for binary classification (healthy vessel vs. aneurysm).
    """
    def __init__(
        self,
        root: TypePath,
        split: str = 'train',
        download: bool = True
    ): ...

class SynapseMNIST3D(SubjectsDataset):
    """
    3D synapse classification dataset.

    Electron microscopy volumes of synapses, labeled
    for binary classification (excitatory vs. inhibitory).
    """
    def __init__(
        self,
        root: TypePath,
        split: str = 'train',
        download: bool = True
    ): ...
```
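
Because every class in this group takes a `split` argument, the three splits can be built in one pass. The sketch below is illustrative only: `load_splits` is a hypothetical helper, and `FakeMNIST3D` is a stand-in class so the example runs without downloading anything.

```python
def load_splits(make_dataset, splits=('train', 'val', 'test')):
    """Build one dataset per split by calling `make_dataset(split=...)`."""
    return {split: make_dataset(split=split) for split in splits}

class FakeMNIST3D:
    """Stand-in used only to make this sketch runnable without downloads."""
    def __init__(self, split):
        self.split = split

datasets = load_splits(FakeMNIST3D)
split_names = list(datasets)
```

With a real dataset, `make_dataset` could be a `functools.partial` of one of the classes above with the remaining arguments already bound.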

### Research and Challenge Datasets

```python { .api }
class FPG(Subject):
    """
    MRI of Fernando Perez-Garcia, TorchIO's primary author.

    Single-subject T1-weighted brain MRI with a corresponding
    parcellation, useful for development and testing purposes.

    Parameters:
    - load_all: Whether to load additional images beyond the T1 and parcellation
    """
    def __init__(self, load_all: bool = False): ...

class Slicer(Subject):
    """
    3D Slicer sample data.

    Sample medical imaging data distributed with 3D Slicer software,
    covering various imaging modalities and anatomical regions.

    Parameters:
    - name: Name of the 3D Slicer sample volume to load
    """
    def __init__(self, name: str = 'MRHead'): ...

class BITE3(SubjectsDataset):
    """
    BITE3 dataset for brain tumor imaging.

    Subset of the BITE (Brain Images of Tumors for Evaluation) database:
    MR images of brain tumor patients acquired before and after resection.

    Parameters:
    - root: Root directory for dataset storage
    - download: Whether to download if not present
    """
    def __init__(self, root: TypePath, download: bool = True): ...

class CtRate(SubjectsDataset):
    """
    CT-RATE dataset.

    Chest CT volumes paired with radiology reports,
    useful for joint image and text analysis.

    Parameters:
    - root: Root directory for dataset storage
    - download: Whether to download if not present
    """
    def __init__(self, root: TypePath, download: bool = True): ...
```

### Synthetic and Testing Datasets

Synthetic datasets for testing, development, and validation of image processing algorithms.

```python { .api }
class ZonePlate(Subject):
    """
    Synthetic zone plate for testing and development.

    Mathematical zone plate pattern useful for testing
    image processing algorithms, transforms, and visualization.

    Parameters:
    - shape: Image shape (default: (100, 100, 100))
    - spacing: Voxel spacing (default: (1, 1, 1))
    """
    def __init__(
        self,
        shape: tuple[int, int, int] = (100, 100, 100),
        spacing: tuple[float, float, float] = (1, 1, 1)
    ): ...
```
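
For intuition, a zone plate is a radially symmetric pattern whose spatial frequency grows with distance from the center, which makes aliasing and interpolation artifacts easy to spot. The plain-Python sketch below uses `cos(k * r^2)` as an assumed formula; it is not necessarily the one `ZonePlate` uses, and `zone_plate_value`, `make_zone_plate`, and `k` are hypothetical names.

```python
import math

def zone_plate_value(index, shape, k=0.05):
    """Intensity at a voxel index: cos(k * r^2), with r measured from the grid center."""
    # Squared distance from the geometric center of the grid, in voxels
    r2 = sum((i - (n - 1) / 2) ** 2 for i, n in zip(index, shape))
    return math.cos(k * r2)

def make_zone_plate(shape=(9, 9, 9), k=0.05):
    """Build the volume as nested lists indexed [x][y][z]."""
    nx, ny, nz = shape
    return [
        [[zone_plate_value((x, y, z), shape, k) for z in range(nz)]
         for y in range(ny)]
        for x in range(nx)
    ]

volume = make_zone_plate()
center = volume[4][4][4]  # r = 0 at the center of a 9x9x9 grid
```

The pattern is symmetric about the center, so opposite voxels such as `volume[0][4][4]` and `volume[8][4][4]` hold the same value.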

### Usage Examples

#### Loading and Using Public Datasets

```python
import torchio as tio
from pathlib import Path

# Download and load IXI dataset
data_dir = Path('./medical_data')
ixi_dataset = tio.datasets.IXI(root=data_dir, download=True)

print(f"IXI dataset contains {len(ixi_dataset)} subjects")

# Access individual subjects
subject = ixi_dataset[0]
print(f"Subject keys: {list(subject.keys())}")

# Use with transforms and data loaders
transforms = tio.Compose([
    tio.ToCanonical(),
    tio.Resample(2),  # Downsample for faster processing
    tio.ZNormalization(),
])

ixi_dataset.set_transform(transforms)
loader = tio.SubjectsLoader(ixi_dataset, batch_size=4, shuffle=True)

for batch in loader:
    # Process batch of subjects
    pass
```
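
For reference, z-normalization rescales intensities to zero mean and unit standard deviation. The sketch below shows the arithmetic in plain Python, assuming the population standard deviation; the estimator and masking options used by `tio.ZNormalization` may differ, and `z_normalize` is a hypothetical helper.

```python
import statistics

def z_normalize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError('Cannot z-normalize constant intensities')
    return [(v - mean) / std for v in values]

intensities = [100.0, 150.0, 200.0, 250.0, 300.0]
normalized = z_normalize(intensities)
```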

#### Working with Brain Templates

```python
import torchio as tio

# Load standard brain template
colin27 = tio.datasets.Colin27()

# Use as reference grid for spatial normalization
def create_normalization_pipeline(reference_template):
    return tio.Compose([
        tio.ToCanonical(),
        # Resample onto the template's voxel grid (no registration is estimated)
        tio.Resample(reference_template['t1']),
        tio.ZNormalization(),
    ])

normalization = create_normalization_pipeline(colin27)

# Apply to patient data
patient = tio.Subject(t1=tio.ScalarImage('patient_brain.nii.gz'))
normalized = normalization(patient)
```

#### MedMNIST 3D for Benchmarking

```python
import torchio as tio

# Load 3D medical classification dataset
organ_train = tio.datasets.OrganMNIST3D(
    root='./medmnist_data',
    split='train',
    download=True
)

organ_val = tio.datasets.OrganMNIST3D(
    root='./medmnist_data',
    split='val',
    download=False
)

# Create augmentation for training
train_transform = tio.Compose([
    tio.RandomFlip(),
    tio.RandomAffine(degrees=(-5, 5)),
    tio.RandomNoise(std=(0, 0.05)),
])

# Apply transforms
organ_train.set_transform(train_transform)

# Use for model training
train_loader = tio.SubjectsLoader(
    organ_train,
    batch_size=16,
    shuffle=True,
    num_workers=4
)

for batch in train_loader:
    images = batch['image'][tio.DATA]
    labels = batch['label'][tio.DATA]
    # Train model
```
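
To make the `std=(0, 0.05)` argument concrete: a range means one standard deviation is drawn uniformly from it for each application, and zero-mean Gaussian noise with that deviation is added to every value. The plain-Python sketch below illustrates the idea on a flat list of intensities; `add_random_noise` is a hypothetical helper, not TorchIO's implementation.

```python
import random

def add_random_noise(values, std_range=(0.0, 0.05), rng=None):
    """Add zero-mean Gaussian noise with a std sampled uniformly from `std_range`."""
    rng = rng or random.Random()
    std = rng.uniform(*std_range)  # one standard deviation per call
    noisy = [v + rng.gauss(0.0, std) for v in values]
    return noisy, std

rng = random.Random(42)  # seeded so repeated runs give the same noise
clean = [0.0, 0.25, 0.5, 0.75, 1.0]
noisy, sampled_std = add_random_noise(clean, rng=rng)
```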

#### Synthetic Data for Testing

```python
import torchio as tio

# Create synthetic test data
zone_plate = tio.datasets.ZonePlate(
    shape=(64, 64, 64),
    spacing=(1, 1, 1)
)

# Test transforms on synthetic data
test_transforms = tio.Compose([
    tio.RandomAffine(degrees=(-10, 10)),
    tio.RandomElasticDeformation(),
    tio.RandomNoise(std=0.1),
])

augmented_zone_plate = test_transforms(zone_plate)

# Verify transform behavior
print(f"Original shape: {zone_plate.shape}")
print(f"Augmented shape: {augmented_zone_plate.shape}")
```