# Data Processing and Loading

Comprehensive data pipeline including datasets, transforms, augmentation strategies, and high-performance data loaders optimized for computer vision training and inference.

## Capabilities

### Data Loader Creation

High-performance data loaders with support for distributed training, mixed precision, and advanced augmentation techniques.

```python { .api }
def create_loader(
    dataset,
    input_size: Union[int, tuple],
    batch_size: int,
    is_training: bool = False,
    use_prefetcher: bool = False,
    no_aug: bool = False,
    re_prob: float = 0.0,
    re_mode: str = 'const',
    re_count: int = 1,
    re_num_splits: int = 0,
    scale: tuple = (0.08, 1.0),
    ratio: tuple = (3./4., 4./3.),
    hflip: float = 0.5,
    vflip: float = 0.0,
    color_jitter: float = 0.4,
    auto_augment: str = None,
    num_aug_repeats: int = 0,
    num_aug_splits: int = 0,
    interpolation: str = 'bilinear',
    mean: tuple = IMAGENET_DEFAULT_MEAN,
    std: tuple = IMAGENET_DEFAULT_STD,
    num_workers: int = 1,
    distributed: bool = False,
    collate_fn: Callable = None,
    pin_memory: bool = False,
    use_multi_epochs_loader: bool = False,
    persistent_workers: bool = True,
    worker_seeding: str = 'all',
    **kwargs
) -> torch.utils.data.DataLoader:
    """
    Create a DataLoader with TIMM's optimized configuration.

    Args:
        dataset: Dataset instance
        input_size: Target input size (int or tuple)
        batch_size: Batch size for training/inference
        is_training: Training mode with augmentations
        use_prefetcher: Use CUDA prefetcher for performance
        no_aug: Disable augmentations
        re_prob: Random erasing probability
        re_mode: Random erasing mode ('const', 'rand', 'pixel')
        re_count: Random erasing count
        re_num_splits: Random erasing number of splits
        scale: Random resized crop scale range
        ratio: Random resized crop aspect ratio range
        hflip: Horizontal flip probability
        vflip: Vertical flip probability
        color_jitter: Color jitter factor
        auto_augment: AutoAugment policy ('original', 'originalr', 'v0', 'v0r')
        num_aug_repeats: Number of augmentation repetitions
        num_aug_splits: Number of augmentation splits
        interpolation: Resize interpolation method
        mean: Normalization mean values
        std: Normalization standard deviation values
        num_workers: Number of data loading workers
        distributed: Enable distributed sampler
        collate_fn: Custom collate function
        pin_memory: Pin memory for GPU transfer
        use_multi_epochs_loader: Use multi-epoch loader for efficiency
        persistent_workers: Keep workers alive between epochs
        worker_seeding: Worker random seeding strategy

    Returns:
        Configured DataLoader instance
    """
```
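
When `distributed` is enabled, each process should read a distinct shard of the dataset. The sketch below illustrates that idea, assuming a simple strided split with wrap-around padding and no shuffling. `shard_indices` is a hypothetical helper written for this illustration, not a timm function, and the sampler that `create_loader` actually installs may pad or shuffle differently.

```python
from typing import List


def shard_indices(num_samples: int, rank: int, world_size: int) -> List[int]:
    """Return the dataset indices one rank would read (no shuffling).

    Assumes num_samples >= world_size so that a single wrap-around
    is enough to pad the index list.
    """
    indices = list(range(num_samples))
    # Ceiling division: every rank receives the same number of samples.
    per_rank = -(-num_samples // world_size)
    total = per_rank * world_size
    # Pad by repeating indices from the start of the dataset.
    indices += indices[: total - num_samples]
    # Strided slice: rank r takes positions r, r + world_size, ...
    return indices[rank:total:world_size]


if __name__ == "__main__":
    for r in range(4):
        print(r, shard_indices(10, r, 4))
```

With 10 samples and 4 ranks, two indices are repeated so that each rank gets three samples and all ranks run the same number of steps.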

### Transform Creation

Comprehensive transform pipelines with support for training and inference configurations.

```python { .api }
def create_transform(
    input_size: Union[int, tuple],
    is_training: bool = False,
    use_prefetcher: bool = False,
    no_aug: bool = False,
    scale: tuple = (0.08, 1.0),
    ratio: tuple = (3./4., 4./3.),
    hflip: float = 0.5,
    vflip: float = 0.0,
    color_jitter: float = 0.4,
    auto_augment: str = None,
    interpolation: str = 'bilinear',
    mean: tuple = IMAGENET_DEFAULT_MEAN,
    std: tuple = IMAGENET_DEFAULT_STD,
    re_prob: float = 0.0,
    re_mode: str = 'const',
    re_count: int = 1,
    re_num_splits: int = 0,
    crop_pct: float = None,
    tf_preprocessing: bool = False,
    separate: bool = False,
    **kwargs
):
    """
    Create image transform pipeline.

    Args:
        input_size: Target input size for transforms
        is_training: Use training transforms with augmentation
        use_prefetcher: Skip normalization for CUDA prefetcher
        no_aug: Disable all augmentations
        scale: Random resized crop scale range
        ratio: Random resized crop aspect ratio range
        hflip: Horizontal flip probability
        vflip: Vertical flip probability
        color_jitter: Color jitter strength
        auto_augment: AutoAugment policy name
        interpolation: Resize interpolation method
        mean: Normalization mean values
        std: Normalization standard deviation values
        re_prob: Random erasing probability
        re_mode: Random erasing mode
        re_count: Random erasing count
        re_num_splits: Random erasing splits
        crop_pct: Center crop percentage
        tf_preprocessing: Use TensorFlow-style preprocessing
        separate: Return transforms as separate list

    Returns:
        Transform function or list of transforms
    """
```
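
The `crop_pct` argument is commonly read as the fraction of a resized image that the center crop keeps at inference time. The sketch below works under that assumption: the image is first resized so that the final crop covers `crop_pct` of it, then center-cropped to `input_size`. The helper name and the 0.875 default are assumptions made for illustration and are not taken from the listing above.

```python
import math
from typing import Tuple


def eval_resize_and_crop(input_size: int, crop_pct: float = 0.875) -> Tuple[int, int]:
    """Return (resize_size, crop_size) for a resize-then-center-crop pipeline."""
    if crop_pct <= 0.0:
        raise ValueError("crop_pct must be positive")
    # Resize the shorter side so that input_size is crop_pct of it.
    resize_size = math.floor(input_size / crop_pct)
    return resize_size, input_size


if __name__ == "__main__":
    print(eval_resize_and_crop(224))
```

Under these assumptions, a 224-pixel input with `crop_pct=0.875` corresponds to resizing to 256 pixels before cropping.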

### Dataset Creation

Factory function for creating various dataset types including ImageNet, CIFAR, and custom datasets.

```python { .api }
def create_dataset(
    name: str,
    root: str,
    split: str = 'validation',
    is_training: bool = False,
    class_map: dict = None,
    load_bytes: bool = False,
    img_mode: str = 'RGB',
    transform: Callable = None,
    target_transform: Callable = None,
    **kwargs
):
    """
    Create dataset instance.

    Args:
        name: Dataset name or path pattern
        root: Root directory containing dataset
        split: Dataset split ('train', 'validation', 'test')
        is_training: Training mode configuration
        class_map: Custom class mapping
        load_bytes: Load images as bytes instead of PIL
        img_mode: Image mode ('RGB', 'L', etc.)
        transform: Image transforms
        target_transform: Target/label transforms
        **kwargs: Dataset-specific arguments

    Returns:
        Dataset instance
    """
```
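
For folder-style datasets, class labels are usually derived from the directory layout, with one subdirectory per class. The sketch below shows how such a class-to-index mapping could be built when no `class_map` is supplied. `build_class_to_idx` is a hypothetical helper, and plain alphabetical ordering is an assumption; the ordering timm applies may differ.

```python
import os
import tempfile
from typing import Dict


def build_class_to_idx(root: str) -> Dict[str, int]:
    """Map each immediate subdirectory name of `root` to a class index."""
    class_names = sorted(
        entry.name for entry in os.scandir(root) if entry.is_dir()
    )
    return {name: idx for idx, name in enumerate(class_names)}


if __name__ == "__main__":
    # Build a throwaway directory tree to demonstrate the mapping.
    with tempfile.TemporaryDirectory() as root:
        for name in ("dog", "cat", "bird"):
            os.makedirs(os.path.join(root, name))
        print(build_class_to_idx(root))
```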

## Dataset Classes

### Core Dataset Classes

```python { .api }
class ImageDataset(torch.utils.data.Dataset):
    """
    Standard image dataset for classification tasks.

    Args:
        root: Root directory path
        reader: Image reader instance
        class_to_idx: Class name to index mapping
        transform: Image transforms
        target_transform: Target transforms
    """

    def __init__(
        self,
        root: str,
        reader: Optional[Any] = None,
        class_to_idx: Optional[dict] = None,
        transform: Optional[Callable] = None,
        target_transform: Optional[Callable] = None
    ): ...

class IterableImageDataset(torch.utils.data.IterableDataset):
    """
    Iterable image dataset for streaming large datasets.

    Args:
        root: Root directory or file pattern
        reader: Image reader instance
        split: Dataset split name
        is_training: Training mode
        batch_size: Batch size for iteration
        transform: Image transforms
    """

    def __init__(
        self,
        root: str,
        reader: Optional[Any] = None,
        split: str = 'train',
        is_training: bool = False,
        batch_size: Optional[int] = None,
        transform: Optional[Callable] = None
    ): ...

class AugMixDataset(torch.utils.data.Dataset):
    """
    Dataset wrapper for AugMix augmentation technique.

    Args:
        dataset: Base dataset
        num_splits: Number of augmentation splits
        alpha: Mixing parameter
        width: Number of augmentation chains
        depth: Depth of augmentation chains
        blended: Use blended mixing
    """

    def __init__(
        self,
        dataset,
        num_splits: int = 2,
        alpha: float = 1.0,
        width: int = 3,
        depth: int = -1,
        blended: bool = False
    ): ...
```

## Transform Classes

### Basic Transforms

```python { .api }
class ToTensor:
    """Convert PIL Image to tensor."""

    def __call__(self, pic): ...

class ToNumpy:
    """Convert tensor to numpy array."""

    def __call__(self, tensor): ...

class RandomResizedCropAndInterpolation:
    """
    Random resized crop with configurable interpolation.

    Args:
        size: Target output size
        scale: Random crop scale range
        ratio: Random crop aspect ratio range
        interpolation: Interpolation method
    """

    def __init__(
        self,
        size: Union[int, tuple],
        scale: tuple = (0.08, 1.0),
        ratio: tuple = (3./4., 4./3.),
        interpolation: str = 'bilinear'
    ): ...
```
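
The `scale` and `ratio` ranges determine which region of the source image a random resized crop selects before it is resized to `size`. The sketch below shows the usual sampling scheme, assuming the crop area is drawn uniformly from `scale` (as a fraction of the image area) and the aspect ratio log-uniformly from `ratio`. `sample_crop_box` is a hypothetical helper, and its whole-image fallback is a simplification of what libraries typically do.

```python
import math
import random
from typing import Optional, Tuple


def sample_crop_box(
    width: int,
    height: int,
    scale: Tuple[float, float] = (0.08, 1.0),
    ratio: Tuple[float, float] = (3.0 / 4.0, 4.0 / 3.0),
    attempts: int = 10,
    rng: Optional[random.Random] = None,
) -> Tuple[int, int, int, int]:
    """Sample a crop box as (left, top, crop_width, crop_height)."""
    if rng is None:
        rng = random.Random()
    area = width * height
    log_ratio = (math.log(ratio[0]), math.log(ratio[1]))
    for _ in range(attempts):
        # Area fraction is uniform; aspect ratio is uniform in log space.
        target_area = area * rng.uniform(*scale)
        aspect = math.exp(rng.uniform(*log_ratio))
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            left = rng.randint(0, width - w)
            top = rng.randint(0, height - h)
            return left, top, w, h
    # Simplified fallback: use the whole image if no sample fits.
    return 0, 0, width, height


if __name__ == "__main__":
    print(sample_crop_box(640, 480, rng=random.Random(0)))
```

Sampling the aspect ratio in log space makes a ratio of 3/4 as likely as its inverse 4/3, so wide and tall crops are treated symmetrically.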

### Augmentation Transforms

```python { .api }
class RandAugment:
    """
    RandAugment augmentation.

    Args:
        ops: List of augmentation operations
        num_layers: Number of augmentation layers to apply
        magnitude: Augmentation magnitude
    """

    def __init__(
        self,
        ops: List[str],
        num_layers: int = 2,
        magnitude: int = 9
    ): ...

class AutoAugment:
    """
    AutoAugment data augmentation.

    Args:
        policy: AutoAugment policy name
    """

    def __init__(self, policy: str = 'original'): ...

class TrivialAugmentWide:
    """TrivialAugment Wide augmentation strategy."""

    def __init__(self): ...

class Mixup:
    """
    Mixup data augmentation.

    Args:
        mixup_alpha: Mixup interpolation coefficient
        cutmix_alpha: CutMix interpolation coefficient
        cutmix_minmax: CutMix min/max box size ratios
        prob: Probability of applying mixup/cutmix
        switch_prob: Probability of switching between mixup and cutmix
        mode: Mixup mode ('batch', 'pair', 'elem')
        correct_lam: Apply lambda correction
        label_smoothing: Label smoothing value
        num_classes: Number of classes
    """

    def __init__(
        self,
        mixup_alpha: float = 1.0,
        cutmix_alpha: float = 0.0,
        cutmix_minmax: Optional[tuple] = None,
        prob: float = 1.0,
        switch_prob: float = 0.5,
        mode: str = 'batch',
        correct_lam: bool = True,
        label_smoothing: float = 0.1,
        num_classes: int = 1000
    ): ...
```
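
To make the roles of `mixup_alpha`, `label_smoothing`, and `num_classes` more concrete, here is a small pure-Python sketch of the mixup arithmetic for a single pair of samples. It assumes the standard formulation, in which inputs and smoothed one-hot targets are blended with the same coefficient `lam`, typically drawn from a Beta(alpha, alpha) distribution. The function names are hypothetical, and the `Mixup` class itself operates on batched tensors and also handles CutMix.

```python
import random
from typing import List, Sequence, Tuple


def one_hot(label: int, num_classes: int, smoothing: float = 0.0) -> List[float]:
    """Smoothed one-hot target: mass is spread evenly over all classes."""
    off = smoothing / num_classes
    on = 1.0 - smoothing + off
    return [on if i == label else off for i in range(num_classes)]


def mixup_pair(
    x_a: Sequence[float],
    y_a: int,
    x_b: Sequence[float],
    y_b: int,
    lam: float,
    num_classes: int,
    smoothing: float = 0.0,
) -> Tuple[List[float], List[float]]:
    """Blend two samples and their targets with the same coefficient."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    t_a = one_hot(y_a, num_classes, smoothing)
    t_b = one_hot(y_b, num_classes, smoothing)
    y = [lam * a + (1.0 - lam) * b for a, b in zip(t_a, t_b)]
    return x, y


if __name__ == "__main__":
    lam = random.betavariate(0.8, 0.8)  # alpha = 0.8 for both Beta parameters
    print(mixup_pair([1.0, 0.0], 0, [0.0, 1.0], 1, lam, num_classes=2))
```

Because the mixed target is a soft distribution rather than a class index, the training loss has to accept soft targets.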

## Data Configuration

### Constants and Defaults

```python { .api }
# ImageNet normalization constants
IMAGENET_DEFAULT_MEAN: tuple = (0.485, 0.456, 0.406)
IMAGENET_DEFAULT_STD: tuple = (0.229, 0.224, 0.225)

# ImageNet Inception normalization
IMAGENET_INCEPTION_MEAN: tuple = (0.5, 0.5, 0.5)
IMAGENET_INCEPTION_STD: tuple = (0.5, 0.5, 0.5)

# OpenAI CLIP normalization
OPENAI_CLIP_MEAN: tuple = (0.48145466, 0.4578275, 0.40821073)
OPENAI_CLIP_STD: tuple = (0.26862954, 0.26130258, 0.27577711)
```
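
These constants are per-channel statistics used in the usual `(value - mean) / std` normalization. The sketch below applies that formula to a single RGB pixel, assuming channel values have already been scaled to the range [0, 1]. `normalize_pixel` is a hypothetical helper for illustration, and the constants are copied from the listing above so that the snippet is self-contained.

```python
from typing import Sequence, Tuple

IMAGENET_DEFAULT_MEAN = (0.485, 0.456, 0.406)
IMAGENET_DEFAULT_STD = (0.229, 0.224, 0.225)
IMAGENET_INCEPTION_MEAN = (0.5, 0.5, 0.5)
IMAGENET_INCEPTION_STD = (0.5, 0.5, 0.5)


def normalize_pixel(
    rgb: Sequence[float],
    mean: Sequence[float] = IMAGENET_DEFAULT_MEAN,
    std: Sequence[float] = IMAGENET_DEFAULT_STD,
) -> Tuple[float, ...]:
    """Normalize one RGB pixel (values in [0, 1]) channel by channel."""
    return tuple((c - m) / s for c, m, s in zip(rgb, mean, std))


if __name__ == "__main__":
    print(normalize_pixel((1.0, 1.0, 1.0)))
    print(normalize_pixel((1.0, 1.0, 1.0), IMAGENET_INCEPTION_MEAN, IMAGENET_INCEPTION_STD))
```

With the Inception constants, the formula maps the range [0, 1] linearly onto [-1, 1].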

### Configuration Functions

```python { .api }
def resolve_data_config(
    args=None,
    pretrained_cfg: dict = None,
    model: torch.nn.Module = None,
    use_test_size: bool = False,
    verbose: bool = False
) -> dict:
    """
    Resolve data configuration from model, args, or defaults.

    Args:
        args: Argument namespace with data config
        pretrained_cfg: Pretrained model configuration
        model: Model instance to extract config from
        use_test_size: Use test/inference input size
        verbose: Print resolved configuration

    Returns:
        Dictionary with resolved data configuration
    """

def resolve_model_data_config(
    model: torch.nn.Module,
    args=None,
    pretrained_cfg: dict = None,
    use_test_size: bool = False,
    verbose: bool = False
) -> dict:
    """
    Resolve data configuration specifically from model.

    Args:
        model: Model instance
        args: Additional arguments
        pretrained_cfg: Pretrained configuration override
        use_test_size: Use inference input size
        verbose: Print configuration details

    Returns:
        Model-specific data configuration
    """
```
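
Resolving a data configuration means merging several sources into one dictionary. The sketch below shows one plausible precedence, assumed here for illustration: explicit arguments first, then the pretrained configuration, then built-in defaults. The function name, the dictionary-based `args`, and the default values are all hypothetical; the listing above does not specify the exact keys or fallback values that `resolve_data_config` uses.

```python
from typing import Any, Dict, Optional

# Hypothetical fallback values, chosen only for this sketch.
DEFAULTS_SKETCH: Dict[str, Any] = {
    "input_size": (3, 224, 224),
    "interpolation": "bilinear",
    "mean": (0.485, 0.456, 0.406),
    "std": (0.229, 0.224, 0.225),
    "crop_pct": 0.875,
}


def resolve_config_sketch(
    args: Optional[Dict[str, Any]] = None,
    pretrained_cfg: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge config sources: args, then pretrained_cfg, then defaults."""
    args = args or {}
    pretrained_cfg = pretrained_cfg or {}
    resolved: Dict[str, Any] = {}
    for key, default in DEFAULTS_SKETCH.items():
        if args.get(key) is not None:
            resolved[key] = args[key]            # explicit override wins
        elif pretrained_cfg.get(key) is not None:
            resolved[key] = pretrained_cfg[key]  # model-specific value
        else:
            resolved[key] = default              # library-wide fallback
    return resolved


if __name__ == "__main__":
    print(resolve_config_sketch({"crop_pct": 1.0}, {"mean": (0.5, 0.5, 0.5)}))
```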

## Data Readers

### Image Readers

```python { .api }
def create_reader(
    name: str,
    root: str,
    split: str = 'train',
    **kwargs
):
    """
    Create image reader for different data formats.

    Args:
        name: Reader type ('', 'hfds', 'tfds', 'wds')
        root: Data root path
        split: Dataset split
        **kwargs: Reader-specific arguments

    Returns:
        Configured reader instance
    """

def get_img_extensions() -> set:
    """
    Get supported image file extensions.

    Returns:
        Set of supported extensions
    """

def is_img_extension(filename: str) -> bool:
    """
    Check if filename has supported image extension.

    Args:
        filename: File name to check

    Returns:
        True if supported image format
    """
```
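
An extension check of this kind usually reduces to a case-insensitive set lookup on the file suffix. A minimal sketch under that assumption follows. The extension set is a hypothetical subset chosen for illustration rather than the set that `get_img_extensions` returns, and `has_img_extension` is not a timm function.

```python
import os
from typing import Set

# Hypothetical subset of extensions, used only in this sketch.
IMG_EXTENSIONS_SKETCH: Set[str] = {".png", ".jpg", ".jpeg"}


def has_img_extension(filename: str, extensions: Set[str] = IMG_EXTENSIONS_SKETCH) -> bool:
    """Return True if the file suffix is in `extensions` (case-insensitive)."""
    _, ext = os.path.splitext(filename)
    return ext.lower() in extensions


if __name__ == "__main__":
    for name in ("images/cat.JPG", "notes.txt", "no_extension"):
        print(name, has_img_extension(name))
```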

## Usage Examples

### Basic Data Pipeline

```python
import timm
from timm.data import create_loader, create_transform, create_dataset

# Create transforms for training and validation
train_transform = create_transform(
    input_size=224,
    is_training=True,
    hflip=0.5,
    color_jitter=0.4,
    auto_augment='original'
)

val_transform = create_transform(
    input_size=224,
    is_training=False
)

# Create datasets
train_dataset = create_dataset(
    'imagefolder',
    root='/path/to/train',
    transform=train_transform
)

val_dataset = create_dataset(
    'imagefolder',
    root='/path/to/val',
    transform=val_transform
)

# Create data loaders
train_loader = create_loader(
    train_dataset,
    input_size=224,
    batch_size=32,
    is_training=True,
    num_workers=4
)

val_loader = create_loader(
    val_dataset,
    input_size=224,
    batch_size=64,
    is_training=False,
    num_workers=4
)
```

### Advanced Augmentation

```python
from timm.data import Mixup, create_loader

# Create mixup augmentation
mixup = Mixup(
    mixup_alpha=0.8,
    cutmix_alpha=1.0,
    prob=1.0,
    switch_prob=0.5,
    mode='batch',
    label_smoothing=0.1,
    num_classes=1000
)

# Create loader with advanced augmentation.
# `dataset` is an existing dataset instance, e.g. one returned by create_dataset.
# Mixup/CutMix settings live on the Mixup object above, so they are not
# repeated in the loader arguments.
train_loader = create_loader(
    dataset,
    input_size=224,
    batch_size=32,  # Mixup expects an even batch size
    is_training=True,
    auto_augment='rand-m9-mstd0.5-inc1',
    re_prob=0.25
)

# Apply mixup in training loop
for batch_idx, (inputs, targets) in enumerate(train_loader):
    if mixup is not None:
        # Returns mixed inputs and soft (mixed, label-smoothed) targets
        inputs, targets = mixup(inputs, targets)
    # ... training code
```

## Types

```python { .api }
from typing import Optional, Union, List, Dict, Callable, Any, Tuple
import torch

# Transform and dataset types
TransformType = Callable[[Any], torch.Tensor]
DatasetType = torch.utils.data.Dataset
LoaderType = torch.utils.data.DataLoader

# Data configuration
DataConfig = Dict[str, Any]
AugmentConfig = Dict[str, Any]

# Common type aliases
ImageSize = Union[int, Tuple[int, int]]
NormStats = Tuple[float, float, float]
```