# Lightning

The Deep Learning framework to train, deploy, and ship AI products Lightning fast. Lightning provides a unified interface combining PyTorch Lightning (for high-level model training) with Lightning Fabric (for expert-level control) and data utilities, enabling researchers and practitioners to build production-ready deep learning applications at scale.

## Package Information

- **Package Name**: lightning
- **Package Type**: pypi
- **Language**: Python
- **Installation**: `pip install lightning`

## Core Imports

```python
import lightning as L
```

Main framework components:

```python
from lightning import Trainer, LightningModule, LightningDataModule, Callback
```

Lightweight acceleration:

```python
from lightning import Fabric
```

Utilities:

```python
from lightning import seed_everything
from lightning.pytorch.utilities import disable_possible_user_warnings
```

## Basic Usage

```python
import lightning as L
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, random_split
from torchvision import transforms
from torchvision.datasets import MNIST

# Define a Lightning Module
class LitModel(L.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer_1 = nn.Linear(28 * 28, 128)
        self.layer_2 = nn.Linear(128, 10)

    def forward(self, x):
        x = x.view(x.size(0), -1)
        x = torch.relu(self.layer_1(x))
        x = self.layer_2(x)
        return x

    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = F.cross_entropy(y_hat, y)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters())

# Define a Data Module
class MNISTDataModule(L.LightningDataModule):
    def __init__(self, data_dir: str = './'):
        super().__init__()
        self.data_dir = data_dir
        self.transform = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.1307,), (0.3081,))
        ])

    def prepare_data(self):
        MNIST(self.data_dir, train=True, download=True)
        MNIST(self.data_dir, train=False, download=True)

    def setup(self, stage: str):
        if stage == "fit":
            mnist_full = MNIST(self.data_dir, train=True, transform=self.transform)
            self.mnist_train, self.mnist_val = random_split(mnist_full, [55000, 5000])
        if stage == "test":
            self.mnist_test = MNIST(self.data_dir, train=False, transform=self.transform)

    def train_dataloader(self):
        return DataLoader(self.mnist_train, batch_size=32)

    def val_dataloader(self):
        return DataLoader(self.mnist_val, batch_size=32)

    def test_dataloader(self):
        return DataLoader(self.mnist_test, batch_size=32)

# Train the model
if __name__ == "__main__":
    model = LitModel()
    datamodule = MNISTDataModule()
    trainer = L.Trainer(max_epochs=10)
    trainer.fit(model, datamodule)
```

## Architecture

Lightning is organized in layers, so users can choose the level of abstraction that fits their task:

- **Lightning Fabric**: Low-level acceleration layer providing expert control over training loops, device management, and distributed strategies
- **PyTorch Lightning**: High-level framework built on Fabric, offering structured training workflows with automatic optimization, logging, and checkpointing
- **Unified Interface**: Single package combining both approaches, allowing users to choose the right abstraction level
- **Data Integration**: Built-in streaming data capabilities through litdata integration
- **Production Features**: Multi-GPU/multi-node training, cloud deployment, extensive logging, and MLOps integrations

This layering lets the same code move from research prototyping to production deployment while staying reusable and scalable.

## Capabilities

### Core Training Components

Essential components for structuring deep learning training: the Trainer orchestrator, LightningModule for model definition, LightningDataModule for data handling, and Callback system for training lifecycle hooks.

```python { .api }
class Trainer:
    def __init__(self, **kwargs): ...
    def fit(self, model, datamodule=None, train_dataloaders=None, val_dataloaders=None, **kwargs): ...
    def test(self, model=None, dataloaders=None, **kwargs): ...
    def predict(self, model=None, dataloaders=None, **kwargs): ...

class LightningModule:
    def __init__(self): ...
    def forward(self, *args, **kwargs): ...
    def training_step(self, batch, batch_idx): ...
    def validation_step(self, batch, batch_idx): ...
    def test_step(self, batch, batch_idx): ...
    def configure_optimizers(self): ...

class LightningDataModule:
    def __init__(self): ...
    def prepare_data(self): ...
    def setup(self, stage: str): ...
    def train_dataloader(self): ...
    def val_dataloader(self): ...
    def test_dataloader(self): ...

class Callback:
    def on_train_start(self, trainer, pl_module): ...
    def on_train_end(self, trainer, pl_module): ...
    def on_train_epoch_start(self, trainer, pl_module): ...
    def on_train_epoch_end(self, trainer, pl_module): ...
```

[Core Training Components](./core-training.md)
156
157
### Lightning Fabric
158
159
Lightweight training acceleration framework providing expert-level control over training loops, device management, and distributed strategies without high-level abstractions.
160
161
```python { .api }
162
class Fabric:
163
def __init__(self, **kwargs): ...
164
def setup(self, model, *optimizers): ...
165
def setup_dataloaders(self, *dataloaders): ...
166
def backward(self, tensor): ...
167
def all_gather(self, tensor): ...
168
def broadcast(self, tensor): ...
169
170
def seed_everything(seed: int): ...
171
def is_wrapped(obj): ...
172
```
173
174
[Lightning Fabric](./fabric.md)

### Callbacks and Lifecycle Hooks

Comprehensive callback system for training lifecycle management including checkpointing, early stopping, learning rate scheduling, monitoring, and optimization callbacks.

```python { .api }
class ModelCheckpoint(Callback):
    def __init__(self, dirpath=None, filename=None, monitor=None, **kwargs): ...

class EarlyStopping(Callback):
    def __init__(self, monitor, patience=3, **kwargs): ...

class LearningRateMonitor(Callback):
    def __init__(self, logging_interval='epoch'): ...

class StochasticWeightAveraging(Callback):
    def __init__(self, swa_lrs=None, **kwargs): ...
```

[Callbacks and Lifecycle Hooks](./callbacks.md)

### Distributed Training Strategies

Multiple strategies for distributed and parallel training including data parallel, distributed data parallel, fully sharded data parallel, model parallel, and specialized strategies for different hardware.

```python { .api }
class DDPStrategy:
    def __init__(self, **kwargs): ...

class FSDPStrategy:
    def __init__(self, **kwargs): ...

class DeepSpeedStrategy:
    def __init__(self, **kwargs): ...

class DataParallelStrategy:
    def __init__(self): ...
```

[Distributed Training Strategies](./strategies.md)

### Hardware Acceleration

Support for various hardware accelerators including CPU, CUDA GPUs, Apple Metal Performance Shaders, and Google TPUs with automatic device detection and optimization.

```python { .api }
class CPUAccelerator:
    def setup_device(self, device): ...

class CUDAAccelerator:
    def setup_device(self, device): ...

class MPSAccelerator:
    def setup_device(self, device): ...

class XLAAccelerator:
    def setup_device(self, device): ...

def find_usable_cuda_devices(num_devices: int = -1): ...
```

[Hardware Acceleration](./accelerators.md)
237
238
### Precision Control and Optimization
239
240
Precision plugins for mixed precision training, quantization, and various floating-point formats to optimize memory usage and training speed while maintaining model quality.
241
242
```python { .api }
243
class MixedPrecision:
244
def __init__(self, precision='16-mixed', **kwargs): ...
245
246
class HalfPrecision:
247
def __init__(self): ...
248
249
class DoublePrecision:
250
def __init__(self): ...
251
252
class BitsandbytesPrecision:
253
def __init__(self, mode='int8', **kwargs): ...
254
```
255
256
[Precision Control](./precision.md)

### Logging and Monitoring

Integration with popular experiment tracking platforms and comprehensive logging capabilities for monitoring training progress, metrics, hyperparameters, and model artifacts.

```python { .api }
class TensorBoardLogger:
    def __init__(self, save_dir, **kwargs): ...

class WandbLogger:
    def __init__(self, project=None, **kwargs): ...

class MLFlowLogger:
    def __init__(self, experiment_name=None, **kwargs): ...

class CSVLogger:
    def __init__(self, save_dir, **kwargs): ...
```

[Logging and Monitoring](./loggers.md)

### Profiling and Performance Analysis

Profiling tools for analyzing training performance, identifying bottlenecks, and optimizing model training efficiency across different hardware configurations.

```python { .api }
class PyTorchProfiler:
    def __init__(self, **kwargs): ...

class AdvancedProfiler:
    def __init__(self, **kwargs): ...

class SimpleProfiler:
    def __init__(self): ...
```

[Profiling and Performance](./profilers.md)

### Data Utilities

Data handling utilities including streaming datasets, combined data loaders, and data processing functions for efficient data pipeline management in large-scale training.

```python { .api }
class StreamingDataset:
    def __init__(self, **kwargs): ...

class CombinedStreamingDataset:
    def __init__(self, datasets, **kwargs): ...

def optimize(fn, inputs, output_dir, **kwargs): ...
def map(fn, inputs, output_dir, **kwargs): ...
```

[Data Utilities](./data.md)

## Utilities

Common utilities for training control and configuration.

```python { .api }
def seed_everything(seed: int, workers: bool = False) -> int: ...
def disable_possible_user_warnings() -> None: ...
```
```

## Types

```python { .api }
from typing import Any, Dict, List, Optional, Union
from torch import Tensor
from torch.nn import Module
from torch.optim import Optimizer
from torch.utils.data import DataLoader

# Core types
STEP_OUTPUT = Union[Tensor, Dict[str, Any]]
TRAIN_DATALOADERS = Union[DataLoader, List[DataLoader], Dict[str, DataLoader]]
EVAL_DATALOADERS = Union[DataLoader, List[DataLoader]]
_EVALUATE_OUTPUT = List[Dict[str, float]]
_PREDICT_OUTPUT = List[Any]

# LR Scheduler configuration
class LRSchedulerConfig:
    scheduler: Any
    interval: str = "epoch"
    frequency: int = 1
    monitor: Optional[str] = None
    strict: bool = True
    name: Optional[str] = None

# Enums
class GradClipAlgorithmType:
    NORM = "norm"
    VALUE = "value"

class LightningEnum:
    pass

# Constants
FLOAT16_EPSILON: float
FLOAT32_EPSILON: float
FLOAT64_EPSILON: float
```