Deep learning framework providing tensor computation with GPU acceleration and dynamic neural networks with automatic differentiation
npx @tessl/cli install tessl/pypi-torch@2.8.00
# PyTorch

PyTorch is a deep learning framework that provides GPU-accelerated tensor computation and dynamic neural networks built on a tape-based autograd system. Its Python-first design lets researchers and developers build and train neural networks in familiar Python syntax, while optimized C++ and CUDA backends handle the performance-critical work.

## Package Information

- **Package Name**: torch
- **Language**: Python
- **Installation**: `pip install torch`
- **GPU Support**: `pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118` (CUDA 11.8 wheels shown; substitute the index URL that matches your CUDA version)

## Core Imports

```python
import torch
```

Common additional imports:

```python
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import DataLoader, Dataset
```

## Basic Usage

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Create tensors: two samples with two features each, and one target per sample
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
y = torch.tensor([[5.0], [6.0]])

# Define a simple neural network
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)

    def forward(self, x):
        return self.linear(x)

# Initialize model, loss function, and optimizer
model = SimpleNet()
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Forward pass
output = model(x)
loss = criterion(output, y)

# Backward pass and optimization
optimizer.zero_grad()
loss.backward()
optimizer.step()

print(f"Loss: {loss.item()}")
# x was created with requires_grad=True, so autograd also populates x.grad
print(f"Gradients: {x.grad}")
```

## Architecture

PyTorch's design centers around dynamic computation graphs and the autograd system:

- **Tensors**: Multi-dimensional arrays with automatic differentiation support
- **Autograd**: Automatic differentiation engine that records operations for backpropagation
- **nn.Module**: Base class for neural network components with parameter management
- **Optimizers**: Algorithms for updating model parameters during training
- **Device Abstraction**: Unified interface for CPU, CUDA, MPS, and XPU backends
- **JIT Compilation**: TorchScript for optimizing models for deployment

This architecture enables rapid prototyping in research while scaling to production deployments across various hardware platforms.

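As a rough illustration of the dynamic graph and autograd ideas described above, the sketch below lets ordinary Python control flow decide which operations get recorded, then differentiates through whatever actually ran. The values and the threshold are made up for the example.

```python
import torch

# Leaf tensor that autograd should track
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# The graph is recorded while normal Python code executes, so control
# flow can depend on runtime values
y = x * 2
if y.sum() > 10:      # 2 + 4 + 6 = 12, so this branch is taken
    y = y * x         # y is now 2 * x**2

loss = y.sum()
loss.backward()       # walks the recorded operations in reverse

# d/dx sum(2 * x**2) = 4 * x
print(x.grad)
```

Because the graph is rebuilt on every forward pass, a different input could take the other branch and produce a different gradient formula without any change to the code.
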
## Capabilities

### Core Tensor Operations

Fundamental tensor creation, manipulation, and mathematical operations. Tensors are the primary data structure supporting automatic differentiation and GPU acceleration.

```python { .api }
def tensor(data, *, dtype=None, device=None, requires_grad=False, pin_memory=False) -> Tensor: ...
def zeros(*size, dtype=None, device=None, requires_grad=False) -> Tensor: ...
def ones(*size, dtype=None, device=None, requires_grad=False) -> Tensor: ...
def rand(*size, dtype=None, device=None, requires_grad=False) -> Tensor: ...
def randn(*size, dtype=None, device=None, requires_grad=False) -> Tensor: ...
# Also callable as arange(end), in which case start defaults to 0
def arange(start, end, step=1, *, dtype=None, device=None, requires_grad=False) -> Tensor: ...
def linspace(start, end, steps, *, dtype=None, device=None, requires_grad=False) -> Tensor: ...
```

[Tensor Operations](./tensor-operations.md)

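A minimal sketch of the creation functions listed above. The shapes and values are arbitrary, and `torch.manual_seed` is used only to make the random draw repeatable.

```python
import torch

zeros = torch.zeros(2, 3)                     # 2x3 tensor filled with 0.0
ones = torch.ones(2, 3, dtype=torch.int64)    # explicit integer dtype
steps = torch.arange(0, 10, 2)                # 0, 2, 4, 6, 8 (end is exclusive)
grid = torch.linspace(0.0, 1.0, steps=5)      # 5 evenly spaced points, end inclusive

# Seed the global generator so the random draw below is repeatable
torch.manual_seed(0)
noise = torch.randn(2, 3)

# Elementwise arithmetic broadcasts across compatible shapes
shifted = zeros + grid[:3]                    # (2, 3) + (3,) -> (2, 3)

print(steps.tolist(), grid.tolist(), tuple(shifted.shape))
```
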
### Neural Networks

Neural network building blocks, including layers, activation functions, loss functions, and containers for building deep learning models.

```python { .api }
class Module:
    def forward(self, *input): ...
    def parameters(self, recurse=True): ...
    def named_parameters(self, prefix='', recurse=True): ...
    def zero_grad(self, set_to_none=True): ...

class Linear(Module):
    def __init__(self, in_features: int, out_features: int, bias: bool = True): ...

class Conv2d(Module):
    def __init__(self, in_channels: int, out_channels: int, kernel_size, stride=1, padding=0): ...

class ReLU(Module):
    def __init__(self, inplace: bool = False): ...

class CrossEntropyLoss(Module):
    def __init__(self, weight=None, size_average=None, ignore_index=-100): ...
```

[Neural Networks](./neural-networks.md)

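The sketch below combines the classes listed above into one small model. `TinyClassifier`, the 8x8 single-channel input size, and the ten classes are hypothetical choices made for illustration.

```python
import torch
import torch.nn as nn

class TinyClassifier(nn.Module):  # hypothetical model, not part of torch
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.conv = nn.Conv2d(in_channels=1, out_channels=4, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.fc = nn.Linear(4 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.relu(self.conv(x))     # (N, 4, 8, 8): padding=1 keeps the 8x8 size
        return self.fc(x.flatten(1))    # (N, num_classes) raw logits

model = TinyClassifier()
images = torch.randn(5, 1, 8, 8)        # batch of 5 random "images"
labels = torch.randint(0, 10, (5,))     # random integer class labels

logits = model(images)
# CrossEntropyLoss expects unnormalized logits and integer class indices
loss = nn.CrossEntropyLoss()(logits, labels)

# Submodules assigned as attributes register their parameters automatically
for name, param in model.named_parameters():
    print(name, tuple(param.shape))
```
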
### Training and Optimization

Optimizers, learning rate schedulers, and training utilities for model optimization and parameter updates.

```python { .api }
class Optimizer:
    def step(self, closure=None): ...
    def zero_grad(self, set_to_none=True): ...

class SGD(Optimizer):
    def __init__(self, params, lr, momentum=0, dampening=0, weight_decay=0): ...

class Adam(Optimizer):
    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0): ...

class StepLR:
    def __init__(self, optimizer, step_size, gamma=0.1): ...
    def step(self, epoch=None): ...
```

[Training and Optimization](./training.md)

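One way the optimizer and scheduler above might fit together in a training loop is sketched here. The toy regression data (`y = 3x + 1`), the learning rate, and the step size are made-up values; `StepLR` is imported from `torch.optim.lr_scheduler`.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import StepLR

torch.manual_seed(0)

# Toy regression data for illustration: y = 3x + 1
inputs = torch.linspace(-1, 1, 32).unsqueeze(1)
targets = 3 * inputs + 1

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.1)
scheduler = StepLR(optimizer, step_size=50, gamma=0.1)  # lr *= 0.1 every 50 epochs

losses = []
for epoch in range(100):
    optimizer.zero_grad()                 # clear gradients from the previous step
    loss = criterion(model(inputs), targets)
    loss.backward()                       # compute new gradients
    optimizer.step()                      # update parameters
    scheduler.step()                      # once per epoch, after optimizer.step()
    losses.append(loss.item())

final_lr = scheduler.get_last_lr()[0]
print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}, lr {final_lr:.4f}")
```

Calling `scheduler.step()` after `optimizer.step()` follows the ordering the scheduler expects; reversing the two skips the first value of the schedule.
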
### Mathematical Functions

Mathematical operations including linear algebra, FFT, special functions, and statistical operations.

```python { .api }
def matmul(input: Tensor, other: Tensor) -> Tensor: ...
def dot(input: Tensor, other: Tensor) -> Tensor: ...
def sum(input: Tensor, dim=None, keepdim=False, *, dtype=None) -> Tensor: ...
def mean(input: Tensor, dim=None, keepdim=False, *, dtype=None) -> Tensor: ...
def std(input: Tensor, dim=None, *, correction=1, keepdim=False) -> Tensor: ...
# With dim given, max and min return a (values, indices) named tuple instead of a single Tensor
def max(input: Tensor, dim=None, keepdim=False) -> Tensor: ...
def min(input: Tensor, dim=None, keepdim=False) -> Tensor: ...
```

[Mathematical Functions](./mathematical-functions.md)

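A short sketch of several of the functions above on a 2x2 matrix. The input values are arbitrary, chosen so the results are easy to check by hand.

```python
import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.tensor([[0.0, 1.0], [1.0, 0.0]])    # swaps columns when right-multiplied

product = torch.matmul(a, b)                  # equivalent to a @ b
inner = torch.dot(a[0], a[1])                 # 1*3 + 2*4; dot accepts 1-D tensors only

col_sums = torch.sum(a, dim=0)                # reduce over rows, one sum per column
row_means = torch.mean(a, dim=1, keepdim=True)  # keepdim retains the reduced axis: (2, 1)

# With dim given, max returns both the values and their indices
values, indices = torch.max(a, dim=1)
overall_max = torch.max(a)                    # without dim: a single 0-dim tensor

print(product, inner, col_sums, row_means, values, indices, overall_max)
```
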
### Device and Distributed Computing

Device management, CUDA operations, distributed training, and multi-GPU support for scaling deep learning workloads.

```python { .api }
def cuda.is_available() -> bool: ...
def cuda.device_count() -> int: ...
def cuda.get_device_name(device=None) -> str: ...
def cuda.set_device(device): ...

# Available as torch.nn.parallel.DistributedDataParallel
class DistributedDataParallel(Module):
    def __init__(self, module, device_ids=None, output_device=None): ...

def distributed.init_process_group(backend, init_method=None, timeout=default_pg_timeout): ...
def distributed.all_reduce(tensor, op=ReduceOp.SUM, group=None, async_op=False): ...
```

[Device and Distributed Computing](./devices-distributed.md)

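A common pattern, sketched here under the assumption of a single process, is to pick a device once and place both the model and its inputs on it. The distributed calls are left out because they need a multi-process launcher; the layer sizes are arbitrary.

```python
import torch

# Fall back to the CPU when no CUDA device is visible
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

if device.type == "cuda":
    print(torch.cuda.device_count(), torch.cuda.get_device_name(0))

# The model and its inputs must live on the same device
model = torch.nn.Linear(4, 2).to(device)
batch = torch.randn(3, 4, device=device)
output = model(batch)

# Detach from the graph and move back to the CPU for further host-side processing
result = output.detach().cpu()
print(result.shape, result.device)
```
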
### Advanced Features

JIT compilation, model export, graph transformations, quantization, and deployment utilities for optimizing and deploying models.

```python { .api }
def jit.script(obj, optimize=None, _frames_up=0, _rcb=None): ...
def jit.trace(func, example_inputs, optimize=None, check_trace=True): ...

def export.export(mod: torch.nn.Module, args, kwargs=None, *, dynamic_shapes=None): ...

def compile(model=None, *, fullgraph=False, dynamic=None, backend="inductor"): ...

def quantization.quantize_dynamic(model, qconfig_spec=None, dtype=torch.qint8): ...
```

[Advanced Features](./advanced-features.md)

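As a hedged illustration of one of the entries above, the sketch below traces a small model with `torch.jit.trace` and checks that the traced version agrees with eager execution. The model architecture is arbitrary; `torch.compile` and `torch.export` are not shown because they depend on additional toolchain availability.

```python
import torch
import torch.nn as nn

# Small feed-forward model in eval mode (architecture chosen for illustration)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2)).eval()
example = torch.randn(1, 4)

# Tracing runs the model once on the example input and records the operations
traced = torch.jit.trace(model, example)

# The traced module can then be called like the original, here on a larger batch
batch = torch.randn(3, 4)
with torch.no_grad():
    eager_out = model(batch)
    traced_out = traced(batch)

print(torch.allclose(eager_out, traced_out, atol=1e-6))
```

Tracing records only the operations executed for the example input, so models with data-dependent control flow are usually better served by `torch.jit.script`.
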
## Core Types

```python { .api }
class Tensor:
    """Multi-dimensional array with automatic differentiation support."""
    def __init__(self, data, *, dtype=None, device=None, requires_grad=False): ...
    def backward(self, gradient=None, retain_graph=None, create_graph=False): ...
    def detach(self) -> Tensor: ...
    def numpy(self) -> numpy.ndarray: ...
    def cuda(self, device=None, non_blocking=False) -> Tensor: ...
    def cpu(self) -> Tensor: ...
    def to(self, *args, **kwargs) -> Tensor: ...
    def size(self, dim=None): ...
    @property
    def shape(self) -> torch.Size: ...
    def dim(self) -> int: ...
    def numel(self) -> int: ...
    def item(self) -> number: ...
    def clone(self) -> Tensor: ...
    def requires_grad_(self, requires_grad=True) -> Tensor: ...

class dtype:
    """Data type specification for tensors."""
    float32: dtype
    float64: dtype
    int32: dtype
    int64: dtype
    bool: dtype
    uint8: dtype

class device:
    """Device specification for tensor placement."""
    def __init__(self, device): ...

class Size(tuple):
    """Tensor shape representation."""
    def numel(self) -> int: ...

class Generator:
    """Random number generator state."""
    def manual_seed(self, seed: int) -> Generator: ...
    def get_state(self) -> Tensor: ...
    def set_state(self, new_state: Tensor) -> Generator: ...
```

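
To make the types above more concrete, here is a minimal sketch that exercises a few `Tensor` methods and a seeded `Generator`. The tensor contents and the seed value are arbitrary.

```python
import torch

t = torch.arange(6, dtype=torch.float32).reshape(2, 3)

# shape is read as an attribute; size(), dim() and numel() are method calls
print(t.shape, t.size(0), t.dim(), t.numel())

total = t.sum().item()                 # 0-dim tensor converted to a Python float

tracked = t.clone().requires_grad_()   # independent copy that autograd will track
frozen = tracked.detach()              # same data, cut off from the graph

as_int = t.to(torch.int64)             # to() can change dtype as well as device

# Re-seeding a Generator reproduces the same random draws
gen = torch.Generator().manual_seed(42)
first = torch.rand(3, generator=gen)
gen.manual_seed(42)
second = torch.rand(3, generator=gen)
print(torch.equal(first, second))
```
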