Tessl Tile for pypi/torchvision@0.23.0

# TorchVision

TorchVision is a computer vision library for PyTorch that provides datasets, model architectures, and image and video transforms. It bundles pre-trained models, data loading utilities, and image/video I/O for building computer vision applications.

## Package Information

- **Package Name**: torchvision
- **Language**: Python
- **Installation**: `pip install torchvision`
- **Version**: 0.23.0

## Core Imports

```python
import torchvision
from torchvision import datasets, models, transforms, utils, io, ops, tv_tensors
```

Common patterns:

```python
import torchvision.transforms as transforms
import torchvision.models as models
from torchvision.datasets import CIFAR10, ImageNet
```

## Basic Usage

```python
import torch
import torchvision.transforms as transforms
from torchvision import models, datasets
from torch.utils.data import DataLoader

# Load a pre-trained model (weights are downloaded on first use)
model = models.resnet50(weights='DEFAULT')
model.eval()

# Create the standard ImageNet preprocessing pipeline
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Load dataset
dataset = datasets.CIFAR10(root='./data', train=False,
                           download=True, transform=transform)
dataloader = DataLoader(dataset, batch_size=32, shuffle=False)

# Run inference on the first batch only
with torch.no_grad():
    for images, labels in dataloader:
        outputs = model(images)
        # The model was trained on ImageNet, so predictions are ImageNet class
        # indices (0-999) and are not comparable to the CIFAR-10 labels.
        predictions = torch.argmax(outputs, dim=1)
        break
```
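
The `Resize(256)` and `CenterCrop(224)` steps above change the image geometry before it reaches the model. The following is a minimal plain-Python sketch of that arithmetic, not TorchVision's implementation: it assumes an integer `size` means "scale the shorter side to this length, keeping the aspect ratio", and the helper names and the 640x480 input are hypothetical.

```python
from typing import Tuple


def resize_shorter_side(width: int, height: int, size: int) -> Tuple[int, int]:
    """Scale so the shorter side equals `size`, keeping the aspect ratio."""
    if width <= height:
        return size, int(size * height / width)
    return int(size * width / height), size


def center_crop_box(width: int, height: int, crop: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a centered crop-by-crop window."""
    left = int(round((width - crop) / 2.0))
    top = int(round((height - crop) / 2.0))
    return left, top, left + crop, top + crop


# Hypothetical 640x480 input photo (width x height).
resized = resize_shorter_side(640, 480, 256)
print(resized)                          # (341, 256)
print(center_crop_box(*resized, 224))   # (58, 16, 282, 240)
```

For the 32x32 CIFAR-10 images used above, the same arithmetic upsamples each image to 256x256 before the 224x224 center crop.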

## Architecture

TorchVision is organized into several key modules:

- **Models**: Pre-trained neural networks for classification, detection, segmentation, and video tasks
- **Datasets**: Standard computer vision datasets with automatic download and preprocessing
- **Transforms**: Image and video preprocessing and augmentation, available through the original v1 API and the newer v2 API
- **Utils**: Visualization utilities and tensor operations
- **I/O**: Reading and writing of image and video files
- **Ops**: Low-level operations for object detection, segmentation, and custom layers
- **TV Tensors**: Tensor subclasses that preserve metadata through transformations

## Capabilities

### Global Configuration

Core TorchVision configuration functions for backend management.

```python { .api }
def set_image_backend(backend: str) -> None:
    """Set the image loading backend ('PIL' or 'accimage')."""

def get_image_backend() -> str:
    """Get the current image backend."""

def set_video_backend(backend: str) -> None:
    """Set the video decoding backend ('pyav', 'video_reader', or 'cuda')."""

def get_video_backend() -> str:
    """Get the current video backend."""

def disable_beta_transforms_warning() -> None:
    """Disable beta transforms warning (legacy compatibility function)."""
```

### Datasets

Collection of computer vision datasets with automatic downloading and preprocessing. Includes image classification, object detection, segmentation, and video datasets.

```python { .api }
class VisionDataset:
    """Base class for all vision datasets."""

class ImageFolder(VisionDataset):
    """Generic dataset for images arranged in one subfolder per class."""

class CIFAR10(VisionDataset):
    """CIFAR-10 dataset."""

class ImageNet(VisionDataset):
    """ImageNet dataset."""

class CocoDetection(VisionDataset):
    """COCO dataset for object detection."""
```

[Datasets](./datasets.md)
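
`ImageFolder` derives its labels from the directory layout: each subdirectory of the root is one class. The sketch below illustrates that mapping with the standard library only, assuming class names are sorted alphabetically before indices are assigned. The helper `find_classes` here is a stand-in written for this example, and the class names are hypothetical.

```python
import os
import tempfile


def find_classes(root):
    """Map each class subdirectory of `root` to an integer index (sorted by name)."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    return classes, {name: index for index, name in enumerate(classes)}


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: root/<class_name>/<image files>
    for class_name in ("dog", "cat", "bird"):
        os.makedirs(os.path.join(root, class_name))
    classes, class_to_idx = find_classes(root)

print(classes)       # ['bird', 'cat', 'dog']
print(class_to_idx)  # {'bird': 0, 'cat': 1, 'dog': 2}
```

Because the indices depend on sort order, adding or renaming a class folder can shift the label of every class that sorts after it.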

### Models

Pre-trained neural network models for various computer vision tasks including classification, object detection, instance segmentation, semantic segmentation, and video understanding.

```python { .api }
def get_model(name: str, **config) -> torch.nn.Module:
    """Get model by name with configuration."""

def list_models() -> list[str]:
    """List all available models."""

def resnet50(weights=None, progress: bool = True, **kwargs) -> torch.nn.Module:
    """ResNet-50 model."""

def fasterrcnn_resnet50_fpn(weights=None, progress: bool = True, **kwargs) -> torch.nn.Module:
    """Faster R-CNN with ResNet-50-FPN backbone."""
```

[Models](./models.md)
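
Classification models such as `resnet50` return one row of raw scores (logits) per input image rather than probabilities. As a rough illustration of how those scores become a ranked prediction, here is a plain-Python sketch; the function names and the four-class logits are hypothetical, and a real ImageNet classifier would emit 1000 scores per image.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1 (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def top_k(logits, k):
    """Return the k (class_index, probability) pairs with the highest probability."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical logits for a 4-class problem.
logits = [1.0, 3.0, 0.5, 2.0]
best_index, best_prob = top_k(logits, 1)[0]
print(best_index)  # 1
```

Taking the index of the largest logit gives the same top-1 class as taking the largest probability, since softmax preserves ordering.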

### Transforms

Image and video preprocessing and augmentation operations. Includes both the v1 API (`torchvision.transforms`, for PIL images and tensors) and the v2 API (`torchvision.transforms.v2`, which also handles bounding boxes, masks, and videos).

```python { .api }
class Compose:
    """Composes several transforms together."""

class Resize:
    """Resize image to given size."""

class ToTensor:
    """Convert PIL Image or numpy array to tensor."""

class Normalize:
    """Normalize tensor with mean and std."""

class RandomHorizontalFlip:
    """Randomly flip image horizontally."""
```

[Transforms](./transforms.md)
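
To make the effect of `ToTensor` followed by `Normalize` concrete, the sketch below applies the same arithmetic to a single RGB pixel in plain Python. It assumes `ToTensor` scales 8-bit values to the range [0, 1] and `Normalize` computes `(value - mean) / std` per channel; the helper names are hypothetical, and the mean and std are the ImageNet statistics from the Basic Usage example.

```python
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def to_unit_range(pixel):
    """Scale 8-bit channel values from [0, 255] to [0.0, 1.0]."""
    return [value / 255.0 for value in pixel]


def normalize(pixel, mean, std):
    """Apply (value - mean) / std independently to each channel."""
    return [(value - m) / s for value, m, s in zip(pixel, mean, std)]


# One hypothetical RGB pixel: pure white.
white = normalize(to_unit_range([255, 255, 255]), IMAGENET_MEAN, IMAGENET_STD)
print([round(value, 3) for value in white])  # [2.249, 2.429, 2.64]
```

The order matters: normalizing before scaling to [0, 1] would subtract a mean of about 0.5 from values near 255 and produce inputs far outside the range the pre-trained weights expect.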

### Utils

Visualization utilities and tensor operations for working with images, bounding boxes, masks, and keypoints.

```python { .api }
def make_grid(tensor, nrow: int = 8, padding: int = 2, normalize: bool = False):
    """Make a grid of images."""

def save_image(tensor, fp, nrow: int = 8, padding: int = 2, normalize: bool = False):
    """Save tensor as image file."""

def draw_bounding_boxes(image, boxes, labels=None, colors=None, fill: bool = False, width: int = 1):
    """Draw bounding boxes on image."""
```

[Utils](./utils.md)
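
The `nrow` and `padding` arguments of `make_grid` determine the size of the resulting grid image. A small sketch of that layout arithmetic follows, assuming `nrow` is the number of images placed in each row and `padding` is added around every tile; `grid_shape` is a hypothetical helper, not part of TorchVision.

```python
import math


def grid_shape(num_images, height, width, nrow=8, padding=2):
    """Return (rows, cols, grid_height, grid_width) for a padded image grid."""
    cols = min(nrow, num_images)           # nrow is the number of images per row
    rows = math.ceil(num_images / cols)
    grid_height = rows * (height + padding) + padding
    grid_width = cols * (width + padding) + padding
    return rows, cols, grid_height, grid_width


# A batch of 32 hypothetical 224x224 images with the default nrow and padding.
print(grid_shape(32, 224, 224))  # (4, 8, 906, 1810)
```

Despite its name, `nrow` controls how many images sit side by side, so it sets the number of columns; the number of rows follows from the batch size.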

### I/O Operations

Image and video input/output operations with support for multiple formats and backends.

```python { .api }
def read_image(path: str, mode: ImageReadMode = ImageReadMode.UNCHANGED):
    """Read image file to a (C, H, W) tensor; by default the channels are left unchanged."""

def write_jpeg(input, filename: str, quality: int = 75):
    """Write tensor as JPEG file."""

def read_video(filename: str, start_pts: float = 0, end_pts=None, pts_unit: str = 'pts'):
    """Read video file."""

class VideoReader:
    """Video reader for streaming video data."""
```

[I/O Operations](./io.md)
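
The `pts_unit` argument of `read_video` selects whether `start_pts` and `end_pts` are given as raw presentation timestamps (`'pts'`) or in seconds (`'sec'`). The relation between the two is `seconds = pts * time_base`, where the time base is a property of the video stream. The sketch below shows the conversion with the standard library; the helper names and the 1/90000 time base are assumptions chosen for illustration.

```python
import math
from fractions import Fraction


def pts_to_seconds(pts, time_base):
    """Convert an integer presentation timestamp to seconds."""
    return float(pts * time_base)


def seconds_to_pts(seconds, time_base):
    """Convert seconds to the last whole presentation timestamp at or before it."""
    return math.floor(Fraction(seconds) / time_base)


# Hypothetical stream whose timestamps tick 90000 times per second.
time_base = Fraction(1, 90000)
print(pts_to_seconds(180000, time_base))  # 2.0
print(seconds_to_pts(2.5, time_base))     # 225000
```

Working in seconds is usually easier to read, while raw timestamps avoid rounding when a frame-exact cut is needed.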

### Operations

Low-level operations for object detection, segmentation, and specialized neural network layers.

```python { .api }
def nms(boxes, scores, iou_threshold: float):
    """Non-maximum suppression."""

def roi_align(input, boxes, output_size, spatial_scale: float = 1.0, sampling_ratio: int = -1, aligned: bool = False):
    """RoI Align operation."""

def box_iou(boxes1, boxes2):
    """Calculate IoU between box sets."""

class FeaturePyramidNetwork(torch.nn.Module):
    """Feature Pyramid Network."""
```

[Operations](./ops.md)
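
`box_iou` and `nms` are easier to reason about with the underlying computation in view. What follows is a plain-Python sketch of intersection over union and greedy non-maximum suppression on `(x1, y1, x2, y2)` boxes. It is an illustration of the technique under the assumption that boxes with IoU strictly greater than the threshold are suppressed, not TorchVision's vectorized implementation, and all names and sample boxes are hypothetical.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    inter = max(0.0, inter_w) * max(0.0, inter_h)
    union = box_area(box_a) + box_area(box_b) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for index in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[index], boxes[kept]) <= iou_threshold for kept in keep):
            keep.append(index)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(round(iou(boxes[0], boxes[1]), 3))  # 0.681
print(greedy_nms(boxes, scores, 0.5))     # [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, so it is dropped at a threshold of 0.5, while the third box does not overlap anything and survives.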

### TV Tensors

Enhanced tensor types that preserve metadata and semantics through transformations, supporting images, videos, bounding boxes, masks, and keypoints.

```python { .api }
class Image(torch.Tensor):
    """Image tensor type with metadata."""

class BoundingBoxes(torch.Tensor):
    """Bounding box tensor with format and canvas size."""

class Mask(torch.Tensor):
    """Segmentation mask tensor type."""

class Video(torch.Tensor):
    """Video tensor type for temporal data."""
```

[TV Tensors](./tv_tensors.md)
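
The reason `BoundingBoxes` carries a format and a canvas size is that the same four numbers mean different things in different formats, and geometric transforms need the image dimensions to move a box correctly. The plain-Python sketch below illustrates both points; the function names and the sample box are hypothetical and only mirror the XYXY, XYWH, and CXCYWH conventions.

```python
def xyxy_to_xywh(box):
    """(x1, y1, x2, y2) -> (x, y, width, height)."""
    x1, y1, x2, y2 = box
    return (x1, y1, x2 - x1, y2 - y1)


def xyxy_to_cxcywh(box):
    """(x1, y1, x2, y2) -> (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def hflip_xyxy(box, canvas_width):
    """Mirror an XYXY box across the vertical center line of the canvas."""
    x1, y1, x2, y2 = box
    return (canvas_width - x2, y1, canvas_width - x1, y2)


# Hypothetical box on a canvas that is 100 pixels wide.
box = (10, 20, 50, 80)
print(xyxy_to_xywh(box))     # (10, 20, 40, 60)
print(xyxy_to_cxcywh(box))   # (30.0, 50.0, 40, 60)
print(hflip_xyxy(box, 100))  # (50, 20, 90, 80)
```

Without the canvas width, the horizontal flip cannot be computed from the box alone, which is why a plain tensor of coordinates is not enough for transforms that operate on images and boxes together.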

## Version Information

```python { .api }
__version__: str  # TorchVision version string (0.23.0)
git_version: str  # Git commit hash
```
