# Dataset Versions

Manage dataset versions: download, export, train on, and deploy a specific version of a dataset. A version is a snapshot of a dataset with specific preprocessing and augmentation settings applied.

## Capabilities

### Version Class

Interface for managing individual dataset versions and their associated operations.

```python { .api }
class Version:
    def __init__(self, version_data, project_info, model_format, api_key, name, local=None):
        """
        Initialize version object.

        Parameters:
        - version_data: dict - Version information from API
        - project_info: dict - Parent project information
        - model_format: str - Preferred model format
        - api_key: str - Roboflow API key
        - name: str - Version name/identifier
        - local: str, optional - Local path for version data
        """

    # Properties
    model: InferenceModel  # Associated trained model for inference
```

### Dataset Download

Download dataset versions in various formats for local development and training.

```python { .api }
def download(self, model_format=None, location=None, overwrite: bool = False):
    """
    Download the dataset version in specified format.

    Parameters:
    - model_format: str, optional - Format to download ("yolov8", "yolov5", "pascal_voc", "coco", "tfrecord", "createml", "darknet", "pytorch", "tensorflow", "clip")
    - location: str, optional - Download directory path (default: current directory)
    - overwrite: bool - Whether to overwrite existing files (default: False)

    Returns:
    Dataset - Dataset object with location information
    """
```

### Dataset Export

Export dataset versions to different formats without downloading.

```python { .api }
def export(self, model_format=None):
    """
    Export dataset in specified format.

    Parameters:
    - model_format: str, optional - Export format ("yolov8", "yolov5", "pascal_voc", "coco", etc.)

    Returns:
    bool or None - Export success status
    """
```

### Model Training

Train machine learning models on specific dataset versions.

```python { .api }
def train(self, speed=None, model_type=None, checkpoint=None, plot_in_notebook=False):
    """
    Train a model on this dataset version.

    Parameters:
    - speed: str, optional - Training speed ("fast", "medium", "slow")
    - model_type: str, optional - Model architecture (e.g. "yolov8n", "yolov8s", "yolov8m", "yolov8l", "yolov8x")
    - checkpoint: str, optional - Checkpoint to resume training from
    - plot_in_notebook: bool - Whether to display training plots in notebook (default: False)

    Returns:
    InferenceModel - Trained model ready for inference
    """
```

### Model Deployment

Deploy trained models to production environments.

```python { .api }
def deploy(self, model_type: str, model_path: str, filename: str = "weights/best.pt"):
    """
    Deploy a trained model to production.

    Parameters:
    - model_type: str - Type of model being deployed
    - model_path: str - Path to model files
    - filename: str - Model filename within the path (default: "weights/best.pt")

    Returns:
    None - Deployment is asynchronous
    """
```

## Dataset Object

The `download()` method returns a `Dataset` object that records where the files were saved.

```python { .api }
class Dataset:
    location: str  # Path where dataset was downloaded
```

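A common next step is to inspect what was downloaded. The sketch below is not part of the Roboflow package: `count_files_by_split` is a hypothetical helper that takes a path such as `dataset.location` and counts files per top-level subdirectory. It assumes the download is laid out as one folder per split (for example `train`, `valid`, `test`), which may not hold for every format.

```python
from collections import Counter
from pathlib import Path
from typing import Dict


def count_files_by_split(location: str) -> Dict[str, int]:
    """Count files under each top-level subdirectory of a dataset folder.

    Hypothetical helper: assumes one subdirectory per split.
    Files stored directly in the root are grouped under ".".
    """
    root = Path(location)
    if not root.is_dir():
        raise FileNotFoundError(f"Dataset directory not found: {root}")

    counts = Counter()
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        parts = path.relative_to(root).parts
        # The first path component names the split; a bare filename has none.
        split = parts[0] if len(parts) > 1 else "."
        counts[split] += 1
    return dict(counts)
```

With a downloaded dataset this could be called as `count_files_by_split(dataset.location)`.
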
## Usage Examples

### Dataset Download

```python
import roboflow

rf = roboflow.Roboflow(api_key="your_api_key")
project = rf.workspace().project("my-project")
version = project.version(1)

# Download in YOLOv8 format
dataset = version.download("yolov8")
print(f"Dataset downloaded to: {dataset.location}")

# Download to a specific location
dataset = version.download(
    model_format="pascal_voc",
    location="/path/to/my/datasets",
    overwrite=True,
)

# Download in COCO format
dataset = version.download("coco")
```

### Model Training

```python
# Assumes `version` was obtained as in the Dataset Download example

# Train with default settings
model = version.train()

# Train with specific configuration
model = version.train(
    speed="medium",
    model_type="yolov8s",
    plot_in_notebook=True,
)

# Use trained model for inference
prediction = model.predict("/path/to/test/image.jpg")
```

### Dataset Export

```python
# Export without downloading
export_success = version.export("yolov8")

if export_success:
    print("Export completed successfully")
```

### Model Deployment

```python
# Deploy a trained model
version.deploy(
    model_type="yolov8",
    model_path="/path/to/trained/model",
    filename="best.pt",
)
```

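Because `deploy()` takes a directory and a filename within it, a missing or misnamed weights file is an easy mistake. As a sketch, the hypothetical helper `resolve_weights_file` (not part of the Roboflow package) joins the two and checks that the file exists locally before `version.deploy(...)` is called. It assumes `filename` is resolved relative to `model_path`, as the parameter description suggests.

```python
from pathlib import Path


def resolve_weights_file(model_path: str, filename: str = "weights/best.pt") -> Path:
    """Return the full path to the weights file, or raise if it is missing.

    Hypothetical pre-flight check; assumes `filename` is relative to `model_path`.
    """
    weights = Path(model_path) / filename
    if not weights.is_file():
        raise FileNotFoundError(f"Weights file not found: {weights}")
    return weights
```

A call such as `resolve_weights_file("/path/to/trained/model", "best.pt")` can then precede the `deploy()` call with the same two arguments.
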
## Supported Formats

### Download/Export Formats

A version can be downloaded or exported in the following dataset formats:

- **YOLO Formats**: `"yolov8"`, `"yolov5"`, `"yolov7"`, `"darknet"`
- **Standard Formats**: `"pascal_voc"`, `"coco"`, `"tfrecord"`
- **Framework Formats**: `"pytorch"`, `"tensorflow"`
- **Mobile Formats**: `"createml"` (for iOS)
- **Specialized**: `"clip"` (for CLIP embeddings)

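To catch typos before making a network request, the format string can be checked locally. The following is a minimal sketch: `SUPPORTED_FORMATS` and `normalize_format` are hypothetical names, and the set simply mirrors the list above, so it may not match what a given package version or the server actually accepts.

```python
# Format names copied from the list above; treat this as an assumption,
# not as the authoritative set accepted by the API.
SUPPORTED_FORMATS = frozenset({
    "yolov8", "yolov5", "yolov7", "darknet",
    "pascal_voc", "coco", "tfrecord",
    "pytorch", "tensorflow",
    "createml",
    "clip",
})


def normalize_format(model_format: str) -> str:
    """Lower-case and trim a format name, rejecting names not listed above."""
    candidate = model_format.strip().lower()
    if candidate not in SUPPORTED_FORMATS:
        known = ", ".join(sorted(SUPPORTED_FORMATS))
        raise ValueError(f"Unknown format {model_format!r}; expected one of: {known}")
    return candidate
```

The result can be passed straight through, for example `version.download(normalize_format("COCO"))`.
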
### Model Types

Training supports the following YOLO model architectures:

- **YOLOv8**: `"yolov8n"`, `"yolov8s"`, `"yolov8m"`, `"yolov8l"`, `"yolov8x"`
- **YOLOv5**: `"yolov5n"`, `"yolov5s"`, `"yolov5m"`, `"yolov5l"`, `"yolov5x"`
- **Custom**: User-provided model configurations

### Training Speeds

- **"fast"**: Quick training with fewer epochs
- **"medium"**: Balanced training time and accuracy
- **"slow"**: Comprehensive training for maximum accuracy

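The speed and model-type options above can be assembled and checked before training starts. This is a sketch under stated assumptions: `build_train_kwargs` and `is_known_model_type` are hypothetical helpers, the allowed values are taken from the lists above, and custom model types are passed through unchecked because the list permits user-provided configurations.

```python
from typing import Any, Dict, Optional

# Values copied from the lists above (assumed, not queried from the API).
TRAINING_SPEEDS = ("fast", "medium", "slow")
KNOWN_MODEL_TYPES = tuple(
    f"yolov{generation}{size}" for generation in ("8", "5") for size in "nsmlx"
)


def is_known_model_type(model_type: str) -> bool:
    """Return True if the name is one of the listed YOLOv8/YOLOv5 variants."""
    return model_type in KNOWN_MODEL_TYPES


def build_train_kwargs(
    speed: Optional[str] = None,
    model_type: Optional[str] = None,
    checkpoint: Optional[str] = None,
    plot_in_notebook: bool = False,
) -> Dict[str, Any]:
    """Validate `speed` and build keyword arguments for `train()`."""
    if speed is not None and speed not in TRAINING_SPEEDS:
        raise ValueError(f"speed must be one of {TRAINING_SPEEDS}, got {speed!r}")
    options = {"speed": speed, "model_type": model_type, "checkpoint": checkpoint}
    # Drop unset options so train() falls back to its own defaults.
    kwargs = {key: value for key, value in options.items() if value is not None}
    kwargs["plot_in_notebook"] = plot_in_notebook
    return kwargs
```

The dictionary can then be unpacked into the call, for example `version.train(**build_train_kwargs(speed="fast", model_type="yolov8n"))`.
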
## Version Information

Dataset versions contain metadata about:

- **Image Count**: Total images in train/valid/test splits
- **Class Information**: Number and names of labeled classes
- **Preprocessing**: Applied image preprocessing steps
- **Augmentation**: Data augmentation techniques used
- **Creation Date**: When the version was generated
- **Parent Version**: Source version if derived from another

## Error Handling

Version operations can raise exceptions, for example when given an unsupported format or model type:

```python
try:
    dataset = version.download("invalid_format")
except ValueError as e:
    print(f"Invalid format: {e}")

try:
    model = version.train(model_type="invalid_model")
except RuntimeError as e:
    print(f"Training failed: {e}")
```

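The same pattern extends to trying several formats in order of preference. The helper below, `download_first_available`, is a hypothetical sketch rather than part of the Roboflow package. It assumes, as in the example above, that an unsupported format raises `ValueError`, and it works with any object that exposes a compatible `download()` method.

```python
from typing import Any, Dict, Iterable, Tuple


def download_first_available(
    version: Any, formats: Iterable[str], **download_kwargs: Any
) -> Tuple[str, Any]:
    """Try each format in order and return (format, dataset) for the first success.

    Assumes `version.download(fmt, ...)` raises ValueError for a rejected format.
    """
    errors: Dict[str, str] = {}
    for fmt in formats:
        try:
            return fmt, version.download(fmt, **download_kwargs)
        except ValueError as exc:
            # Remember why this format failed and move on to the next one.
            errors[fmt] = str(exc)
    raise ValueError(f"No requested format could be downloaded: {errors}")
```

For example, `download_first_available(version, ["yolov8", "coco"], location="/path/to/my/datasets")` would fall back to COCO only if the YOLOv8 request is rejected.
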
## Integration with Inference

Once training is complete, models are automatically available for inference:

```python
# Train model
model = version.train()

# Immediate inference
prediction = model.predict("image.jpg")

# Or access model later via version.model property
model = version.model
prediction = model.predict("another_image.jpg")
```
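To avoid retraining when a model already exists, the two access paths above can be combined. This is a sketch only: `get_or_train_model` is a hypothetical helper, and it assumes that `version.model` is `None` when the version has no trained model, which should be verified against the installed package.

```python
from typing import Any


def get_or_train_model(version: Any, **train_kwargs: Any) -> Any:
    """Return the version's existing model, training one only if none is attached.

    Assumption: `version.model` is None (or absent) when no model has been trained.
    """
    model = getattr(version, "model", None)
    if model is None:
        # No model attached yet, so start a training run with the given options.
        model = version.train(**train_kwargs)
    return model
```

Usage would look like `model = get_or_train_model(version, speed="fast")` followed by `model.predict("image.jpg")`.
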