# Model Inference

Comprehensive inference capabilities across different computer vision tasks, supporting both image and video inputs with various specialized model types. The inference system provides both hosted and self-hosted model execution.
## Capabilities

### Base Inference Model

The foundation class for all model inference operations.

```python { .api }
class InferenceModel:
    def __init__(self, api_key, version_id, colors=None, *args, **kwargs):
        """
        Create an InferenceModel for running predictions.

        Parameters:
        - api_key: str - Roboflow API key
        - version_id: str - Model version identifier in format "workspace/project/version"
        - colors: dict, optional - Custom color mapping for visualizations
        """

    # Properties
    id: str          # Model version identifier
    dataset_id: str  # Associated dataset identifier
    version: str     # Version number
    colors: dict     # Color mapping for classes
```
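The `version_id` string encodes three components, which is occasionally useful to split apart when constructing identifiers programmatically. A minimal sketch, assuming the documented `"workspace/project/version"` format; the `parse_version_id` helper is illustrative, not part of the SDK:

```python
def parse_version_id(version_id):
    """Split a "workspace/project/version" identifier into its parts."""
    workspace, project, version = version_id.split("/")
    return {"workspace": workspace, "project": project, "version": version}

print(parse_version_id("my-workspace/my-project/1"))
# {'workspace': 'my-workspace', 'project': 'my-project', 'version': '1'}
```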
### Image Inference

Run inference on individual images with various prediction types.

```python { .api }
def predict(self, image_path, prediction_type=None, **kwargs):
    """
    Run inference on a single image.

    Parameters:
    - image_path: str or numpy.ndarray - Path to image file or numpy array
    - prediction_type: str, optional - Override the default prediction type
    - **kwargs: Additional parameters for specific model types:
        - confidence: float - Minimum confidence threshold (0.0-1.0)
        - overlap_threshold: float - NMS IoU threshold for object detection
        - stroke_width: int - Visualization stroke width
        - labels: bool - Whether to include labels in output
        - format: str - Output format ("json", "image")

    Returns:
    dict - Prediction results with bounding boxes, classes, and confidences
    """
```
### Video Inference

Process video files with frame-by-frame inference.

```python { .api }
def predict_video(self, video_path, fps=1, prediction_type=None, **kwargs):
    """
    Run inference on a video file.

    Parameters:
    - video_path: str - Path to video file
    - fps: int - Frames per second to process (default: 1)
    - prediction_type: str, optional - Override the default prediction type
    - **kwargs: Additional inference parameters

    Returns:
    dict - Job information for polling results
    """

def poll_for_video_results(self, job_id: Optional[str] = None):
    """
    Check the status of a video inference job.

    Parameters:
    - job_id: str, optional - Job ID to check (uses the last job if not provided)

    Returns:
    dict - Job status, plus results if complete
    """

def poll_until_video_results(self, job_id):
    """
    Wait for a video inference job to complete.

    Parameters:
    - job_id: str - Job ID to wait for

    Returns:
    dict - Final results when the job completes
    """
```
### Model Download

Download trained model weights for local inference.

```python { .api }
def download(self, format="pt", location="."):
    """
    Download model weights for local inference.

    Parameters:
    - format: str - Model format ("pt", "onnx", "tflite", "coreml")
    - location: str - Download directory (default: current directory)

    Returns:
    str - Path to the downloaded model file
    """
```
## Specialized Model Classes

### Object Detection

Specialized model for detecting and localizing objects in images.

```python { .api }
class ObjectDetectionModel(InferenceModel):
    """Object detection model with bounding box predictions."""
```

**Prediction Output Format:**
```python
{
    "predictions": [
        {
            "x": 320.0,          # Center X coordinate
            "y": 240.0,          # Center Y coordinate
            "width": 100.0,      # Bounding box width
            "height": 80.0,      # Bounding box height
            "confidence": 0.85,  # Prediction confidence
            "class": "person",   # Predicted class name
            "class_id": 0        # Class index
        }
    ],
    "image": {
        "width": 640,
        "height": 480
    }
}
```
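Boxes are returned in center-point format, so converting to corner coordinates (the convention expected by most drawing and cropping code) is a common first step. A minimal sketch operating on the documented output format above; the `to_corners` helper is illustrative, not part of the SDK:

```python
def to_corners(pred):
    """Convert a center-format prediction (x, y, width, height)
    to (x_min, y_min, x_max, y_max) corner coordinates."""
    half_w, half_h = pred["width"] / 2, pred["height"] / 2
    return (pred["x"] - half_w, pred["y"] - half_h,
            pred["x"] + half_w, pred["y"] + half_h)

# Example prediction in the documented output format
pred = {"x": 320.0, "y": 240.0, "width": 100.0, "height": 80.0,
        "confidence": 0.85, "class": "person", "class_id": 0}

print(to_corners(pred))  # (270.0, 200.0, 370.0, 280.0)
```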
### Classification

Model for image classification tasks.

```python { .api }
class ClassificationModel(InferenceModel):
    """Image classification model with class predictions."""
```

**Prediction Output Format:**
```python
{
    "predictions": [
        {
            "class": "cat",
            "confidence": 0.92
        },
        {
            "class": "dog",
            "confidence": 0.08
        }
    ],
    "top": "cat"  # Top prediction class
}
```
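The `top` field mirrors the highest-confidence entry in `predictions`; when more than the single top class is needed (for example, the top-k labels above a confidence floor), the list can be processed directly. A minimal sketch over the documented output format; `top_k` and its threshold are illustrative, not part of the SDK:

```python
result = {
    "predictions": [
        {"class": "cat", "confidence": 0.92},
        {"class": "dog", "confidence": 0.08},
    ],
    "top": "cat",
}

def top_k(result, k=3, min_confidence=0.05):
    """Return up to k (class, confidence) pairs above a confidence floor,
    sorted by descending confidence."""
    ranked = sorted(result["predictions"],
                    key=lambda p: p["confidence"], reverse=True)
    return [(p["class"], p["confidence"])
            for p in ranked if p["confidence"] >= min_confidence][:k]

print(top_k(result))  # [('cat', 0.92), ('dog', 0.08)]
```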
### Instance Segmentation

Model for pixel-level object segmentation.

```python { .api }
class InstanceSegmentationModel(InferenceModel):
    """Instance segmentation model with masks and bounding boxes."""
```

**Prediction Output Format:**
```python
{
    "predictions": [
        {
            "x": 320.0,
            "y": 240.0,
            "width": 100.0,
            "height": 80.0,
            "confidence": 0.85,
            "class": "person",
            "class_id": 0,
            "points": [  # Polygon points for the mask
                {"x": 275, "y": 200},
                {"x": 365, "y": 200},
                {"x": 365, "y": 280},
                {"x": 275, "y": 280}
            ]
        }
    ]
}
```
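The `points` list describes the mask as a polygon, so standard polygon math applies directly; for example, the shoelace formula gives the mask area in square pixels. A minimal sketch over the documented format; `polygon_area` is illustrative, not part of the SDK:

```python
def polygon_area(points):
    """Compute polygon area in square pixels via the shoelace formula."""
    n = len(points)
    area = 0.0
    for i in range(n):
        p, q = points[i], points[(i + 1) % n]
        area += p["x"] * q["y"] - q["x"] * p["y"]
    return abs(area) / 2

# The example polygon from the output format above (a 90x80 rectangle)
points = [{"x": 275, "y": 200}, {"x": 365, "y": 200},
          {"x": 365, "y": 280}, {"x": 275, "y": 280}]

print(polygon_area(points))  # 7200.0
```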
### Semantic Segmentation

Model for pixel-level semantic classification.

```python { .api }
class SemanticSegmentationModel(InferenceModel):
    """Semantic segmentation model with pixel-wise classifications."""
```
### Keypoint Detection

Model for detecting keypoints and pose estimation.

```python { .api }
class KeypointDetectionModel(InferenceModel):
    """Keypoint detection model for pose estimation."""
```

**Prediction Output Format:**
```python
{
    "predictions": [
        {
            "x": 320.0,
            "y": 240.0,
            "width": 100.0,
            "height": 120.0,
            "confidence": 0.90,
            "class": "person",
            "keypoints": [
                {"x": 315, "y": 210, "confidence": 0.95, "name": "nose"},
                {"x": 310, "y": 215, "confidence": 0.88, "name": "left_eye"},
                {"x": 320, "y": 215, "confidence": 0.91, "name": "right_eye"}
                # ... additional keypoints
            ]
        }
    ]
}
```
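Each keypoint carries its own confidence score, so downstream code typically drops low-confidence points before using them. A minimal sketch over the documented format; the `reliable_keypoints` helper and the threshold value are illustrative, not part of the SDK:

```python
keypoints = [
    {"x": 315, "y": 210, "confidence": 0.95, "name": "nose"},
    {"x": 310, "y": 215, "confidence": 0.88, "name": "left_eye"},
    {"x": 320, "y": 215, "confidence": 0.91, "name": "right_eye"},
]

def reliable_keypoints(keypoints, min_confidence=0.9):
    """Index keypoints by name, keeping only those above a confidence floor."""
    return {kp["name"]: (kp["x"], kp["y"])
            for kp in keypoints if kp["confidence"] >= min_confidence}

print(reliable_keypoints(keypoints))
# {'nose': (315, 210), 'right_eye': (320, 215)}
```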
### CLIP Model

CLIP-based model for image embedding and comparison.

```python { .api }
class CLIPModel(InferenceModel):
    """CLIP model for image embeddings and similarity."""
```
### Gaze Detection

Model for detecting gaze direction and attention.

```python { .api }
class GazeModel(InferenceModel):
    """Gaze detection model for eye tracking applications."""
```
### Video Inference Model

Specialized model class for video-based inference tasks.

```python { .api }
class VideoInferenceModel(InferenceModel):
    """Video inference model for temporal analysis."""
```
## Usage Examples

### Basic Image Inference

```python
import roboflow

# Load the model from a trained version
rf = roboflow.Roboflow(api_key="your_api_key")
project = rf.workspace().project("my-project")
version = project.version(1)
model = version.model

# Run inference
prediction = model.predict("path/to/image.jpg")
print(f"Found {len(prediction['predictions'])} objects")

# With custom parameters
prediction = model.predict(
    "image.jpg",
    confidence=0.7,
    overlap_threshold=0.3
)
```
### Specialized Model Usage

```python
# Object detection with a specific model class
from roboflow.models.object_detection import ObjectDetectionModel

model = ObjectDetectionModel(
    api_key="your_api_key",
    version_id="workspace/project/1"
)

prediction = model.predict("image.jpg", confidence=0.5)
for obj in prediction['predictions']:
    print(f"Found {obj['class']} with {obj['confidence']:.2f} confidence")

# Classification model
from roboflow.models.classification import ClassificationModel

classifier = ClassificationModel(
    api_key="your_api_key",
    version_id="workspace/classifier-project/1"
)

result = classifier.predict("image.jpg")
print(f"Classified as: {result['top']}")
```
### Video Processing

```python
import time

# Start video inference
job = model.predict_video("video.mp4", fps=2)
job_id = job['id']

# Poll for results
while True:
    status = model.poll_for_video_results(job_id)
    if status['status'] == 'complete':
        print("Video processing complete!")
        results = status['results']
        break
    elif status['status'] == 'failed':
        print("Video processing failed")
        break
    time.sleep(10)  # Wait 10 seconds before checking again

# Or simply block until the job completes
results = model.poll_until_video_results(job_id)
```
### Model Download and Local Inference

```python
# Download model weights
model_path = model.download(format="pt", location="./models")
print(f"Model downloaded to: {model_path}")

# Download in different formats
onnx_path = model.download(format="onnx", location="./models")
tflite_path = model.download(format="tflite", location="./models")
```
### Batch Processing

```python
import os

# Process multiple images
image_dir = "/path/to/images"
results = []

for filename in os.listdir(image_dir):
    if filename.lower().endswith(('.png', '.jpg', '.jpeg')):
        image_path = os.path.join(image_dir, filename)
        prediction = model.predict(image_path)
        results.append({
            'filename': filename,
            'prediction': prediction
        })

print(f"Processed {len(results)} images")
```
## Error Handling

Inference operations can raise various exceptions:

```python
try:
    prediction = model.predict("nonexistent.jpg")
except FileNotFoundError:
    print("Image file not found")
except Exception as e:
    print(f"Inference failed: {e}")

# Handle video processing errors
try:
    job = model.predict_video("large_video.mp4")
    results = model.poll_until_video_results(job['id'])
except RuntimeError as e:
    print(f"Video processing failed: {e}")
```
## Performance Considerations

- **Batch Processing**: Process multiple images in batches for efficiency
- **Video FPS**: Lower FPS values reduce processing time and cost
- **Confidence Thresholds**: Higher thresholds reduce false positives but may miss objects
- **Image Size**: Larger images provide more detail but increase processing time
- **Local Models**: Downloaded models enable offline inference but require local compute resources