A set of easy-to-use utils that will come in handy in any Computer Vision project
npx @tessl/cli install tessl/pypi-supervision@0.26.00
# Supervision

A comprehensive computer vision toolkit that provides model-agnostic utilities for object detection, classification, and segmentation. Supervision offers connectors for popular frameworks like Ultralytics YOLO, Transformers, and MMDetection, along with highly customizable annotation tools and dataset utilities.

## Package Information

- **Package Name**: supervision
- **Language**: Python
- **Installation**: `pip install supervision`

## Core Imports

```python
import supervision as sv
```

## Basic Usage

```python
import supervision as sv
import cv2
from ultralytics import YOLO

# Load image and model
image = cv2.imread("path/to/image.jpg")
model = YOLO("yolov8n.pt")

# Run inference
results = model(image)[0]
detections = sv.Detections.from_ultralytics(results)

# Annotate detections
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()

# Add boxes and labels
annotated_frame = box_annotator.annotate(
    scene=image, detections=detections
)
annotated_frame = label_annotator.annotate(
    scene=annotated_frame, detections=detections
)

# Display result
cv2.imshow("Annotated Frame", annotated_frame)
cv2.waitKey(0)
```

## Architecture

Supervision is built around several core concepts:

- **Detections**: Central data structure that standardizes results from various models
- **Annotators**: Visualization tools for adding boxes, labels, masks, and effects
- **Zones**: Spatial analysis tools (`LineZone`, `PolygonZone`) for counting and tracking
- **Trackers**: Multi-object tracking algorithms (`ByteTrack`)
- **Utilities**: Image/video processing, format conversion, and data manipulation tools

Because every model connector produces the same `Detections` structure, annotation, tracking, and dataset code stays the same regardless of which framework produced the results.

## Capabilities

### Core Data Structures

The fundamental data classes for representing detection, classification, and keypoint results in a standardized format.

```python { .api }
@dataclass
class Detections:
    xyxy: np.ndarray
    mask: np.ndarray | None = None
    confidence: np.ndarray | None = None
    class_id: np.ndarray | None = None
    tracker_id: np.ndarray | None = None
    data: dict[str, np.ndarray | list] = field(default_factory=dict)
    metadata: dict[str, Any] = field(default_factory=dict)

@dataclass
class Classifications:
    class_id: np.ndarray
    confidence: np.ndarray | None = None

class KeyPoints: ...
```

[Core Data Structures](./core-data-structures.md)

### Annotators

Comprehensive visualization tools for adding boxes, labels, masks, shapes, and visual effects to images. Includes basic shapes, text labels, tracking trails, and advanced effects.

```python { .api }
class BoxAnnotator: ...
class LabelAnnotator: ...
class MaskAnnotator: ...
class CircleAnnotator: ...
class DotAnnotator: ...
class TraceAnnotator: ...
```

[Annotators](./annotators.md)

### Detection Tools

Spatial analysis tools for counting, tracking, and analyzing objects within defined zones and regions.

```python { .api }
class LineZone: ...
class PolygonZone: ...
class InferenceSlicer: ...
class DetectionsSmoother: ...
```

[Detection Tools](./detection-tools.md)

### Coordinate and Data Conversion

Utilities for converting between different coordinate formats, processing masks and polygons, and transforming data between various computer vision frameworks.

```python { .api }
def xyxy_to_xywh(xyxy: np.ndarray) -> np.ndarray: ...
def mask_to_polygons(mask: np.ndarray) -> list[np.ndarray]: ...
def polygon_to_mask(polygon: np.ndarray, resolution_wh: tuple[int, int]) -> np.ndarray: ...
```

[Coordinate Conversion](./coordinate-conversion.md)

### IOU and NMS Operations

Intersection over Union calculations and Non-Maximum Suppression algorithms for boxes, masks, and oriented boxes.

```python { .api }
def box_iou(boxes_true: np.ndarray, boxes_detection: np.ndarray) -> np.ndarray: ...
def box_non_max_suppression(predictions: np.ndarray, iou_threshold: float = 0.5) -> np.ndarray: ...
```

[IOU and NMS](./iou-nms.md)

### Dataset Management

Tools for loading, processing, and converting datasets between popular formats like COCO, YOLO, and Pascal VOC.

```python { .api }
class BaseDataset: ...
class DetectionDataset: ...
class ClassificationDataset: ...
```

[Dataset Management](./dataset-management.md)

### Drawing and Colors

Low-level drawing utilities and color management for creating custom visualizations and annotations.

```python { .api }
class Color: ...
class ColorPalette: ...
def draw_polygon(image: np.ndarray, polygon: np.ndarray, color: Color) -> np.ndarray: ...
```

[Drawing and Colors](./drawing-colors.md)

### Video and Image Processing

Utilities for processing video streams, handling image operations, and managing video input/output.

```python { .api }
class VideoInfo: ...
class VideoSink: ...
def process_video(source_path: str, target_path: str, callback: callable) -> None: ...
```

[Video Processing](./video-processing.md)

### Tracking

Multi-object tracking algorithms for maintaining object identities across video frames.

```python { .api }
class ByteTrack: ...
```

[Tracking](./tracking.md)

### Metrics and Evaluation

Tools for evaluating model performance including confusion matrices and mean average precision calculations.

```python { .api }
class ConfusionMatrix: ...
class MeanAveragePrecision: ...
```

[Metrics](./metrics.md)

### Vision-Language Model Integration

Support for integrating various vision-language models for zero-shot object detection and image analysis tasks.

```python { .api }
class VLM(Enum):
    PALIGEMMA = "paligemma"
    FLORENCE_2 = "florence_2"
    QWEN_2_5_VL = "qwen_2_5_vl"
    GOOGLE_GEMINI_2_0 = "gemini_2_0"
    GOOGLE_GEMINI_2_5 = "gemini_2_5"
    MOONDREAM = "moondream"

class LMM(Enum):  # Deprecated, use VLM
    PALIGEMMA = "paligemma"
    FLORENCE_2 = "florence_2"
    QWEN_2_5_VL = "qwen_2_5_vl"
    GOOGLE_GEMINI_2_0 = "gemini_2_0"
    GOOGLE_GEMINI_2_5 = "gemini_2_5"
    MOONDREAM = "moondream"
```

[VLM Support](./vlm-support.md)

### File Utilities

Utilities for working with files, directories, and different file formats including JSON, YAML, and text files.

```python { .api }
def list_files_with_extensions(directory: str | Path, extensions: list[str] | None = None) -> list[Path]: ...
def read_json_file(file_path: str | Path) -> dict: ...
def save_json_file(data: dict, file_path: str | Path, indent: int = 3) -> None: ...
def read_yaml_file(file_path: str | Path) -> dict: ...
def save_yaml_file(data: dict, file_path: str | Path) -> None: ...
```

[File Utilities](./file-utilities.md)

### Keypoint Annotators

Specialized annotators for drawing keypoints, skeletal connections, and pose estimation results.

```python { .api }
class VertexAnnotator: ...
class EdgeAnnotator: ...
class VertexLabelAnnotator: ...
```

[Keypoint Annotators](./keypoint-annotators.md)

### Conversion Utilities

Utilities for converting between different image formats and computer vision libraries.

```python { .api }
def cv2_to_pillow(opencv_image: np.ndarray) -> Image.Image: ...
def pillow_to_cv2(pillow_image: Image.Image) -> np.ndarray: ...
```

### Notebook Utilities

Utilities for displaying images and visualizations in Jupyter notebooks.

```python { .api }
def plot_image(image: np.ndarray, title: str | None = None) -> None: ...
def plot_images_grid(images: list[np.ndarray], grid_size: tuple[int, int] | None = None) -> None: ...
```

## Types

```python { .api }
# Geometry types
@dataclass
class Point:
    x: float
    y: float

@dataclass
class Rect:
    x: float
    y: float
    width: float
    height: float

class Position(Enum):
    CENTER = "CENTER"
    TOP_LEFT = "TOP_LEFT"
    TOP_RIGHT = "TOP_RIGHT"
    BOTTOM_LEFT = "BOTTOM_LEFT"
    BOTTOM_RIGHT = "BOTTOM_RIGHT"
    # Additional position values...

# Enums
class ColorLookup(Enum):
    CLASS = "CLASS"
    TRACK = "TRACK"
    INDEX = "INDEX"

class OverlapMetric(Enum):
    IOU = "iou"
    IOS = "ios"
```
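
The two `OverlapMetric` values can give very different scores for the same pair of boxes. The sketch below assumes `IOS` stands for intersection over the smaller box's area, which is an interpretation rather than something stated above. `OverlapSketch` and `overlap` are hypothetical names for illustration.

```python
from enum import Enum


class OverlapSketch(Enum):
    """Local copy of the two metric names, for illustration only."""

    IOU = "iou"
    IOS = "ios"


def overlap(a, b, metric):
    """Overlap of two (x_min, y_min, x_max, y_max) boxes under a given metric."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    if metric is OverlapSketch.IOU:
        denominator = area_a + area_b - intersection
    else:
        # Assumed meaning of IOS: normalise by the smaller of the two areas.
        denominator = min(area_a, area_b)
    return intersection / denominator if denominator > 0 else 0.0


large = (0, 0, 10, 10)
small = (2, 2, 6, 6)  # fully contained in `large`

iou_score = overlap(large, small, OverlapSketch.IOU)
ios_score = overlap(large, small, OverlapSketch.IOS)
```

Under this reading, a small box fully contained in a large one scores low on IoU (0.16 here) but 1.0 on IOS, which makes IOS the stricter choice for suppressing nested duplicates.
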