
tessl/pypi-supervision

A set of easy-to-use utils that will come in handy in any Computer Vision project

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/supervision@0.26.x

To install, run

npx @tessl/cli install tessl/pypi-supervision@0.26.0

# Supervision

Supervision is a computer vision toolkit that provides model-agnostic utilities for object detection, classification, and segmentation. It offers connectors for popular frameworks such as Ultralytics YOLO, Transformers, and MMDetection, along with customizable annotation tools and dataset utilities.

## Package Information

- **Package Name**: supervision
- **Language**: Python
- **Installation**: `pip install supervision`

## Core Imports

```python
import supervision as sv
```

## Basic Usage

```python
import supervision as sv
import cv2
from ultralytics import YOLO

# Load image and model
image = cv2.imread("path/to/image.jpg")
model = YOLO("yolov8n.pt")

# Run inference
results = model(image)[0]
detections = sv.Detections.from_ultralytics(results)

# Annotate detections
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()

# Add boxes and labels (annotators draw on the scene they receive,
# so pass a copy to keep the original image untouched)
annotated_frame = box_annotator.annotate(
    scene=image.copy(), detections=detections
)
annotated_frame = label_annotator.annotate(
    scene=annotated_frame, detections=detections
)

# Display result and close the window after a key press
cv2.imshow("Annotated Frame", annotated_frame)
cv2.waitKey(0)
cv2.destroyAllWindows()
```

## Architecture

Supervision is built around several core concepts:

- **Detections**: Central data structure that standardizes results from various models
- **Annotators**: Visualization tools for adding boxes, labels, masks, and effects
- **Zones**: Spatial analysis tools (LineZone, PolygonZone) for counting and tracking
- **Trackers**: Multi-object tracking algorithms (ByteTrack)
- **Utilities**: Image/video processing, format conversion, and data manipulation tools

This design enables seamless integration across different computer vision frameworks while providing consistent APIs for common tasks like annotation, tracking, and dataset management.

## Capabilities

### Core Data Structures

The fundamental data classes for representing detection, classification, and keypoint results in a standardized format.

```python { .api }
@dataclass
class Detections:
    xyxy: np.ndarray
    mask: np.ndarray | None = None
    confidence: np.ndarray | None = None
    class_id: np.ndarray | None = None
    tracker_id: np.ndarray | None = None
    data: dict[str, np.ndarray | list] = field(default_factory=dict)
    metadata: dict[str, Any] = field(default_factory=dict)

@dataclass
class Classifications:
    class_id: np.ndarray
    confidence: np.ndarray | None = None

class KeyPoints: ...
```

[Core Data Structures](./core-data-structures.md)

### Annotators

Comprehensive visualization tools for adding boxes, labels, masks, shapes, and visual effects to images. Includes basic shapes, text labels, tracking trails, and advanced effects.

```python { .api }
class BoxAnnotator: ...
class LabelAnnotator: ...
class MaskAnnotator: ...
class CircleAnnotator: ...
class DotAnnotator: ...
class TraceAnnotator: ...
```

[Annotators](./annotators.md)

### Detection Tools

Spatial analysis tools for counting, tracking, and analyzing objects within defined zones and regions.

```python { .api }
class LineZone: ...
class PolygonZone: ...
class InferenceSlicer: ...
class DetectionsSmoother: ...
```

[Detection Tools](./detection-tools.md)
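
The idea behind zone-based counting can be sketched without the library. The plain-Python example below is an illustration only, not the `PolygonZone` implementation: it assumes each detection is reduced to a single anchor point (here the bottom-center of its `xyxy` box) and counts the anchors that fall inside a polygon using a ray-casting test. The helper names `point_in_polygon` and `count_in_zone` are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if `point` lies inside `polygon`.

    `polygon` is a list of (x, y) vertices. Points exactly on the
    boundary are not handled specially in this sketch.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_in_zone(boxes_xyxy, polygon):
    """Count boxes whose bottom-center anchor lies inside the polygon."""
    count = 0
    for x_min, y_min, x_max, y_max in boxes_xyxy:
        anchor = ((x_min + x_max) / 2, y_max)
        if point_in_polygon(anchor, polygon):
            count += 1
    return count


zone = [(0, 0), (100, 0), (100, 100), (0, 100)]
boxes = [
    (10, 10, 30, 50),    # anchor (20, 50): inside the zone
    (90, 20, 130, 60),   # anchor (110, 60): outside the zone
    (40, 60, 60, 90),    # anchor (50, 90): inside the zone
]
zone_count = count_in_zone(boxes, zone)
print(zone_count)
```

Choosing the bottom-center anchor is a common convention for objects standing on the ground, but it is an assumption of this sketch; a different anchor changes which detections count as inside.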

### Coordinate and Data Conversion

Utilities for converting between different coordinate formats, processing masks and polygons, and transforming data between various computer vision frameworks.

```python { .api }
def xyxy_to_xywh(xyxy: np.ndarray) -> np.ndarray: ...
def mask_to_polygons(mask: np.ndarray) -> list[np.ndarray]: ...
def polygon_to_mask(polygon: np.ndarray, resolution_wh: tuple[int, int]) -> np.ndarray: ...
```

[Coordinate Conversion](./coordinate-conversion.md)
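
The corner-to-size conversion is simple arithmetic. As a plain-Python sketch for a single box (the library function takes NumPy arrays; the helper names below are hypothetical stand-ins, not library code), an `xyxy` box stores two opposite corners, while `xywh` stores the top-left corner plus width and height:

```python
def xyxy_to_xywh_box(box):
    """(x_min, y_min, x_max, y_max) -> (x, y, width, height)."""
    x_min, y_min, x_max, y_max = box
    return (x_min, y_min, x_max - x_min, y_max - y_min)


def xywh_to_xyxy_box(box):
    """(x, y, width, height) -> (x_min, y_min, x_max, y_max)."""
    x, y, width, height = box
    return (x, y, x + width, y + height)


box_xyxy = (10, 20, 50, 80)
box_xywh = xyxy_to_xywh_box(box_xyxy)    # (10, 20, 40, 60)
round_trip = xywh_to_xyxy_box(box_xywh)  # back to (10, 20, 50, 80)
print(box_xywh, round_trip)
```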

### IOU and NMS Operations

Intersection over Union calculations and Non-Maximum Suppression algorithms for boxes, masks, and oriented boxes.

```python { .api }
def box_iou(boxes_true: np.ndarray, boxes_detection: np.ndarray) -> np.ndarray: ...
def box_non_max_suppression(predictions: np.ndarray, iou_threshold: float = 0.5) -> np.ndarray: ...
```

[IOU and NMS](./iou-nms.md)
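
To make the two operations concrete, here is a minimal plain-Python sketch of box IoU and greedy, class-agnostic NMS. It is not the library's vectorized implementation: the helper names `iou` and `greedy_nms` are hypothetical, and the sketch assumes each prediction is a tuple `(x_min, y_min, x_max, y_max, score)` and that the result is a list of keep flags, one per prediction.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(predictions, iou_threshold=0.5):
    """Keep the highest-scoring box, drop boxes overlapping it, repeat."""
    order = sorted(
        range(len(predictions)),
        key=lambda i: predictions[i][4],
        reverse=True,
    )
    keep = [True] * len(predictions)
    for pos, i in enumerate(order):
        if not keep[i]:
            continue  # already suppressed by a higher-scoring box
        for j in order[pos + 1:]:
            if keep[j] and iou(predictions[i][:4], predictions[j][:4]) > iou_threshold:
                keep[j] = False
    return keep


predictions = [
    (0, 0, 10, 10, 0.9),
    (1, 1, 11, 11, 0.8),    # IoU with the first box is 81/119, about 0.68
    (20, 20, 30, 30, 0.7),  # no overlap with the others
]
keep = greedy_nms(predictions)
print(keep)
```

The second box is suppressed because its overlap with the higher-scoring first box exceeds the 0.5 threshold; the third box does not overlap either and is kept.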

### Dataset Management

Tools for loading, processing, and converting datasets between popular formats like COCO, YOLO, and Pascal VOC.

```python { .api }
class BaseDataset: ...
class DetectionDataset: ...
class ClassificationDataset: ...
```

[Dataset Management](./dataset-management.md)

### Drawing and Colors

Low-level drawing utilities and color management for creating custom visualizations and annotations.

```python { .api }
class Color: ...
class ColorPalette: ...
def draw_polygon(image: np.ndarray, polygon: np.ndarray, color: Color) -> np.ndarray: ...
```

[Drawing and Colors](./drawing-colors.md)

### Video and Image Processing

Utilities for processing video streams, handling image operations, and managing video input/output.

```python { .api }
class VideoInfo: ...
class VideoSink: ...
def process_video(source_path: str, target_path: str, callback: callable) -> None: ...
```

[Video Processing](./video-processing.md)

### Tracking

Multi-object tracking algorithms for maintaining object identities across video frames.

```python { .api }
class ByteTrack: ...
```

[Tracking](./tracking.md)

### Metrics and Evaluation

Tools for evaluating model performance including confusion matrices and mean average precision calculations.

```python { .api }
class ConfusionMatrix: ...
class MeanAveragePrecision: ...
```

[Metrics](./metrics.md)

### Vision-Language Model Integration

Support for integrating various vision-language models for zero-shot object detection and image analysis tasks.

```python { .api }
class VLM(Enum):
    PALIGEMMA = "paligemma"
    FLORENCE_2 = "florence_2"
    QWEN_2_5_VL = "qwen_2_5_vl"
    GOOGLE_GEMINI_2_0 = "gemini_2_0"
    GOOGLE_GEMINI_2_5 = "gemini_2_5"
    MOONDREAM = "moondream"

class LMM(Enum):  # Deprecated, use VLM
    PALIGEMMA = "paligemma"
    FLORENCE_2 = "florence_2"
    QWEN_2_5_VL = "qwen_2_5_vl"
    GOOGLE_GEMINI_2_0 = "gemini_2_0"
    GOOGLE_GEMINI_2_5 = "gemini_2_5"
    MOONDREAM = "moondream"
```

[VLM Support](./vlm-support.md)

### File Utilities

Utilities for working with files, directories, and different file formats including JSON, YAML, and text files.

```python { .api }
def list_files_with_extensions(directory: str | Path, extensions: list[str] | None = None) -> list[Path]: ...
def read_json_file(file_path: str | Path) -> dict: ...
def save_json_file(data: dict, file_path: str | Path, indent: int = 3) -> None: ...
def read_yaml_file(file_path: str | Path) -> dict: ...
def save_yaml_file(data: dict, file_path: str | Path) -> None: ...
```

[File Utilities](./file-utilities.md)
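
For orientation, the kind of work these helpers do can be approximated with the standard library alone. The sketch below is not the library's code: `list_by_extension` is a hypothetical stand-in, and it assumes the helpers amount to filtering a directory by file suffix and round-tripping a dictionary through JSON.

```python
import json
import tempfile
from pathlib import Path


def list_by_extension(directory, extensions=None):
    """Return files in `directory`, optionally filtered by extension."""
    paths = sorted(p for p in Path(directory).iterdir() if p.is_file())
    if extensions is None:
        return paths
    wanted = {ext.lower().lstrip(".") for ext in extensions}
    return [p for p in paths if p.suffix.lower().lstrip(".") in wanted]


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    (tmp_path / "a.jpg").write_bytes(b"")
    (tmp_path / "b.png").write_bytes(b"")
    (tmp_path / "notes.txt").write_text("hello")

    # Only the two image files match the requested extensions
    image_names = [p.name for p in list_by_extension(tmp_path, ["jpg", "png"])]

    # Save a dictionary as JSON, then read it back
    config_path = tmp_path / "config.json"
    config_path.write_text(json.dumps({"classes": ["car", "person"]}, indent=3))
    loaded = json.loads(config_path.read_text())

print(image_names, loaded)
```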

### Keypoint Annotators

Specialized annotators for drawing keypoints, skeletal connections, and pose estimation results.

```python { .api }
class VertexAnnotator: ...
class EdgeAnnotator: ...
class VertexLabelAnnotator: ...
```

[Keypoint Annotators](./keypoint-annotators.md)

### Conversion Utilities

Utilities for converting between different image formats and computer vision libraries.

```python { .api }
def cv2_to_pillow(opencv_image: np.ndarray) -> Image.Image: ...
def pillow_to_cv2(pillow_image: Image.Image) -> np.ndarray: ...
```

### Notebook Utilities

Utilities for displaying images and visualizations in Jupyter notebooks.

```python { .api }
def plot_image(image: np.ndarray, title: str | None = None) -> None: ...
def plot_images_grid(images: list[np.ndarray], grid_size: tuple[int, int] | None = None) -> None: ...
```

## Types

```python { .api }
# Geometry types
@dataclass
class Point:
    x: float
    y: float

@dataclass
class Rect:
    x: float
    y: float
    width: float
    height: float

class Position(Enum):
    CENTER = "CENTER"
    TOP_LEFT = "TOP_LEFT"
    TOP_RIGHT = "TOP_RIGHT"
    BOTTOM_LEFT = "BOTTOM_LEFT"
    BOTTOM_RIGHT = "BOTTOM_RIGHT"
    # Additional position values...

# Enums
class ColorLookup(Enum):
    CLASS = "CLASS"
    TRACK = "TRACK"
    INDEX = "INDEX"

class OverlapMetric(Enum):
    IOU = "iou"
    IOS = "ios"
```