tessl/pypi-sahi

A vision library for performing sliced inference on large images/small objects

- Workspace: tessl
- Visibility: Public
- Describes: pypipkg:pypi/sahi@0.11.x

To install, run

npx @tessl/cli install tessl/pypi-sahi@0.11.0

# SAHI - Slicing Aided Hyper Inference

A computer vision library for object detection and instance segmentation on high-resolution images. SAHI addresses the difficulty of detecting small objects in large images through sliced inference: it divides each image into smaller, overlapping patches, runs the detector on each patch, and combines the results.

## Package Information

- **Package Name**: sahi
- **Language**: Python
- **Installation**: `pip install sahi`

## Core Imports

```python
import sahi
from sahi import AutoDetectionModel
```

Core classes and functions:

```python
from sahi import (
    BoundingBox,
    Category,
    Mask,
    AutoDetectionModel,
    DetectionModel,
    ObjectPrediction
)
```

Prediction functions:

```python
from sahi.predict import get_prediction, get_sliced_prediction, predict
```

## Basic Usage

```python
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Load a detection model
detection_model = AutoDetectionModel.from_pretrained(
    model_type='ultralytics',
    model_path='yolov8n.pt',
    confidence_threshold=0.3,
    device="cpu"
)

# Perform sliced inference on a large image
result = get_sliced_prediction(
    image="path/to/large_image.jpg",
    detection_model=detection_model,
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2
)

# Access predictions
predictions = result.object_prediction_list
for prediction in predictions:
    print(f"Class: {prediction.category.name}")
    print(f"Confidence: {prediction.score.value}")
    print(f"BBox: {prediction.bbox.to_coco_bbox()}")

# Export visualization
result.export_visuals(export_dir="output/")
```

## Architecture

SAHI's architecture centers on three concepts:

- **Detection Models**: Unified interface for various deep learning frameworks (YOLO, MMDetection, Detectron2, HuggingFace, etc.)
- **Sliced Inference**: Automatic image slicing with overlapping patches to handle large images and small objects
- **Annotation Framework**: Data structures for bounding boxes, masks, and predictions, with format conversions

The library integrates with these frameworks and exposes consistent APIs for slicing, prediction postprocessing, dataset operations, and visualization in both research and production environments.

## Capabilities

### Model Integration and Loading

Unified interface for loading detection models from various frameworks including Ultralytics YOLO, MMDetection, Detectron2, HuggingFace Transformers, TorchVision, and Roboflow.

```python { .api }
class AutoDetectionModel:
    @staticmethod
    def from_pretrained(
        model_type: str,
        model_path: Optional[str] = None,
        model: Optional[Any] = None,
        config_path: Optional[str] = None,
        device: Optional[str] = None,
        mask_threshold: float = 0.5,
        confidence_threshold: float = 0.3,
        category_mapping: Optional[Dict] = None,
        category_remapping: Optional[Dict] = None,
        load_at_init: bool = True,
        image_size: Optional[int] = None,
        **kwargs,
    ) -> DetectionModel: ...
```

[Model Integration](./model-integration.md)

### Core Prediction Functions

Main prediction capabilities including standard inference, sliced inference for large images, batch processing, and video processing with comprehensive parameter control.

```python { .api }
def get_prediction(
    image,
    detection_model,
    shift_amount: list = [0, 0],
    full_shape=None,
    postprocess: Optional[PostprocessPredictions] = None,
    verbose: int = 0,
    exclude_classes_by_name: Optional[List[str]] = None,
    exclude_classes_by_id: Optional[List[int]] = None,
) -> PredictionResult: ...

def get_sliced_prediction(
    image,
    detection_model,
    slice_height: Optional[int] = None,
    slice_width: Optional[int] = None,
    overlap_height_ratio: float = 0.2,
    overlap_width_ratio: float = 0.2,
    perform_standard_pred: bool = True,
    postprocess_type: str = "GREEDYNMM",
    postprocess_match_metric: str = "IOS",
    postprocess_match_threshold: float = 0.5,
    postprocess_class_agnostic: bool = False,
    verbose: int = 1,
    merge_buffer_length: Optional[int] = None,
    auto_slice_resolution: bool = True,
    slice_export_prefix: Optional[str] = None,
    slice_dir: Optional[str] = None,
    exclude_classes_by_name: Optional[List[str]] = None,
    exclude_classes_by_id: Optional[List[int]] = None,
) -> PredictionResult: ...

def predict(
    detection_model: Optional[DetectionModel] = None,
    model_type: str = "ultralytics",
    model_path: Optional[str] = None,
    model_config_path: Optional[str] = None,
    model_confidence_threshold: float = 0.25,
    model_device: Optional[str] = None,
    model_category_mapping: Optional[dict] = None,
    model_category_remapping: Optional[dict] = None,
    source: Optional[str] = None,
    no_standard_prediction: bool = False,
    no_sliced_prediction: bool = False,
    image_size: Optional[int] = None,
    slice_height: int = 512,
    slice_width: int = 512,
    overlap_height_ratio: float = 0.2,
    overlap_width_ratio: float = 0.2,
    postprocess_type: str = "GREEDYNMM",
    postprocess_match_metric: str = "IOS",
    postprocess_match_threshold: float = 0.5,
    postprocess_class_agnostic: bool = False,
    novisual: bool = False,
    view_video: bool = False,
    frame_skip_interval: int = 0,
    export_pickle: bool = False,
    export_crop: bool = False,
    dataset_json_path: Optional[str] = None,
    project: str = "runs/predict",
    name: str = "exp",
    visual_bbox_thickness: Optional[int] = None,
    visual_text_size: Optional[float] = None,
    visual_text_thickness: Optional[int] = None,
    visual_hide_labels: bool = False,
    visual_hide_conf: bool = False,
    visual_export_format: str = "png",
    verbose: int = 1,
    return_dict: bool = False,
    force_postprocess_type: bool = False,
    exclude_classes_by_name: Optional[List[str]] = None,
    exclude_classes_by_id: Optional[List[int]] = None,
    **kwargs,
) -> Optional[Dict]: ...
```

[Prediction Functions](./prediction-functions.md)

### Annotation and Data Structures

Core data structures for handling bounding boxes, masks, categories, and complete object annotations with comprehensive format conversion and manipulation methods.

```python { .api }
@dataclass(frozen=True)
class BoundingBox:
    box: Union[Tuple[float, float, float, float], List[float]]
    shift_amount: Tuple[int, int] = (0, 0)

    def get_expanded_box(self, ratio: float = 0.1, max_x: int = None, max_y: int = None) -> "BoundingBox": ...
    def to_coco_bbox(self) -> List[float]: ...
    def to_xyxy(self) -> List[float]: ...
    def get_shifted_box(self) -> "BoundingBox": ...

@dataclass(frozen=True)
class Category:
    id: Optional[Union[int, str]] = None
    name: Optional[str] = None

class Mask:
    def __init__(self, bool_mask: Optional[np.ndarray] = None, segmentation: Optional[List] = None, shift_amount: Tuple[int, int] = (0, 0)): ...
    @classmethod
    def from_float_mask(cls, mask: np.ndarray, mask_threshold: float = 0.5, shift_amount: Tuple[int, int] = (0, 0)) -> "Mask": ...
    @classmethod
    def from_bool_mask(cls, mask: np.ndarray, shift_amount: Tuple[int, int] = (0, 0)) -> "Mask": ...
    def get_shifted_mask(self) -> "Mask": ...

class ObjectPrediction(ObjectAnnotation):
    def __init__(
        self,
        bbox: Optional[BoundingBox] = None,
        category: Optional[Category] = None,
        score: Optional[PredictionScore] = None,
        mask: Optional[Mask] = None,
        shift_amount: Optional[List[int]] = None,
        full_shape: Optional[List[int]] = None,
    ): ...
    def get_shifted_object_prediction(self) -> "ObjectPrediction": ...
    def to_coco_prediction(self) -> CocoPrediction: ...
    def to_fiftyone_detection(self): ...
```

[Annotation Framework](./annotation-framework.md)
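
The coordinate conventions behind `to_xyxy`, `to_coco_bbox`, and `get_shifted_box` can be illustrated without importing SAHI. The following is a minimal standalone sketch, not SAHI's implementation: `SketchBox` is a hypothetical class, and it assumes that `box` holds `(minx, miny, maxx, maxy)` in slice coordinates and that `shift_amount` is the slice's `(x, y)` offset within the full image.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SketchBox:
    """Hypothetical stand-in for a slice-relative bounding box."""

    box: Tuple[float, float, float, float]  # assumed (minx, miny, maxx, maxy)
    shift_amount: Tuple[int, int] = (0, 0)  # assumed (shift_x, shift_y) of the slice

    def to_xyxy(self) -> List[float]:
        # Corner format: [minx, miny, maxx, maxy]
        return list(self.box)

    def to_coco_bbox(self) -> List[float]:
        # COCO format: [x, y, width, height] with (x, y) the top-left corner
        minx, miny, maxx, maxy = self.box
        return [minx, miny, maxx - minx, maxy - miny]

    def get_shifted_box(self) -> "SketchBox":
        # Move the box from slice coordinates into full-image coordinates
        shift_x, shift_y = self.shift_amount
        minx, miny, maxx, maxy = self.box
        return SketchBox(
            box=(minx + shift_x, miny + shift_y, maxx + shift_x, maxy + shift_y),
            shift_amount=(0, 0),
        )


# A detection found in a slice whose top-left corner sits at x=512, y=0
slice_box = SketchBox(box=(10, 20, 50, 80), shift_amount=(512, 0))
full_image_box = slice_box.get_shifted_box()
print(full_image_box.to_xyxy())       # [522, 20, 562, 80]
print(full_image_box.to_coco_bbox())  # [522, 20, 40, 60]
```

Because the sketch class is frozen, shifting returns a new object rather than mutating the original, which mirrors the `frozen=True` declaration in the API listing above.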

### Image Slicing and Processing

Advanced image slicing capabilities for handling large images, including automatic parameter calculation, annotation processing, and dataset slicing operations.

```python { .api }
def get_slice_bboxes(
    image_height: int,
    image_width: int,
    slice_height: Optional[int] = None,
    slice_width: Optional[int] = None,
    auto_slice_resolution: Optional[bool] = True,
    overlap_height_ratio: Optional[float] = 0.2,
    overlap_width_ratio: Optional[float] = 0.2,
) -> List[List[int]]: ...

def slice_image(
    image: Union[str, Image.Image],
    output_file_name: Optional[str] = None,
    output_dir: Optional[str] = None,
    slice_height: int = 512,
    slice_width: int = 512,
    overlap_height_ratio: float = 0.2,
    overlap_width_ratio: float = 0.2,
    auto_slice_resolution: bool = True,
    min_area_ratio: float = 0.1,
    out_ext: Optional[str] = None,
    verbose: bool = False,
) -> SliceImageResult: ...

def slice_coco(
    coco_annotation_file_path: str,
    image_dir: str,
    output_coco_annotation_file_name: str = "",
    output_dir: Optional[str] = None,
    ignore_negative_samples: bool = False,
    slice_height: int = 512,
    slice_width: int = 512,
    overlap_height_ratio: float = 0.2,
    overlap_width_ratio: float = 0.2,
    min_area_ratio: float = 0.1,
    verbose: bool = False,
) -> str: ...
```

[Image Slicing](./image-slicing.md)
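
To make the interaction between slice size and overlap ratio concrete, here is a minimal sketch of how an overlapping slice grid could be computed. It is an illustration under stated assumptions rather than SAHI's own code: `sketch_slice_bboxes` is a hypothetical helper, it assumes explicit slice dimensions (no automatic resolution selection), and it assumes that slices reaching past the image edge are pulled back inside so every slice keeps the full size where the image allows.

```python
from typing import List


def sketch_slice_bboxes(
    image_height: int,
    image_width: int,
    slice_height: int = 512,
    slice_width: int = 512,
    overlap_height_ratio: float = 0.2,
    overlap_width_ratio: float = 0.2,
) -> List[List[int]]:
    """Return [minx, miny, maxx, maxy] windows covering the image (hypothetical helper)."""
    y_overlap = int(overlap_height_ratio * slice_height)
    x_overlap = int(overlap_width_ratio * slice_width)
    boxes: List[List[int]] = []

    y_min = 0
    while True:
        # Clamp the window to the image and pull it back to keep its height
        y_max = min(y_min + slice_height, image_height)
        y_start = max(0, y_max - slice_height)
        x_min = 0
        while True:
            x_max = min(x_min + slice_width, image_width)
            x_start = max(0, x_max - slice_width)
            boxes.append([x_start, y_start, x_max, y_max])
            if x_max >= image_width:
                break
            # Step forward by slice width minus the overlap
            x_min = x_max - x_overlap
        if y_max >= image_height:
            break
        y_min = y_max - y_overlap
    return boxes


boxes = sketch_slice_bboxes(image_height=600, image_width=1000)
for box in boxes:
    print(box)
```

For a 1000x600 image with 512x512 slices and 20% overlap, this yields a 3x2 grid of six windows. The stride between neighbouring windows is the slice size minus the overlap (here 512 - 102 = 410 pixels), so an object cut by one slice boundary is likely to appear whole in the adjacent slice.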

### Postprocessing and NMS

Advanced postprocessing methods for combining overlapping predictions including Non-Maximum Suppression (NMS), Non-Maximum Merging (NMM), and specialized algorithms for sliced inference results.

```python { .api }
class PostprocessPredictions:
    def __init__(
        self,
        match_threshold: float = 0.5,
        match_metric: str = "IOS",
        class_agnostic: bool = False,
    ): ...
    def __call__(
        self,
        object_predictions: List[ObjectPrediction],
    ) -> List[ObjectPrediction]: ...

class NMSPostprocess(PostprocessPredictions): ...
class NMMPostprocess(PostprocessPredictions): ...
class GreedyNMMPostprocess(PostprocessPredictions): ...
class LSNMSPostprocess(PostprocessPredictions): ...

def nms(
    predictions: np.ndarray,
    match_threshold: float = 0.5,
    class_agnostic: bool = False,
) -> List[int]: ...

def greedy_nmm(
    predictions: np.ndarray,
    match_threshold: float = 0.5,
    class_agnostic: bool = False,
) -> List[int]: ...
```

[Postprocessing](./postprocessing.md)
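
The effect of the match metric is easiest to see on a small example. The sketch below is a plain-Python greedy NMS written for illustration only; it is not the library's `nms` function and does not share its signature. It assumes that `"IOU"` means intersection over union and `"IOS"` means intersection over the area of the smaller box, and `sketch_nms` and its helpers are hypothetical names.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed (minx, miny, maxx, maxy)


def box_area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection_area(a: Box, b: Box) -> float:
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def match_score(a: Box, b: Box, metric: str = "IOS") -> float:
    inter = intersection_area(a, b)
    if metric == "IOU":
        denominator = box_area(a) + box_area(b) - inter
    elif metric == "IOS":
        denominator = min(box_area(a), box_area(b))
    else:
        raise ValueError(f"Unknown metric: {metric}")
    return inter / denominator if denominator > 0 else 0.0


def sketch_nms(
    boxes: List[Box],
    scores: List[float],
    match_threshold: float = 0.5,
    match_metric: str = "IOS",
) -> List[int]:
    """Greedy NMS: keep the best-scoring box, drop boxes that match a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for idx in order:
        if all(match_score(boxes[idx], boxes[k], match_metric) <= match_threshold for k in keep):
            keep.append(idx)
    return keep


# Box 1 lies entirely inside box 0, as happens when one slice sees only
# part of an object that another slice sees whole.
boxes = [(0, 0, 100, 100), (10, 10, 40, 40), (200, 200, 300, 300)]
scores = [0.9, 0.6, 0.8]
print(sketch_nms(boxes, scores, match_metric="IOS"))  # [0, 2]
print(sketch_nms(boxes, scores, match_metric="IOU"))  # [0, 2, 1]
```

Under IOU the nested box scores only 0.09 against its parent and survives as a duplicate; under IOS it scores 1.0 and is suppressed. This suggests why an IOS-style metric suits sliced inference, where partial detections at slice borders are common.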

### COCO Dataset Integration

Comprehensive COCO dataset handling including loading, manipulation, annotation processing, evaluation, and format conversion capabilities.

```python { .api }
class Coco:
    def __init__(self, coco_path: Optional[str] = None): ...
    def add_image(self, coco_image: CocoImage) -> int: ...
    def add_annotation(self, coco_annotation: CocoAnnotation) -> int: ...
    def add_category(self, coco_category: CocoCategory) -> int: ...
    def merge(self, coco2: "Coco") -> "Coco": ...
    def export_as_yolo(
        self,
        output_dir: str,
        train_split_rate: float = 1.0,
        numpy_seed: int = 0,
        mp: bool = True,
    ): ...

class CocoImage:
    def __init__(self, image_path: str, image_id: Optional[int] = None): ...
    def add_annotation(self, annotation: CocoAnnotation): ...

class CocoAnnotation:
    def __init__(
        self,
        bbox: Optional[List[int]] = None,
        category_id: Optional[int] = None,
        category_name: Optional[str] = None,
        iscrowd: int = 0,
        area: Optional[int] = None,
        segmentation: Optional[List] = None,
        image_id: Optional[int] = None,
        annotation_id: Optional[int] = None,
    ): ...

def create_coco_dict() -> Dict: ...
def export_coco_as_yolo(
    coco_path: str,
    output_dir: str,
    train_split_rate: float = 1.0,
    numpy_seed: int = 0,
) -> str: ...
```

[COCO Integration](./coco-integration.md)
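
The bounding-box arithmetic behind a COCO-to-YOLO export can be sketched in a few lines. This is a standalone illustration and not what `export_as_yolo` does internally: the helper names, the file name `example.jpg`, and the sample values are hypothetical, and the sketch assumes category ids are already zero-based class indices. It relies only on the two formats' conventions: COCO stores `[x, y, width, height]` in pixels from the top-left corner, while YOLO labels store `class cx cy w h` normalized by the image size.

```python
from typing import Dict, List

# A minimal COCO-style dataset: one image, one category, one annotation.
coco_dict: Dict = {
    "images": [{"id": 1, "file_name": "example.jpg", "width": 1000, "height": 500}],
    "categories": [{"id": 0, "name": "car"}],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 0,
            "bbox": [100, 200, 50, 100],  # x, y, width, height in pixels
            "area": 5000,
            "iscrowd": 0,
        }
    ],
}


def coco_bbox_to_yolo(bbox: List[float], image_width: int, image_height: int) -> List[float]:
    """Convert [x, y, w, h] in pixels to normalized [cx, cy, w, h]."""
    x, y, w, h = bbox
    return [
        (x + w / 2) / image_width,
        (y + h / 2) / image_height,
        w / image_width,
        h / image_height,
    ]


def to_yolo_lines(coco: Dict) -> Dict[str, List[str]]:
    """Group YOLO label lines by image file name (hypothetical helper)."""
    images = {image["id"]: image for image in coco["images"]}
    lines: Dict[str, List[str]] = {}
    for annotation in coco["annotations"]:
        image = images[annotation["image_id"]]
        cx, cy, w, h = coco_bbox_to_yolo(annotation["bbox"], image["width"], image["height"])
        label = f"{annotation['category_id']} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
        lines.setdefault(image["file_name"], []).append(label)
    return lines


yolo_lines = to_yolo_lines(coco_dict)
print(yolo_lines)
```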

### Command Line Interface

Complete command-line interface for prediction, dataset processing, evaluation, and format conversion operations accessible through the `sahi` command.

```bash { .api }
# Main prediction command
sahi predict --model_type ultralytics --model_path yolov8n.pt --source image.jpg

# Prediction with FiftyOne integration
sahi predict-fiftyone --model_type ultralytics --model_path yolov8n.pt --source image.jpg

# COCO dataset operations
sahi coco slice --image_dir images/ --dataset_json_path dataset.json
sahi coco evaluate --dataset_json_path dataset.json --result_json_path results.json
sahi coco yolo --coco_annotation_file_path dataset.json --image_dir images/
sahi coco analyse --dataset_json_path dataset.json --result_json_path results.json
sahi coco fiftyone --coco_annotation_file_path dataset.json --image_dir images/

# Environment and version info
sahi version
sahi env
```

[Command Line Interface](./cli.md)

### Utilities and Framework Integration

Utility functions for computer vision operations, framework-specific integrations, file I/O operations, and compatibility across different deep learning ecosystems.

```python { .api }
# CV utilities
def read_image_as_pil(image: Union[Image.Image, str, np.ndarray], exif_fix: bool = True) -> Image.Image: ...
def read_image(image_path: str) -> np.ndarray: ...
def visualize_object_predictions(
    image: np.ndarray,
    object_prediction_list: List[ObjectPrediction],
    rect_th: int = 3,
    text_size: float = 3,
    text_th: float = 3,
    color: tuple = None,
    hide_labels: bool = False,
    hide_conf: bool = False,
    output_dir: Optional[str] = None,
    file_name: Optional[str] = "prediction_visual",
) -> np.ndarray: ...
def crop_object_predictions(
    image: np.ndarray,
    object_prediction_list: List[ObjectPrediction],
    output_dir: str,
    file_name: str,
    export_format: str = "PNG",
) -> None: ...
def get_video_reader(video_path: str): ...

# File utilities
def save_json(data, save_path: str, indent: Optional[int] = None): ...
def load_json(load_path: str, encoding: str = "utf-8") -> Dict: ...
def save_pickle(data: Any, save_path: str): ...
def load_pickle(load_path: str) -> Any: ...
def list_files(
    directory: str,
    contains: List[str] = None,
    verbose: bool = True,
    max_depth: Optional[int] = None,
) -> List[str]: ...
def get_base_filename(path: str) -> str: ...
def get_file_extension(path: str) -> str: ...
def download_from_url(from_url: str, to_path: str): ...

# Import utilities
def is_available(package: str) -> bool: ...
def check_requirements(requirements: List[str], raise_exception: bool = True): ...
```

[Utilities](./utilities.md)
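
Optional-dependency checks such as `is_available` and `check_requirements` are commonly built on the standard library's `importlib`. The following sketch shows one plausible approach; it is an assumption about the general technique, not a description of how SAHI implements these functions, and the `sketch_` names are hypothetical.

```python
import importlib.util
from typing import List


def sketch_is_available(package: str) -> bool:
    """Return True if the package can be located without importing it."""
    try:
        return importlib.util.find_spec(package) is not None
    except (ImportError, ValueError):
        # find_spec can raise for dotted names with a missing parent,
        # or for modules whose __spec__ is unset
        return False


def sketch_check_requirements(requirements: List[str], raise_exception: bool = True) -> List[str]:
    """Return the missing packages, optionally raising if any are absent."""
    missing = [name for name in requirements if not sketch_is_available(name)]
    if missing and raise_exception:
        raise ImportError(f"Missing required packages: {', '.join(missing)}")
    return missing


print(sketch_is_available("json"))
print(sketch_check_requirements(["json", "not_a_real_package_xyz"], raise_exception=False))
```

Checking availability up front lets framework-specific code paths fail with a clear message instead of an import error deep inside a model wrapper.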

## Types

```python { .api }
class PredictionResult:
    def __init__(
        self,
        object_prediction_list: List[ObjectPrediction],
        image: Image.Image,
        durations_in_seconds: Optional[Dict] = None,
    ): ...
    def export_visuals(self, export_dir: str, text_size: float = None): ...
    def to_coco_annotations(self) -> List[CocoAnnotation]: ...
    def to_coco_predictions(self) -> List[CocoPrediction]: ...

class PredictionScore:
    def __init__(self, value: Union[float, np.ndarray]): ...
    def is_greater_than_threshold(self, threshold: float) -> bool: ...

class SliceImageResult:
    def __init__(self, original_image_size: List[int], image_dir: str): ...

class SlicedImage:
    def __init__(self, image: Image.Image, coco_image: CocoImage, starting_pixel: List[int]): ...
```