Tessl Tile for pypi/sahi@0.11.0

# SAHI - Slicing Aided Hyper Inference

SAHI is a computer vision library for large-scale object detection and instance segmentation on high-resolution images. It addresses the difficulty of detecting small objects in large images through sliced inference: each image is divided into smaller, overlapping patches that are processed individually.

## Package Information

- **Package Name**: sahi
- **Language**: Python
- **Installation**: `pip install sahi`

## Core Imports

```python
import sahi
from sahi import AutoDetectionModel
```

Core classes and functions:

```python
from sahi import (
    BoundingBox,
    Category,
    Mask,
    AutoDetectionModel,
    DetectionModel,
    ObjectPrediction,
)
```

Prediction functions:

```python
from sahi.predict import get_prediction, get_sliced_prediction, predict
```

## Basic Usage

```python
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Load a detection model
detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",
    model_path="yolov8n.pt",
    confidence_threshold=0.3,
    device="cpu",
)

# Perform sliced inference on a large image
result = get_sliced_prediction(
    image="path/to/large_image.jpg",
    detection_model=detection_model,
    slice_height=512,
    slice_width=512,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)

# Access predictions
predictions = result.object_prediction_list
for prediction in predictions:
    print(f"Class: {prediction.category.name}")
    print(f"Confidence: {prediction.score.value}")
    print(f"BBox: {prediction.bbox.to_coco_bbox()}")

# Export visualization
result.export_visuals(export_dir="output/")
```

## Architecture

SAHI's architecture centers on three key concepts:

- **Detection Models**: Unified interface for various deep learning frameworks (YOLO, MMDetection, Detectron2, HuggingFace, etc.)
- **Sliced Inference**: Automatic image slicing with overlapping patches to handle large images and small objects
- **Annotation Framework**: Data structures for bounding boxes, masks, and predictions, with format conversions

The library integrates with popular detection frameworks and provides consistent APIs for slicing, prediction postprocessing, dataset operations, and visualization in both research and production environments.

## Capabilities

### Model Integration and Loading

Unified interface for loading detection models from various frameworks including Ultralytics YOLO, MMDetection, Detectron2, HuggingFace Transformers, TorchVision, and Roboflow.

```python { .api }
class AutoDetectionModel:
    @staticmethod
    def from_pretrained(
        model_type: str,
        model_path: Optional[str] = None,
        model: Optional[Any] = None,
        config_path: Optional[str] = None,
        device: Optional[str] = None,
        mask_threshold: float = 0.5,
        confidence_threshold: float = 0.3,
        category_mapping: Optional[Dict] = None,
        category_remapping: Optional[Dict] = None,
        load_at_init: bool = True,
        image_size: Optional[int] = None,
        **kwargs,
    ) -> DetectionModel: ...
```

[Model Integration](./model-integration.md)

### Core Prediction Functions

Main prediction entry points: standard inference on a whole image, sliced inference for large images, and a high-level `predict` function for batch and video processing with detailed parameter control.

```python { .api }
def get_prediction(
    image,
    detection_model,
    shift_amount: list = [0, 0],
    full_shape=None,
    postprocess: Optional[PostprocessPredictions] = None,
    verbose: int = 0,
    exclude_classes_by_name: Optional[List[str]] = None,
    exclude_classes_by_id: Optional[List[int]] = None,
) -> PredictionResult: ...

def get_sliced_prediction(
    image,
    detection_model,
    slice_height: Optional[int] = None,
    slice_width: Optional[int] = None,
    overlap_height_ratio: float = 0.2,
    overlap_width_ratio: float = 0.2,
    perform_standard_pred: bool = True,
    postprocess_type: str = "GREEDYNMM",
    postprocess_match_metric: str = "IOS",
    postprocess_match_threshold: float = 0.5,
    postprocess_class_agnostic: bool = False,
    verbose: int = 1,
    merge_buffer_length: Optional[int] = None,
    auto_slice_resolution: bool = True,
    slice_export_prefix: Optional[str] = None,
    slice_dir: Optional[str] = None,
    exclude_classes_by_name: Optional[List[str]] = None,
    exclude_classes_by_id: Optional[List[int]] = None,
) -> PredictionResult: ...

def predict(
    detection_model: Optional[DetectionModel] = None,
    model_type: str = "ultralytics",
    model_path: Optional[str] = None,
    model_config_path: Optional[str] = None,
    model_confidence_threshold: float = 0.25,
    model_device: Optional[str] = None,
    model_category_mapping: Optional[dict] = None,
    model_category_remapping: Optional[dict] = None,
    source: Optional[str] = None,
    no_standard_prediction: bool = False,
    no_sliced_prediction: bool = False,
    image_size: Optional[int] = None,
    slice_height: int = 512,
    slice_width: int = 512,
    overlap_height_ratio: float = 0.2,
    overlap_width_ratio: float = 0.2,
    postprocess_type: str = "GREEDYNMM",
    postprocess_match_metric: str = "IOS",
    postprocess_match_threshold: float = 0.5,
    postprocess_class_agnostic: bool = False,
    novisual: bool = False,
    view_video: bool = False,
    frame_skip_interval: int = 0,
    export_pickle: bool = False,
    export_crop: bool = False,
    dataset_json_path: Optional[str] = None,
    project: str = "runs/predict",
    name: str = "exp",
    visual_bbox_thickness: Optional[int] = None,
    visual_text_size: Optional[float] = None,
    visual_text_thickness: Optional[int] = None,
    visual_hide_labels: bool = False,
    visual_hide_conf: bool = False,
    visual_export_format: str = "png",
    verbose: int = 1,
    return_dict: bool = False,
    force_postprocess_type: bool = False,
    exclude_classes_by_name: Optional[List[str]] = None,
    exclude_classes_by_id: Optional[List[int]] = None,
    **kwargs,
) -> Optional[Dict]: ...
```

[Prediction Functions](./prediction-functions.md)
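
The `shift_amount` and `full_shape` parameters above suggest that predictions made on a slice are later translated back into full-image coordinates. The following is a minimal sketch of that idea in plain Python, assuming `shift_amount` is the `(x, y)` offset of the slice's top-left corner and `full_shape` is `(height, width)`. The helper names `shift_box` and `clip_box` are hypothetical and are not part of SAHI.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def shift_box(box: Box, shift_amount: Tuple[int, int]) -> Box:
    """Translate a slice-local box by the slice's top-left offset (assumed x, y order)."""
    shift_x, shift_y = shift_amount
    x_min, y_min, x_max, y_max = box
    return (x_min + shift_x, y_min + shift_y, x_max + shift_x, y_max + shift_y)


def clip_box(box: Box, full_shape: Tuple[int, int]) -> Box:
    """Clamp a box to the full image, where full_shape is assumed to be (height, width)."""
    height, width = full_shape
    x_min, y_min, x_max, y_max = box
    return (
        min(max(x_min, 0), width),
        min(max(y_min, 0), height),
        min(max(x_max, 0), width),
        min(max(y_max, 0), height),
    )


# A detection found inside a slice whose top-left corner sits at (410, 820)
local_box = (10, 20, 60, 80)
full_box = clip_box(shift_box(local_box, (410, 820)), full_shape=(1000, 1000))
print(full_box)
```

Keeping the offset separate from the box until the merge step means each slice can be processed independently, and only the final results need to share one coordinate system.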

### Annotation and Data Structures

Core data structures for handling bounding boxes, masks, categories, and complete object annotations with comprehensive format conversion and manipulation methods.

```python { .api }
@dataclass(frozen=True)
class BoundingBox:
    box: Union[Tuple[float, float, float, float], List[float]]
    shift_amount: Tuple[int, int] = (0, 0)

    def get_expanded_box(self, ratio: float = 0.1, max_x: int = None, max_y: int = None) -> "BoundingBox": ...
    def to_coco_bbox(self) -> List[float]: ...
    def to_xyxy(self) -> List[float]: ...
    def get_shifted_box(self) -> "BoundingBox": ...

@dataclass(frozen=True)
class Category:
    id: Optional[Union[int, str]] = None
    name: Optional[str] = None

class Mask:
    def __init__(self, bool_mask: Optional[np.ndarray] = None, segmentation: Optional[List] = None, shift_amount: Tuple[int, int] = (0, 0)): ...
    @classmethod
    def from_float_mask(cls, mask: np.ndarray, mask_threshold: float = 0.5, shift_amount: Tuple[int, int] = (0, 0)) -> "Mask": ...
    @classmethod
    def from_bool_mask(cls, mask: np.ndarray, shift_amount: Tuple[int, int] = (0, 0)) -> "Mask": ...
    def get_shifted_mask(self) -> "Mask": ...

class ObjectPrediction(ObjectAnnotation):
    def __init__(
        self,
        bbox: Optional[BoundingBox] = None,
        category: Optional[Category] = None,
        score: Optional[PredictionScore] = None,
        mask: Optional[Mask] = None,
        shift_amount: Optional[List[int]] = None,
        full_shape: Optional[List[int]] = None,
    ): ...
    def get_shifted_object_prediction(self) -> "ObjectPrediction": ...
    def to_coco_prediction(self) -> CocoPrediction: ...
    def to_fiftyone_detection(self): ...
```

[Annotation Framework](./annotation-framework.md)
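
The two box conversions listed above, `to_coco_bbox()` and `to_xyxy()`, correspond to two common layouts: COCO's `[x_min, y_min, width, height]` and corner-based `[x_min, y_min, x_max, y_max]`. As a rough illustration of the arithmetic only, the sketch below converts between them with standalone functions. The function names are hypothetical and this is not SAHI's own code.

```python
from typing import List


def xyxy_to_coco(box: List[float]) -> List[float]:
    """[x_min, y_min, x_max, y_max] -> [x_min, y_min, width, height]."""
    x_min, y_min, x_max, y_max = box
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def coco_to_xyxy(box: List[float]) -> List[float]:
    """[x_min, y_min, width, height] -> [x_min, y_min, x_max, y_max]."""
    x_min, y_min, width, height = box
    return [x_min, y_min, x_min + width, y_min + height]


corner_box = [10, 20, 50, 80]
coco_box = xyxy_to_coco(corner_box)
print(coco_box)                 # [10, 20, 40, 60]
print(coco_to_xyxy(coco_box))   # [10, 20, 50, 80]
```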

### Image Slicing and Processing

Image slicing utilities for handling large images, including automatic slice parameter calculation, annotation processing, and dataset slicing operations.

```python { .api }
def get_slice_bboxes(
    image_height: int,
    image_width: int,
    slice_height: Optional[int] = None,
    slice_width: Optional[int] = None,
    auto_slice_resolution: Optional[bool] = True,
    overlap_height_ratio: Optional[float] = 0.2,
    overlap_width_ratio: Optional[float] = 0.2,
) -> List[List[int]]: ...

def slice_image(
    image: Union[str, Image.Image],
    output_file_name: Optional[str] = None,
    output_dir: Optional[str] = None,
    slice_height: int = 512,
    slice_width: int = 512,
    overlap_height_ratio: float = 0.2,
    overlap_width_ratio: float = 0.2,
    auto_slice_resolution: bool = True,
    min_area_ratio: float = 0.1,
    out_ext: Optional[str] = None,
    verbose: bool = False,
) -> SliceImageResult: ...

def slice_coco(
    coco_annotation_file_path: str,
    image_dir: str,
    output_coco_annotation_file_name: str = "",
    output_dir: Optional[str] = None,
    ignore_negative_samples: bool = False,
    slice_height: int = 512,
    slice_width: int = 512,
    overlap_height_ratio: float = 0.2,
    overlap_width_ratio: float = 0.2,
    min_area_ratio: float = 0.1,
    verbose: bool = False,
) -> str: ...
```

[Image Slicing](./image-slicing.md)
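
To make the slice and overlap parameters concrete, here is a minimal sketch of how a grid of overlapping windows could be derived from an image size, a slice size, and overlap ratios. It is an illustration under stated assumptions, not SAHI's `get_slice_bboxes` implementation: the function name `sliding_windows` is hypothetical, the overlap in pixels is assumed to be the truncated product of slice size and ratio, and windows that would cross the image border are assumed to be shifted back inside.

```python
from typing import List


def sliding_windows(
    image_height: int,
    image_width: int,
    slice_height: int = 512,
    slice_width: int = 512,
    overlap_height_ratio: float = 0.2,
    overlap_width_ratio: float = 0.2,
) -> List[List[int]]:
    """Return [x_min, y_min, x_max, y_max] windows covering the whole image."""
    y_step = slice_height - int(slice_height * overlap_height_ratio)
    x_step = slice_width - int(slice_width * overlap_width_ratio)
    if y_step <= 0 or x_step <= 0:
        raise ValueError("overlap ratios must be smaller than 1")

    boxes: List[List[int]] = []
    y = 0
    while True:
        y_max = min(y + slice_height, image_height)
        y_min = max(0, y_max - slice_height)  # shift the last row back inside
        x = 0
        while True:
            x_max = min(x + slice_width, image_width)
            x_min = max(0, x_max - slice_width)  # shift the last column back inside
            boxes.append([x_min, y_min, x_max, y_max])
            if x_max >= image_width:
                break
            x += x_step
        if y_max >= image_height:
            break
        y += y_step
    return boxes


windows = sliding_windows(image_height=1000, image_width=1000)
print(len(windows), windows[0], windows[-1])
```

With a 1000x1000 image, 512-pixel slices, and a 0.2 overlap ratio, the step between windows is 410 pixels, which yields a 3x3 grid under these assumptions.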

### Postprocessing and NMS

Postprocessing methods for combining overlapping predictions, including Non-Maximum Suppression (NMS), Non-Maximum Merging (NMM), and variants intended for sliced inference results.

```python { .api }
class PostprocessPredictions:
    def __init__(
        self,
        match_threshold: float = 0.5,
        match_metric: str = "IOS",
        class_agnostic: bool = False,
    ): ...
    def __call__(
        self,
        object_predictions: List[ObjectPrediction],
    ) -> List[ObjectPrediction]: ...

class NMSPostprocess(PostprocessPredictions): ...
class NMMPostprocess(PostprocessPredictions): ...
class GreedyNMMPostprocess(PostprocessPredictions): ...
class LSNMSPostprocess(PostprocessPredictions): ...

def nms(
    predictions: np.ndarray,
    match_threshold: float = 0.5,
    class_agnostic: bool = False,
) -> List[int]: ...

def greedy_nmm(
    predictions: np.ndarray,
    match_threshold: float = 0.5,
    class_agnostic: bool = False,
) -> List[int]: ...
```

[Postprocessing](./postprocessing.md)
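
The `match_metric` and `match_threshold` parameters decide when two overlapping predictions count as the same object. The sketch below illustrates the two metrics in plain Python, taking IOU as intersection over union and IOS as intersection over the smaller box's area, and applies them in a simple greedy suppression loop. It is a simplified illustration, not SAHI's implementation: the names `iou`, `ios`, and `greedy_nms_sketch` are hypothetical, and the loop only suppresses boxes rather than merging them as the NMM variants do.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def _area(box: Box) -> float:
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def _intersection(a: Box, b: Box) -> float:
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union."""
    inter = _intersection(a, b)
    union = _area(a) + _area(b) - inter
    return inter / union if union > 0 else 0.0


def ios(a: Box, b: Box) -> float:
    """Intersection over the smaller of the two box areas."""
    smaller = min(_area(a), _area(b))
    return _intersection(a, b) / smaller if smaller > 0 else 0.0


def greedy_nms_sketch(
    boxes: Sequence[Box],
    scores: Sequence[float],
    match_threshold: float = 0.5,
    metric: Callable[[Box, Box], float] = iou,
) -> List[int]:
    """Keep the highest-scoring box, drop boxes that match a kept one, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(metric(boxes[i], boxes[k]) < match_threshold for k in keep):
            keep.append(i)
    return keep


# A small box fully inside a large one: low IOU, but IOS of 1.0
nested = [(0, 0, 100, 100), (20, 20, 40, 40)]
print(greedy_nms_sketch(nested, [0.9, 0.6], metric=iou))  # [0, 1]
print(greedy_nms_sketch(nested, [0.9, 0.6], metric=ios))  # [0]
```

The nested-box case shows why IOS can suit sliced inference: a partial detection from one slice often lies inside the full detection from another, so its IOU is low even though it describes the same object.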

### COCO Dataset Integration

COCO dataset handling, including loading, manipulation, annotation processing, evaluation, and format conversion.

```python { .api }
class Coco:
    def __init__(self, coco_path: Optional[str] = None): ...
    def add_image(self, coco_image: CocoImage) -> int: ...
    def add_annotation(self, coco_annotation: CocoAnnotation) -> int: ...
    def add_category(self, coco_category: CocoCategory) -> int: ...
    def merge(self, coco2: "Coco") -> "Coco": ...
    def export_as_yolo(
        self,
        output_dir: str,
        train_split_rate: float = 1.0,
        numpy_seed: int = 0,
        mp: bool = True,
    ): ...

class CocoImage:
    def __init__(self, image_path: str, image_id: Optional[int] = None): ...
    def add_annotation(self, annotation: CocoAnnotation): ...

class CocoAnnotation:
    def __init__(
        self,
        bbox: Optional[List[int]] = None,
        category_id: Optional[int] = None,
        category_name: Optional[str] = None,
        iscrowd: int = 0,
        area: Optional[int] = None,
        segmentation: Optional[List] = None,
        image_id: Optional[int] = None,
        annotation_id: Optional[int] = None,
    ): ...

def create_coco_dict() -> Dict: ...
def export_coco_as_yolo(
    coco_path: str,
    output_dir: str,
    train_split_rate: float = 1.0,
    numpy_seed: int = 0,
) -> str: ...
```

[COCO Integration](./coco-integration.md)
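
Exporting COCO annotations to YOLO format involves a coordinate change: COCO boxes are `[x_min, y_min, width, height]` in pixels, while YOLO labels store a normalized box center and size. The sketch below shows only that per-box arithmetic. The helper `coco_bbox_to_yolo` is a hypothetical name, and how SAHI's `export_as_yolo` handles file layout, splits, and class indices is not covered here.

```python
from typing import List


def coco_bbox_to_yolo(
    bbox: List[float], image_width: int, image_height: int
) -> List[float]:
    """Convert a pixel [x_min, y_min, w, h] box to normalized [x_center, y_center, w, h]."""
    x_min, y_min, width, height = bbox
    x_center = (x_min + width / 2) / image_width
    y_center = (y_min + height / 2) / image_height
    return [x_center, y_center, width / image_width, height / image_height]


def yolo_label_line(class_index: int, yolo_bbox: List[float]) -> str:
    """Format one line of a YOLO label file: class index followed by the box values."""
    return " ".join([str(class_index)] + [f"{value:.6f}" for value in yolo_bbox])


yolo_box = coco_bbox_to_yolo([100, 200, 50, 100], image_width=1000, image_height=1000)
print(yolo_label_line(0, yolo_box))  # 0 0.125000 0.250000 0.050000 0.100000
```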

### Command Line Interface

Complete command-line interface for prediction, dataset processing, evaluation, and format conversion operations accessible through the `sahi` command.

```bash { .api }
# Main prediction command
sahi predict --model_type ultralytics --model_path yolov8n.pt --source image.jpg

# Prediction with FiftyOne integration
sahi predict-fiftyone --model_type ultralytics --model_path yolov8n.pt --source image.jpg

# COCO dataset operations
sahi coco slice --image_dir images/ --dataset_json_path dataset.json
sahi coco evaluate --dataset_json_path dataset.json --result_json_path results.json
sahi coco yolo --coco_annotation_file_path dataset.json --image_dir images/
sahi coco analyse --dataset_json_path dataset.json --result_json_path results.json
sahi coco fiftyone --coco_annotation_file_path dataset.json --image_dir images/

# Environment and version info
sahi version
sahi env
```

[Command Line Interface](./cli.md)

### Utilities and Framework Integration

Utility functions for computer vision operations, framework-specific integrations, file I/O operations, and compatibility across different deep learning ecosystems.

```python { .api }
# CV utilities
def read_image_as_pil(image: Union[Image.Image, str, np.ndarray], exif_fix: bool = True) -> Image.Image: ...
def read_image(image_path: str) -> np.ndarray: ...
def visualize_object_predictions(
    image: np.ndarray,
    object_prediction_list: List[ObjectPrediction],
    rect_th: int = 3,
    text_size: float = 3,
    text_th: float = 3,
    color: tuple = None,
    hide_labels: bool = False,
    hide_conf: bool = False,
    output_dir: Optional[str] = None,
    file_name: Optional[str] = "prediction_visual",
) -> np.ndarray: ...
def crop_object_predictions(
    image: np.ndarray,
    object_prediction_list: List[ObjectPrediction],
    output_dir: str,
    file_name: str,
    export_format: str = "PNG",
) -> None: ...
def get_video_reader(video_path: str): ...

# File utilities
def save_json(data, save_path: str, indent: Optional[int] = None): ...
def load_json(load_path: str, encoding: str = "utf-8") -> Dict: ...
def save_pickle(data: Any, save_path: str): ...
def load_pickle(load_path: str) -> Any: ...
def list_files(
    directory: str,
    contains: List[str] = None,
    verbose: bool = True,
    max_depth: Optional[int] = None,
) -> List[str]: ...
def get_base_filename(path: str) -> str: ...
def get_file_extension(path: str) -> str: ...
def download_from_url(from_url: str, to_path: str): ...

# Import utilities
def is_available(package: str) -> bool: ...
def check_requirements(requirements: List[str], raise_exception: bool = True): ...
```

[Utilities](./utilities.md)
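
Because each detection framework is an optional dependency, import utilities such as `is_available` and `check_requirements` let code test for a package before using it. One way such a check could be written with the standard library alone is sketched below. The names `package_is_available` and `missing_packages` are hypothetical, and SAHI's actual implementation may differ.

```python
import importlib.util
from typing import List


def package_is_available(package: str) -> bool:
    """Return True if the package can be located without importing it."""
    try:
        return importlib.util.find_spec(package) is not None
    except (ImportError, ValueError):
        # Raised for a missing parent of a dotted name or a malformed module spec
        return False


def missing_packages(requirements: List[str]) -> List[str]:
    """List the required packages that cannot be found in this environment."""
    return [name for name in requirements if not package_is_available(name)]


print(package_is_available("json"))
print(missing_packages(["json", "surely_not_a_real_package_xyz"]))
```

Looking up the module spec avoids the cost and side effects of importing a heavy framework just to learn whether it is installed.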

## Types

```python { .api }
class PredictionResult:
    def __init__(
        self,
        object_prediction_list: List[ObjectPrediction],
        image: Image.Image,
        durations_in_seconds: Optional[Dict] = None,
    ): ...
    def export_visuals(self, export_dir: str, text_size: float = None): ...
    def to_coco_annotations(self) -> List[CocoAnnotation]: ...
    def to_coco_predictions(self) -> List[CocoPrediction]: ...

class PredictionScore:
    def __init__(self, value: Union[float, np.ndarray]): ...
    def is_greater_than_threshold(self, threshold: float) -> bool: ...

class SliceImageResult:
    def __init__(self, original_image_size: List[int], image_dir: str): ...

class SlicedImage:
    def __init__(self, image: Image.Image, coco_image: CocoImage, starting_pixel: List[int]): ...
```
