CtrlK
BlogDocsLog inGet started
Tessl Logo

tessl/pypi-ultralytics

Ultralytics YOLO for state-of-the-art object detection, multi-object tracking, instance segmentation, pose estimation and image classification.

Overview
Eval results
Files

specialized-models.mddocs/

Specialized Models

Specialized model implementations for specific computer vision tasks including open-vocabulary detection, promptable segmentation, efficient segmentation, transformer-based detection, and neural architecture search.

Capabilities

YOLOWorld - Open Vocabulary Detection

YOLO-World model enables open-vocabulary object detection, allowing detection of custom classes without retraining.

class YOLOWorld:
    def __init__(self, model="yolov8s-world.pt", verbose=False):
        """
        Initialize YOLOv8-World model.
        
        Parameters:
        - model (str | Path): Path to pre-trained model file
        - verbose (bool): Enable verbose output during initialization
        """
    
    def set_classes(self, classes: List[str]):
        """
        Set custom classes for detection.
        
        Parameters:
        - classes (List[str]): List of class names to detect
        """

Usage Examples:

from ultralytics import YOLOWorld

# Initialize YOLOWorld model
model = YOLOWorld("yolov8s-world.pt")

# Set custom classes
model.set_classes(["person", "bicycle", "car", "motorcycle"])

# Perform detection with custom classes
results = model("image.jpg")

# Update classes dynamically
model.set_classes(["apple", "banana", "orange"])
results = model("fruit_image.jpg")

SAM - Segment Anything Model

Segment Anything Model provides promptable segmentation using bounding boxes, points, or text prompts.

class SAM:
    def __init__(self, model="sam_b.pt"):
        """
        Initialize SAM model.
        
        Parameters:
        - model (str | Path): Path to SAM model file
        """
    
    def predict(self, source, stream=False, bboxes=None, points=None, labels=None, **kwargs) -> List[Results]:
        """
        Perform segmentation with prompts.
        
        Parameters:
        - source: Input image source
        - stream (bool): Stream processing mode
        - bboxes (List[List[float]], optional): Bounding box prompts [[x1, y1, x2, y2], ...]
        - points (List[List[float]], optional): Point prompts [[x, y], ...]
        - labels (List[int], optional): Point labels (1 for foreground, 0 for background)
        - **kwargs: Additional arguments
        
        Returns:
        List[Results]: Segmentation results
        """
    
    def __call__(self, source=None, stream=False, bboxes=None, points=None, labels=None, **kwargs) -> List[Results]:
        """Callable interface for predict method."""

Usage Examples:

from ultralytics import SAM

# Initialize SAM model
model = SAM("sam_b.pt")  # or sam_l.pt, sam_h.pt for larger models

# Segment with bounding box prompts
results = model("image.jpg", bboxes=[[100, 100, 200, 200]])

# Segment with point prompts
results = model("image.jpg", points=[[150, 150]], labels=[1])

# Combine prompts
results = model("image.jpg", 
                bboxes=[[100, 100, 200, 200]], 
                points=[[150, 150], [180, 180]], 
                labels=[1, 0])

# Process results
for r in results:
    masks = r.masks.data  # Segmentation masks
    r.show()            # Display results

FastSAM - Fast Segment Anything

FastSAM provides efficient segmentation with significantly faster inference than SAM while maintaining competitive accuracy.

class FastSAM:
    def __init__(self, model="FastSAM-x.pt"):
        """
        Initialize FastSAM model.
        
        Parameters:
        - model (str | Path): Path to FastSAM model file
        """
    
    def predict(self, source, **kwargs) -> List[Results]:
        """
        Perform fast segmentation.
        
        Parameters:
        - source: Input image source  
        - **kwargs: Additional arguments
        
        Returns:
        List[Results]: Segmentation results
        """

Usage Examples:

from ultralytics import FastSAM

# Initialize FastSAM model
model = FastSAM("FastSAM-x.pt")  # or FastSAM-s.pt for smaller/faster model

# Perform segmentation
results = model("image.jpg")

# With custom parameters
results = model("image.jpg", conf=0.4, iou=0.9)

# Process results
for r in results:
    masks = r.masks.data  # Segmentation masks
    boxes = r.boxes.data  # Bounding boxes
    r.save("output.jpg")  # Save results

RTDETR - Real-Time Detection Transformer

Real-Time Detection Transformer provides state-of-the-art object detection using transformer architecture.

class RTDETR:
    def __init__(self, model="rtdetr-l.pt"):
        """
        Initialize RTDETR model.
        
        Parameters:
        - model (str | Path): Path to RTDETR model file
        """

Usage Examples:

from ultralytics import RTDETR

# Initialize RTDETR model
model = RTDETR("rtdetr-l.pt")  # or rtdetr-x.pt for extra large

# Perform detection
results = model("image.jpg")

# With custom parameters
results = model("image.jpg", conf=0.5, imgsz=640)

NAS - Neural Architecture Search

Neural Architecture Search models for automated architecture optimization.

class NAS:
    def __init__(self, model="yolo_nas_s.pt"):
        """
        Initialize NAS model.
        
        Parameters:
        - model (str | Path): Path to NAS model file
        """

Usage Examples:

from ultralytics import NAS

# Initialize NAS model
model = NAS("yolo_nas_s.pt")  # or yolo_nas_m.pt, yolo_nas_l.pt

# Perform detection
results = model("image.jpg")

# Training (if supported)
model.train(data="coco8.yaml", epochs=100)

Model Comparison

ModelPrimary Use CaseSpeedAccuracyPrompt Support
YOLOGeneral detection/segmentationFastHighNo
YOLOWorldOpen vocabulary detectionFastHighText classes
SAMPrecise segmentationSlowVery HighBbox/Points
FastSAMFast segmentationVery FastHighLimited
RTDETRTransformer detectionMediumVery HighNo
NASOptimized architecturesFastHighNo

Types

from typing import List, Optional, Union
from ultralytics.engine.results import Results

# Common types for specialized models
BboxPrompt = List[float]  # [x1, y1, x2, y2]
PointPrompt = List[float]  # [x, y]
ClassList = List[str]     # ["class1", "class2", ...]
LabelList = List[int]     # [1, 0, 1, ...]  (1=foreground, 0=background)

Install with Tessl CLI

npx tessl i tessl/pypi-ultralytics

docs

export-deployment.md

index.md

results-visualization.md

specialized-models.md

tracking.md

training-validation.md

utilities.md

yolo-models.md

tile.json