
tessl/pypi-simpleitk

SimpleITK is a simplified interface to the Insight Toolkit (ITK) for image registration and segmentation


Performance Reference

Complete performance optimization guide for SimpleITK.

Threading

Global Configuration

class ProcessObject:
    # Static methods: call as sitk.ProcessObject.<name>(...)

    @staticmethod
    def SetGlobalDefaultNumberOfThreads(n: int):
        """
        Set default thread count for all filters.
        
        Args:
            n: Number of threads (0 = auto-detect from hardware)
        """

    @staticmethod
    def GetGlobalDefaultNumberOfThreads() -> int:
        """Get current global thread count."""

    @staticmethod
    def SetGlobalDefaultThreader(threader: str):
        """
        Set threading backend.
        
        Args:
            threader: 'POOL', 'TBB', or 'PLATFORM'
        """

    @staticmethod
    def GetGlobalDefaultThreader() -> str:
        """Get current threading backend."""

Per-Filter Threading

class ProcessObject:
    def SetNumberOfThreads(self, threads: int) -> None:
        """Override global thread count for this filter."""
    
    def GetNumberOfThreads(self) -> int:
        """Get thread count for this filter."""

Threading Best Practices

  • Auto-detect: Set to 0 for automatic hardware detection
  • Hyper-threading: Use physical cores, not logical (typically half the logical core count)
  • Memory-bound: Reduce threads if memory-limited
  • I/O-bound: Threading provides minimal benefit
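
One way to turn these rules of thumb into a number is sketched below. The helper name suggest_thread_count and the halving factors are illustrative assumptions, not part of SimpleITK; the standard library reports logical cores only, so the physical count is estimated.

```python
import os


def suggest_thread_count(hyperthreaded=True, memory_limited=False):
    """Suggest a thread count from the logical core count (heuristic)."""
    logical = os.cpu_count() or 1
    # Assume two logical cores per physical core when hyper-threading is on.
    threads = max(1, logical // 2) if hyperthreaded else logical
    # Arbitrary illustrative choice: halve again when memory is the bottleneck.
    if memory_limited:
        threads = max(1, threads // 2)
    return threads


print(f"Suggested threads: {suggest_thread_count()}")
print(f"Suggested threads (memory-limited): {suggest_thread_count(memory_limited=True)}")
```

The result can then be passed to the global or per-filter thread setters described above.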

Memory Optimization

Array Conversion Strategies

# Read-only: Use view (no copy); keep `image` alive while the view is in use
array_view = sitk.GetArrayViewFromImage(image)
mean = array_view.mean()  # Fast, no memory overhead

# Modification: Use copy
array = sitk.GetArrayFromImage(image)
array = array * 2.0  # Safe, independent copy; in-place `*= 2.0` fails on integer arrays

Memory Footprint Estimation

def estimate_memory_mb(image):
    """Estimate image memory usage in MB."""
    
    num_pixels = image.GetNumberOfPixels()
    bytes_per_pixel = image.GetSizeOfPixelComponent()
    num_components = image.GetNumberOfComponentsPerPixel()
    
    bytes_total = num_pixels * bytes_per_pixel * num_components
    mb = bytes_total / (1024 * 1024)
    
    return mb

# Example
image = sitk.ReadImage('volume.nii')
memory_mb = estimate_memory_mb(image)
print(f"Image requires ~{memory_mb:.1f} MB")
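
The same arithmetic can be checked without loading an image. The sketch below uses a hypothetical 512 x 512 x 300 volume; the helper name estimate_memory_mib is an assumption made for this example.

```python
def estimate_memory_mib(size, bytes_per_component, components=1):
    """Return the pixel-buffer size in MiB for an image of the given size."""
    num_pixels = 1
    for extent in size:
        num_pixels *= extent
    return num_pixels * bytes_per_component * components / (1024 * 1024)


# Hypothetical volume stored with two different pixel types.
as_float32 = estimate_memory_mib((512, 512, 300), bytes_per_component=4)
as_uint8 = estimate_memory_mib((512, 512, 300), bytes_per_component=1)

print(f"float32: {as_float32:.0f} MiB, uint8: {as_uint8:.0f} MiB")
```

This counts only the pixel buffer; filters usually allocate at least one output image of comparable size on top of it.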

Chunked Processing

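def process_in_chunks(image, chunk_size=10):
    """Process a 3D image in z-slabs to bound the size of temporaries.

    GetArrayFromImage still copies the whole image, so this limits the
    per-step working arrays, not the total footprint.
    """
    
    array = sitk.GetArrayFromImage(image)
    depth = array.shape[0]  # NumPy axis order is (z, y, x)
    
    for z_start in range(0, depth, chunk_size):
        z_end = min(z_start + chunk_size, depth)
        chunk = array[z_start:z_end, :, :]
        
        # Process chunk (the result is cast back to array's dtype on assignment)
        chunk_processed = chunk * 2.0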
        array[z_start:z_end, :, :] = chunk_processed
    
    result = sitk.GetImageFromArray(array)
    result.CopyInformation(image)
    return result

Algorithm Complexity

Filter Complexity

| Filter | Complexity | Notes |
|---|---|---|
| Add, Subtract, Multiply | O(n) | Linear in pixels |
| Gaussian (recursive) | O(n) | Independent of sigma |
| Gaussian (discrete) | O(n·k³) | k = kernel size |
| Median | O(n·k³·log(k)) | k = radius |
| Bilateral | O(n·k³) | Expensive |
| FFT | O(n·log(n)) | n = total pixels |
| Connected Components | O(n·α(n)) | α = inverse Ackermann |
| Distance Transform | O(n) | Linear time |
| Watershed | O(n·log(n)) | Priority queue |
| Registration | O(iter·n·m) | iter = iterations, m = metric cost |
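
To get a feel for the k³ term in the table, the sketch below compares approximate operation counts for two kernel widths on a hypothetical 256³ volume. The function name and the sizes are illustrative assumptions, and the counts ignore constant factors.

```python
def neighborhood_ops(num_pixels, kernel_width, dimension=3):
    """Approximate operation count for a dense k^d neighborhood filter."""
    return num_pixels * kernel_width ** dimension


n = 256 ** 3  # hypothetical volume
small = neighborhood_ops(n, kernel_width=3)
large = neighborhood_ops(n, kernel_width=11)
ratio = large / small

print(f"k=3: {small:.2e} ops, k=11: {large:.2e} ops, ratio: {ratio:.1f}x")
```

Going from a width of 3 to 11 multiplies the work by roughly 49, which is why kernel size dominates the cost of neighborhood filters.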

Choosing Efficient Algorithms

# Small sigma: Use discrete Gaussian
if sigma < 2.0:
    result = sitk.DiscreteGaussian(image, variance=sigma**2)
else:
    # Large sigma: Use recursive Gaussian (faster)
    result = sitk.SmoothingRecursiveGaussian(image, sigma=[sigma]*3)

# Small kernel: Direct convolution
if kernel_size < 11:
    result = sitk.Convolution(image, kernel)
else:
    # Large kernel: FFT convolution (faster)
    result = sitk.FFTConvolution(image, kernel)

Registration Performance

Sampling Strategies

# Fast: Sample 1% of pixels
registration.SetMetricSamplingStrategy(registration.RANDOM)
registration.SetMetricSamplingPercentage(0.01)

# Balanced: Sample 5%
registration.SetMetricSamplingPercentage(0.05)

# Accurate: Sample 20%
registration.SetMetricSamplingPercentage(0.20)

# Slow: Use all pixels
registration.SetMetricSamplingStrategy(registration.NONE)

Multi-Resolution Speedup

# Single resolution: Slow
registration.SetShrinkFactorsPerLevel([1])
registration.SetSmoothingSigmasPerLevel([0.0])

# Multi-resolution: Fast (typically 5-10x faster)
registration.SetShrinkFactorsPerLevel([8, 4, 2, 1])
registration.SetSmoothingSigmasPerLevel([4.0, 2.0, 1.0, 0.0])
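
The speedup comes from how few voxels the coarse levels contain. A small sketch of the arithmetic for the shrink factors above, assuming a 3D image and isotropic shrinking (the helper name is made up for this example):

```python
def voxel_fraction(shrink_factor, dimension=3):
    """Fraction of the original voxels left after isotropic shrinking."""
    return 1.0 / shrink_factor ** dimension


shrink_factors = [8, 4, 2, 1]
fractions = [voxel_fraction(f) for f in shrink_factors]

for factor, fraction in zip(shrink_factors, fractions):
    print(f"shrink {factor}: {fraction:.4%} of the voxels")
```

Most of the parameter search happens on the coarse levels, so the full-resolution level typically starts close to the optimum.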

Optimizer Selection

| Optimizer | Speed | Accuracy | Use Case |
|---|---|---|---|
| Gradient Descent | Fast | Good | General purpose |
| L-BFGS-B | Medium | Excellent | Many parameters |
| Powell | Slow | Good | Derivative-free |
| Amoeba | Very Slow | Fair | Few parameters |
| Exhaustive | Extremely Slow | Exact | Initialization |

I/O Performance

Compression Trade-offs

# No compression: Fast write, large files
sitk.WriteImage(image, 'output.nii', useCompression=False)

# Light compression: Balanced
sitk.WriteImage(image, 'output.nii.gz', useCompression=True, compressionLevel=1)

# Heavy compression: Slow write, small files
sitk.WriteImage(image, 'output.nii.gz', useCompression=True, compressionLevel=9)

Format Performance

| Format | Read Speed | Write Speed | File Size | Notes |
|---|---|---|---|---|
| MetaImage (.mha) | Fast | Fast | Large | Uncompressed |
| NIfTI (.nii) | Fast | Fast | Large | Uncompressed |
| NIfTI (.nii.gz) | Medium | Slow | Small | Compressed |
| NRRD (.nrrd) | Fast | Fast | Medium | Optional compression |
| PNG | Fast | Fast | Small | 2D only, lossless |
| DICOM | Slow | Slow | Medium | Metadata overhead |

Profiling Tools

Built-in Progress Monitoring

class ProcessObject:
    def GetProgress(self) -> float:
        """Get execution progress (0.0 to 1.0)."""
    
    def AddCommand(self, event: int, callback):
        """Add callback for progress monitoring."""

Timing Measurements

import SimpleITK as sitk
import time

def time_operation(operation, *args, **kwargs):
    """Time a single call using a monotonic, high-resolution clock."""
    start = time.perf_counter()
    result = operation(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"Operation took {elapsed:.3f}s")
    return result, elapsed
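
A single measurement is noisy, so it often helps to repeat the call and look at the spread. The helper below is a sketch; its name and the returned keys are assumptions made for this example, and the timed workload is a stand-in for a SimpleITK call.

```python
import statistics
import time


def time_repeated(operation, repeats=5):
    """Run a zero-argument callable several times and summarize the timings."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - start)
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


# Stand-in workload; replace with e.g. lambda: sitk.Median(image, [2] * 3).
stats = time_repeated(lambda: sum(range(100_000)))
print(f"min {stats['min']:.4f}s, median {stats['median']:.4f}s, max {stats['max']:.4f}s")
```

The minimum is usually the most stable figure for comparing two implementations, while the median reflects typical behavior.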

Optimization Strategies

Strategy 1: Reduce Resolution

# Process at lower resolution, then upsample
# (expensive_filter is a placeholder for any costly operation, not a SimpleITK function)
shrunk = sitk.Shrink(image, shrinkFactors=[2, 2, 2])
processed = expensive_filter(shrunk)
result = sitk.Expand(processed, expandFactors=[2, 2, 2])

Strategy 2: Region of Interest

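# Process only ROI instead of entire image
# (expensive_filter is a placeholder for any costly operation, not a SimpleITK function)
roi = sitk.RegionOfInterest(image, size=(100, 100, 50), index=(50, 50, 25))
processed_roi = expensive_filter(roi)
# Paste back if needed (for example with sitk.Paste)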

Strategy 3: Appropriate Data Types

# Use smallest appropriate type
# UInt8: 1 byte per pixel
# UInt16: 2 bytes per pixel
# Float32: 4 bytes per pixel
# Float64: 8 bytes per pixel

# For binary masks: UInt8
mask = sitk.Cast(binary_result, sitk.sitkUInt8)

# For normalized images: Float32 (not Float64)
normalized = sitk.Cast(image, sitk.sitkFloat32)

Strategy 4: Streaming

# Some filters support streaming (process in chunks)
# Automatically used when available
# Reduces peak memory usage

Benchmarking

Standard Benchmarks

import SimpleITK as sitk
import time
import numpy as np

def benchmark_suite(image):
    """Run standard benchmark suite."""
    
    benchmarks = {
        'Read': lambda: sitk.ReadImage('test.nii'),
        'Gaussian': lambda: sitk.SmoothingRecursiveGaussian(image, sigma=[1.0]*3),
        'Median': lambda: sitk.Median(image, radius=[2]*3),
        'Threshold': lambda: sitk.BinaryThreshold(image, 100, 200),
        'Morphology': lambda: sitk.BinaryErode(image, kernelRadius=[2]*3),
        'Statistics': lambda: sitk.StatisticsImageFilter().Execute(image),
    }
    
    results = {}
    for name, operation in benchmarks.items():
        times = []
        for _ in range(3):
            # perf_counter is monotonic and better suited to short intervals
            start = time.perf_counter()
            operation()
            elapsed = time.perf_counter() - start
            times.append(elapsed)
        
        avg_time = sum(times) / len(times)
        results[name] = avg_time
        print(f"{name}: {avg_time:.3f}s")
    
    return results

Memory Profiling

Track Memory Usage

import SimpleITK as sitk
import tracemalloc

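def profile_memory_usage(operation, *args, **kwargs):
    """Profile Python-level memory usage of operation.

    tracemalloc only sees allocations made through Python's allocators
    (including NumPy arrays); pixel buffers allocated by SimpleITK's C++
    layer are not counted.
    """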
    
    tracemalloc.start()
    
    result = operation(*args, **kwargs)
    
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    
    print(f"Current memory: {current / 1024 / 1024:.1f} MB")
    print(f"Peak memory: {peak / 1024 / 1024:.1f} MB")
    
    return result
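
To include native allocations, the process-level peak resident set size can be read instead. This sketch uses the standard library resource module, which is available on Unix-like systems only; the helper names are assumptions made for this example.

```python
import resource
import sys


def peak_rss_mb():
    """Return the peak resident set size of this process in MB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports kilobytes; macOS reports bytes.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def run_with_peak_rss(operation, *args, **kwargs):
    """Run an operation and report the peak RSS before and after it."""
    before = peak_rss_mb()
    result = operation(*args, **kwargs)
    after = peak_rss_mb()
    return result, before, after


# Stand-in workload; replace with a SimpleITK call.
result, before, after = run_with_peak_rss(bytearray, 10 * 1024 * 1024)
print(f"Peak RSS before: {before:.1f} MB, after: {after:.1f} MB")
```

The value is a high-water mark for the whole process, so the difference shows only how much the operation raised the peak, not how much it allocated in total.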

Performance Tuning Checklist

Before Processing

  • Estimate memory requirements
  • Configure thread count
  • Select appropriate data type
  • Consider downsampling if appropriate

During Processing

  • Use efficient algorithms
  • Monitor progress for long operations
  • Process in chunks if memory-limited
  • Cache reusable computations

After Processing

  • Profile bottlenecks
  • Optimize critical sections
  • Consider parallelization
  • Validate performance improvements

Hardware Considerations

CPU

  • More cores = better parallelization
  • Cache size affects filter performance
  • SIMD instructions used automatically

Memory

  • Minimum: 2x image size
  • Recommended: 4x image size
  • Registration: 8x image size
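
These multipliers can be turned into a quick feasibility check. The sketch below uses hypothetical numbers; the function name and the workload labels are assumptions that mirror the list above.

```python
# Multipliers taken from the memory guidance above.
MEMORY_MULTIPLIERS = {"minimum": 2, "recommended": 4, "registration": 8}


def fits_in_memory(image_mb, available_mb, workload="recommended"):
    """Check whether a workload's estimated memory need fits in RAM."""
    required_mb = image_mb * MEMORY_MULTIPLIERS[workload]
    return required_mb <= available_mb, required_mb


# Hypothetical 300 MB image on a machine with 2048 MB available.
ok_filtering, need_filtering = fits_in_memory(300, 2048, "recommended")
ok_registration, need_registration = fits_in_memory(300, 2048, "registration")

print(f"filtering: needs {need_filtering} MB, fits: {ok_filtering}")
print(f"registration: needs {need_registration} MB, fits: {ok_registration}")
```

When the check fails, the resolution, region-of-interest, and data-type strategies described earlier are the usual ways to bring the requirement down.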

Storage

  • SSD recommended for I/O-heavy workflows
  • Compression reduces I/O time for slow storage
  • Network storage may bottleneck

See Also

  • Architecture Reference - System architecture
  • Performance Patterns - Optimization examples
  • Quick Start Guide - Getting started
