tessl/pypi-simpleitk

SimpleITK is a simplified interface to the Insight Toolkit (ITK) for image registration and segmentation


Performance Patterns

Optimization techniques and performance best practices for SimpleITK.

Threading Configuration

Global Thread Settings

import SimpleITK as sitk

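# The default thread count is already detected from the hardware.
# Passing 0 may be clamped to a single thread rather than meaning "auto",
# depending on the ITK version, so confirm with the getter below if you use it.
sitk.ProcessObject.SetGlobalDefaultNumberOfThreads(0)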

# Or set explicitly
sitk.ProcessObject.SetGlobalDefaultNumberOfThreads(8)

# Check current setting
num_threads = sitk.ProcessObject.GetGlobalDefaultNumberOfThreads()
print(f"Using {num_threads} threads")

Per-Filter Threading

# Override the global setting for a specific filter
gaussian = sitk.DiscreteGaussianImageFilter()  # Avoid shadowing the built-in `filter`
gaussian.SetNumberOfThreads(4)  # Use 4 threads regardless of the global setting
result = gaussian.Execute(image)

Memory Optimization

Use Views for Read-Only Operations

import SimpleITK as sitk
import numpy as np

# INEFFICIENT: Creates copy
image = sitk.ReadImage('large_volume.nii.gz')
array = sitk.GetArrayFromImage(image)  # Copies entire image
mean = array.mean()

# EFFICIENT: No copy (the view is read-only and valid only while `image` exists)
image = sitk.ReadImage('large_volume.nii.gz')
array_view = sitk.GetArrayViewFromImage(image)  # No copy
mean = array_view.mean()

Slice-by-Slice Processing

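def memory_efficient_processing(volume_path):
    """Process a volume slice by slice to keep per-slice temporaries small.

    Note: the whole volume is still loaded and copied into `array`; only the
    intermediate results are limited to one slice at a time.
    """

    volume = sitk.ReadImage(volume_path)
    array = sitk.GetArrayFromImage(volume)  # Writable copy, indexed [z, y, x]

    # Process each slice independently
    for z in range(array.shape[0]):
        # process_slice is a user-supplied function returning an array of the same shape
        array[z, :, :] = process_slice(array[z, :, :])

    result = sitk.GetImageFromArray(array)
    result.CopyInformation(volume)  # Restore origin, spacing, and direction
    return result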
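
The summary later recommends processing large images in chunks. As a minimal sketch of the index bookkeeping this involves, the helper below splits the z-axis into slabs. The name `z_slabs` is hypothetical and not part of SimpleITK.

```python
def z_slabs(depth, slab_size):
    """Yield (start, stop) index pairs that cover range(depth) in slabs.

    Hypothetical helper for illustration; not a SimpleITK function.
    """
    if slab_size < 1:
        raise ValueError("slab_size must be at least 1")
    for start in range(0, depth, slab_size):
        # The last slab is shortened so it never runs past the volume
        yield start, min(start + slab_size, depth)

print(list(z_slabs(10, 4)))  # [(0, 4), (4, 8), (8, 10)]
```

Each pair can be used directly as `array[start:stop]` on a NumPy array indexed `[z, y, x]`.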

In-Place Operations

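# In-place operators avoid allocating a new image
image = sitk.ReadImage('input.nii', sitk.sitkFloat32)

# These modify image in-place (when possible)
image += 10
image *= 1.5

# Bitwise operators require integer pixel types, so use them on masks or label images
# (labels and mask are assumed to be UInt8 images on the same grid)
labels &= mask

# Equivalent to, but without the extra allocation:
# image = sitk.Add(image, 10)
# image = sitk.Multiply(image, 1.5)
# labels = sitk.And(labels, mask)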

Algorithm Selection

Gaussian Smoothing

import SimpleITK as sitk

# For small sigma: DiscreteGaussian is faster
image = sitk.ReadImage('input.nii')
smoothed = sitk.DiscreteGaussian(image, variance=1.0)

# For large sigma: SmoothingRecursiveGaussian is faster
smoothed = sitk.SmoothingRecursiveGaussian(image, sigma=[5.0, 5.0, 5.0])
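
One detail is easy to miss when comparing the two filters: `DiscreteGaussian` is parameterized by variance, while `SmoothingRecursiveGaussian` takes sigma, so a like-for-like comparison needs variance = sigma². A minimal sketch of the conversion follows. The helper name `sigma_to_variance` is hypothetical.

```python
def sigma_to_variance(sigma):
    """Convert per-axis sigma values to variances (variance = sigma ** 2).

    Hypothetical helper for illustration; not a SimpleITK function.
    """
    return [float(s) ** 2 for s in sigma]

print(sigma_to_variance([2.0, 2.0, 2.0]))  # [4.0, 4.0, 4.0]
```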

Convolution

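# For small kernels: use standard (spatial-domain) convolution
# sitk.Image(...) is zero-initialized, so add a constant to build a normalized box kernel
# (the kernel pixel type is assumed to match the image)
small_kernel = sitk.Image([5, 5, 5], sitk.sitkFloat32) + 1.0 / 5**3
result = sitk.Convolution(image, small_kernel)

# For large kernels: use FFT convolution
large_kernel = sitk.Image([31, 31, 31], sitk.sitkFloat32) + 1.0 / 31**3
result = sitk.FFTConvolution(image, large_kernel)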
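
A rough operation count shows why the kernel size matters. Direct convolution does one multiply-add per voxel and kernel element, while an FFT-based approach costs on the order of N log N for the padded image, largely independent of kernel size. The sketch below is a back-of-the-envelope estimate only: the function names are hypothetical, and the constant in the FFT estimate is an assumed value, not a measurement of SimpleITK.

```python
import math

def direct_conv_ops(image_size, kernel_size):
    """Multiply-adds for direct convolution: one per (voxel, kernel element) pair."""
    return math.prod(image_size) * math.prod(kernel_size)

def fft_conv_ops(image_size, kernel_size, constant=15):
    """Very rough FFT-based estimate on the zero-padded image.

    `constant` bundles the number of transforms and the per-element cost;
    it is an assumed value for illustration only.
    """
    padded = [i + k - 1 for i, k in zip(image_size, kernel_size)]
    n = math.prod(padded)
    return int(constant * n * math.log2(n))

image_size = (256, 256, 256)
for kernel_size in [(5, 5, 5), (31, 31, 31)]:
    direct = direct_conv_ops(image_size, kernel_size)
    fft = fft_conv_ops(image_size, kernel_size)
    print(kernel_size, f"direct={direct:.2e}", f"fft={fft:.2e}")
```

Under these assumptions the direct estimate is smaller for the 5³ kernel and much larger for the 31³ kernel. The real crossover depends on the implementation, padding, and threading, so it is worth benchmarking on your own data.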

Data Type Optimization

Use Appropriate Types

import SimpleITK as sitk

# INEFFICIENT: Using Float64 when Float32 suffices
image_f64 = sitk.Cast(image, sitk.sitkFloat64)  # 8 bytes per pixel

# EFFICIENT: Use Float32
image_f32 = sitk.Cast(image, sitk.sitkFloat32)  # 4 bytes per pixel

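# For binary masks: UInt8 is enough (1 byte per pixel)
mask_u8 = sitk.BinaryThreshold(image, ...)  # Output is already UInt8 by default
# Cast only masks that arrive in a wider type, e.g. a Float32 mask from another tool:
# mask_u8 = sitk.Cast(mask_f32, sitk.sitkUInt8)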
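
To make the per-pixel sizes concrete, the sketch below estimates the raw pixel-buffer size of a scalar volume for a few pixel types. The helper `image_megabytes` and the lookup table are hypothetical. The estimate ignores metadata and any temporary buffers a filter allocates.

```python
import math

# Bytes per pixel for a few common scalar pixel types (standard sizes)
BYTES_PER_PIXEL = {"UInt8": 1, "Int16": 2, "Float32": 4, "Float64": 8}

def image_megabytes(size, pixel_type):
    """Raw pixel-buffer size in MiB for a scalar image of the given size.

    Hypothetical helper for illustration; not a SimpleITK function.
    """
    return math.prod(size) * BYTES_PER_PIXEL[pixel_type] / 2**20

for pixel_type in BYTES_PER_PIXEL:
    print(f"{pixel_type:8s} {image_megabytes((256, 256, 256), pixel_type):6.1f} MiB")
```

For a 256³ volume this gives 16 MiB for UInt8, 64 MiB for Float32, and 128 MiB for Float64.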

Minimize Type Conversions

# INEFFICIENT: Multiple conversions
image = sitk.ReadImage('input.png')  # UInt8
image = sitk.Cast(image, sitk.sitkFloat32)
result = sitk.SmoothingRecursiveGaussian(image, sigma=[1.0, 1.0])
result = sitk.Cast(result, sitk.sitkUInt8)
result = sitk.Cast(result, sitk.sitkFloat32)  # Unnecessary!

# EFFICIENT: Plan type conversions
image = sitk.ReadImage('input.png')
image = sitk.Cast(image, sitk.sitkFloat32)
result = sitk.SmoothingRecursiveGaussian(image, sigma=[1.0, 1.0])
# Keep as Float32 if more processing needed

Registration Performance

Multi-Resolution Strategy

def fast_registration(fixed, moving):
    """Fast registration using multi-resolution pyramid."""
    
    registration = sitk.ImageRegistrationMethod()
    
    # Use aggressive downsampling
    registration.SetShrinkFactorsPerLevel([8, 4, 2, 1])
    registration.SetSmoothingSigmasPerLevel([4.0, 2.0, 1.0, 0.0])
    
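    # Sample only 1% of pixels (the percentage has no effect with the default NONE strategy)
    registration.SetMetricSamplingStrategy(registration.RANDOM)
    registration.SetMetricSamplingPercentage(0.01)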
    
    # Use fast interpolator
    registration.SetInterpolator(sitk.sitkLinear)  # Faster than BSpline
    
    # Rest of configuration (metric, optimizer, initial transform) goes here...
    return registration.Execute(fixed, moving)
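
To see how much work each pyramid level involves, the sketch below estimates the image size and the number of sampled voxels per level for the shrink factors and sampling percentage used above. It is an approximation under stated assumptions: the helper name `pyramid_summary` is hypothetical, integer division stands in for ITK's own shrink rounding, and the sample count is assumed to be a fixed percentage of each level's voxels.

```python
import math

def pyramid_summary(size, shrink_factors, sampling_percentage):
    """Approximate image size and sampled-voxel count per pyramid level.

    Hypothetical helper for illustration; not a SimpleITK function.
    """
    levels = []
    for factor in shrink_factors:
        # Integer division approximates the shrunk size along each axis
        level_size = [max(1, s // factor) for s in size]
        voxels = math.prod(level_size)
        samples = int(voxels * sampling_percentage)
        levels.append((factor, level_size, voxels, samples))
    return levels

for factor, level_size, voxels, samples in pyramid_summary(
    (256, 256, 256), [8, 4, 2, 1], 0.01
):
    print(f"shrink {factor}: size={level_size} voxels={voxels} samples={samples}")
```

The coarsest level has 512 times fewer voxels than the full-resolution level, which is why most of the optimizer's iterations are cheap when they happen early in the pyramid.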

Sampling Strategy

# SLOW: Use all pixels
registration.SetMetricSamplingStrategy(registration.NONE)

# FAST: Random sampling
registration.SetMetricSamplingStrategy(registration.RANDOM)
registration.SetMetricSamplingPercentage(0.01)  # 1% of pixels

# BALANCED: Regular sampling
registration.SetMetricSamplingStrategy(registration.REGULAR)
registration.SetMetricSamplingPercentage(0.05)  # 5% of pixels

Filter Optimization

Kernel Size Selection

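# Morphological operations: smaller kernels are usually faster
# SLOW: Large kernel
result = sitk.BinaryErode(image, kernelRadius=[10, 10, 10], kernelType=sitk.sitkBox)

# FAST: Iterative small kernels (equivalent for box kernels; only approximate
# for the default ball kernel, so compare results before relying on it)
result = image
for _ in range(5):
    result = sitk.BinaryErode(result, kernelRadius=[2, 2, 2], kernelType=sitk.sitkBox)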
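
The arithmetic behind this trade-off can be sketched as follows, assuming box-shaped kernels and a naive implementation that visits every kernel element per voxel. Both helper names are hypothetical. SimpleITK's binary morphology filters use optimized algorithms, so the actual speedup will differ from this count.

```python
def box_kernel_cost(radius, dims=3):
    """Neighbourhood size of a box kernel: (2r + 1) ** dims elements."""
    return (2 * radius + 1) ** dims

def iterations_for_radius(target_radius, step_radius):
    """Erosions with a box of step_radius needed to match one box of target_radius."""
    if target_radius % step_radius:
        raise ValueError("target_radius must be a multiple of step_radius")
    return target_radius // step_radius

n = iterations_for_radius(10, 2)
single = box_kernel_cost(10)       # One pass with radius 10
iterated = n * box_kernel_cost(2)  # n passes with radius 2
print(n, single, iterated)  # 5 9261 625
```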

Avoid Redundant Operations

# INEFFICIENT: Recompute statistics multiple times
stats1 = sitk.StatisticsImageFilter()
stats1.Execute(image)
mean = stats1.GetMean()

stats2 = sitk.StatisticsImageFilter()
stats2.Execute(image)
std = stats2.GetSigma()

# EFFICIENT: Compute once
stats = sitk.StatisticsImageFilter()
stats.Execute(image)
mean = stats.GetMean()
std = stats.GetSigma()
variance = stats.GetVariance()
min_val = stats.GetMinimum()
max_val = stats.GetMaximum()

Parallel Processing

Process Multiple Images in Parallel

import SimpleITK as sitk
from concurrent.futures import ProcessPoolExecutor
import os

def process_single_image(input_path, output_path):
    """Process single image (will run in separate process)."""
    image = sitk.ReadImage(input_path)
    result = sitk.SmoothingRecursiveGaussian(image, sigma=[1.0, 1.0, 1.0])
    sitk.WriteImage(result, output_path)
    return output_path

def parallel_batch_processing(input_dir, output_dir, max_workers=4):
    """Process multiple images in parallel."""
    
    os.makedirs(output_dir, exist_ok=True)
    
    # Get all input files
    input_files = [
        f for f in os.listdir(input_dir)
        if f.endswith(('.nii', '.nii.gz', '.mha'))
    ]
    
    # Create task list
    tasks = [
        (os.path.join(input_dir, f), os.path.join(output_dir, f))
        for f in input_files
    ]
    
    # Process in parallel
    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        futures = [
            executor.submit(process_single_image, inp, out)
            for inp, out in tasks
        ]
        
        for future in futures:
            result = future.result()
            print(f"Completed: {result}")
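
When several worker processes each run multithreaded filters, the total thread count can exceed the number of cores. One possible way to budget threads is sketched below. The helper `threads_per_worker` is hypothetical, and an even split is only one reasonable policy.

```python
import os

def threads_per_worker(max_workers, total_cores=None):
    """Split the available cores evenly across worker processes (at least 1 each).

    Hypothetical helper for illustration; not a SimpleITK function.
    """
    if max_workers < 1:
        raise ValueError("max_workers must be at least 1")
    if total_cores is None:
        total_cores = os.cpu_count() or 1
    return max(1, total_cores // max_workers)

print(threads_per_worker(4, total_cores=8))   # 2
print(threads_per_worker(16, total_cores=8))  # 1
```

The resulting value could be passed to `sitk.ProcessObject.SetGlobalDefaultNumberOfThreads()` at the start of each worker function.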

Caching and Reuse

Cache Computed Results

import SimpleITK as sitk

class ImageProcessor:
    """Processor with result caching."""
    
    def __init__(self):
        self._gradient_cache = {}
    
    def get_gradient(self, image):
        """Get gradient with caching."""
        
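        # Key on pixel data plus spacing: sitk.Hash covers only the pixel buffer,
        # while GradientMagnitude also depends on the image spacing by default
        image_hash = (sitk.Hash(image), image.GetSpacing())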
        
        if image_hash not in self._gradient_cache:
            print("Computing gradient...")
            gradient = sitk.GradientMagnitude(image)
            self._gradient_cache[image_hash] = gradient
        else:
            print("Using cached gradient")
        
        return self._gradient_cache[image_hash]
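
A dictionary cache like the one above grows without limit, which matters when each entry is a full volume. A minimal sketch of a bounded, least-recently-used variant follows. The class `BoundedCache` is hypothetical and uses only the standard library. The cached values here are plain integers standing in for images.

```python
from collections import OrderedDict

class BoundedCache:
    """Least-recently-used cache holding at most `max_items` results.

    Hypothetical helper for illustration; not a SimpleITK class.
    """

    def __init__(self, max_items=4):
        self.max_items = max_items
        self._items = OrderedDict()

    def get_or_compute(self, key, compute):
        """Return the cached value for `key`, computing and storing it if absent."""
        if key in self._items:
            self._items.move_to_end(key)  # Mark as most recently used
            return self._items[key]
        value = compute()
        self._items[key] = value
        if len(self._items) > self.max_items:
            self._items.popitem(last=False)  # Evict the least recently used entry
        return value

    def __len__(self):
        return len(self._items)

cache = BoundedCache(max_items=2)
cache.get_or_compute("a", lambda: 1)
cache.get_or_compute("b", lambda: 2)
cache.get_or_compute("a", lambda: 1)  # "a" becomes the most recent entry
cache.get_or_compute("c", lambda: 3)  # Evicts "b", the least recently used
print(len(cache))  # 2
```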

Profiling and Benchmarking

Measure Filter Performance

import SimpleITK as sitk
import time

def benchmark_filter(filter_func, image, *args, **kwargs):
    """Benchmark filter execution time."""
    
    # Warm-up run
    _ = filter_func(image, *args, **kwargs)
    
    # Timed runs
    times = []
    for _ in range(5):
        start = time.perf_counter()  # Monotonic, high-resolution timer
        result = filter_func(image, *args, **kwargs)
        elapsed = time.perf_counter() - start
        times.append(elapsed)
    
    avg_time = sum(times) / len(times)
    print(f"Average time: {avg_time:.3f}s")
    
    return result, avg_time

# Example
image = sitk.ReadImage('test.nii')
result, time_taken = benchmark_filter(
    sitk.SmoothingRecursiveGaussian,
    image,
    sigma=[2.0, 2.0, 2.0]
)
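
An average can be skewed by a single slow run, for example when another process briefly competes for the CPU. As an alternative sketch, the helper below returns the individual run times so that the minimum or median can be reported instead. The name `time_callable` is hypothetical, and `sorted` is used only as a stand-in workload so the example runs without an image file.

```python
import statistics
import time

def time_callable(func, *args, repeats=5, warmup=1, **kwargs):
    """Return per-run wall-clock times in seconds for func(*args, **kwargs).

    Hypothetical helper for illustration; not a SimpleITK function.
    """
    for _ in range(warmup):
        func(*args, **kwargs)  # Warm-up runs are not timed
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args, **kwargs)
        times.append(time.perf_counter() - start)
    return times

# Stand-in workload: sort a reversed list of 100,000 integers
times = time_callable(sorted, list(range(100_000, 0, -1)), repeats=5)
print(f"min={min(times):.4f}s median={statistics.median(times):.4f}s")
```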

Compare Algorithm Performance

import SimpleITK as sitk
import time

def compare_smoothing_methods(image):
    """Compare performance of different smoothing methods."""
    
    methods = [
        ('RecursiveGaussian', lambda img: sitk.SmoothingRecursiveGaussian(img, sigma=[2.0]*3)),
        ('DiscreteGaussian', lambda img: sitk.DiscreteGaussian(img, variance=4.0)),
        ('Median', lambda img: sitk.Median(img, radius=[2]*3)),
        ('Bilateral', lambda img: sitk.Bilateral(img, domainSigma=2.0, rangeSigma=50.0))
    ]
    
    results = {}
    for name, method in methods:
        start = time.perf_counter()  # Monotonic, high-resolution timer
        result = method(image)
        elapsed = time.perf_counter() - start
        results[name] = elapsed
        print(f"{name}: {elapsed:.3f}s")
    
    return results

Memory Profiling

Monitor Memory Usage

import SimpleITK as sitk
import psutil
import os

def get_memory_usage():
    """Get current process memory usage in MB."""
    process = psutil.Process(os.getpid())
    return process.memory_info().rss / 1024 / 1024

def profile_memory(operation, *args, **kwargs):
    """Profile memory usage of operation."""
    
    mem_before = get_memory_usage()
    print(f"Memory before: {mem_before:.1f} MB")
    
    result = operation(*args, **kwargs)
    
    mem_after = get_memory_usage()
    print(f"Memory after: {mem_after:.1f} MB")
    print(f"Memory delta: {mem_after - mem_before:.1f} MB")
    
    return result

# Example
image = sitk.ReadImage('large.nii')
smoothed = profile_memory(
    sitk.SmoothingRecursiveGaussian,
    image,
    sigma=[2.0, 2.0, 2.0]
)

Best Practices Summary

DO:

  • ✅ Use GetArrayViewFromImage() for read-only operations
  • ✅ Use SmoothingRecursiveGaussian() for large sigma
  • ✅ Use FFT convolution for large kernels
  • ✅ Use multi-resolution for registration
  • ✅ Sample pixels for registration metrics
  • ✅ Use appropriate data types (UInt8 vs Float32)
  • ✅ Process large images in chunks
  • ✅ Set thread count based on hardware
  • ✅ Cache expensive computations

DON'T:

  • ❌ Use GetArrayFromImage() for read-only operations
  • ❌ Use DiscreteGaussian() for large sigma
  • ❌ Process entire large images at once
  • ❌ Use Float64 when Float32 suffices
  • ❌ Use all pixels for registration metrics
  • ❌ Recompute the same results multiple times
  • ❌ Use single-resolution for large images
  • ❌ Forget to configure threading

Performance Checklist

Before Processing

  • Check image size and memory requirements
  • Select appropriate data type
  • Configure threading for hardware
  • Plan memory usage (copy vs view)

During Processing

  • Use efficient algorithms for task
  • Monitor progress for long operations
  • Process in chunks if memory-constrained
  • Cache reusable results

After Processing

  • Clean up temporary arrays
  • Verify results before saving
  • Use compression for output files

Benchmarking Results

Typical Performance (256³ volume, 8 threads)

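| Operation | Time | Memory |
|---|---|---|
| Read NIfTI | 0.1 s | 128 MB |
| Gaussian σ=1.0 | 0.2 s | 256 MB |
| Gaussian σ=5.0 | 0.5 s | 256 MB |
| Median r=2 | 2.0 s | 256 MB |
| Bilateral | 5.0 s | 512 MB |
| Rigid Registration | 10 s | 512 MB |
| Affine Registration | 20 s | 512 MB |
| Demons (50 iter) | 30 s | 1 GB |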

Note: Times are approximate and hardware-dependent

See Also

  • Quick Start Guide - Basic usage
  • Real-World Scenarios - Complete examples
  • Edge Cases - Handling special cases
  • Performance Reference - Detailed performance documentation

