
tessl/pypi-nvtx

Python code annotation library for NVIDIA Tools Extension enabling performance profiling with NVIDIA Nsight Systems


Events and Process Ranges

Mark specific points in execution with instantaneous events, and create process ranges that can start in one function or thread and end in another. Together these give fine-grained profiling control in complex applications.

Capabilities

Instantaneous Events

Use nvtx.mark to create point-in-time events that appear as vertical markers in NVIDIA Nsight Systems timeline visualization.

def mark(message: Optional[str] = None, color: Union[str, int] = "blue",
         domain: Optional[str] = None, category: Optional[Union[str, int]] = None,
         payload: Optional[Union[int, float]] = None):
    """
    Mark an instantaneous event.
    
    Parameters:
    - message: Description of the event
    - color: Color for visualization (default "blue")
    - domain: Domain name (default "NVTX")
    - category: Category name or ID for grouping
    - payload: Numeric value associated with event
    """

Usage Example:

import nvtx

# batches, train_batch, validate_model, and save_model are placeholders
# for your own training code.
def training_loop():
    for epoch in range(100):
        nvtx.mark("epoch_start", color="green", payload=epoch)
        
        for batch in batches:
            train_batch(batch)
            
        # Mark significant events
        nvtx.mark("validation_checkpoint", color="yellow", payload=epoch)
        validate_model()
        
        if epoch % 10 == 0:
            nvtx.mark("model_save", color="purple", payload=epoch)
            save_model()
        
        nvtx.mark("epoch_end", color="red", payload=epoch)

Process Ranges

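Use start_range and end_range to create ranges that start in one function or thread and end in another. start_range returns a handle that identifies the range, so these ranges need not be closed in the order they were opened, which makes them more flexible than push/pop ranges.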

def start_range(message: Optional[str] = None, color: Optional[Union[str, int]] = None,
                domain: Optional[str] = None, category: Optional[Union[str, int]] = None,
                payload: Optional[Union[int, float]] = None) -> Optional[Tuple[int, int]]:
    """
    Mark the beginning of a process range.
    
    Parameters:
    - message: Description of the range
    - color: Color for visualization
    - domain: Domain name (default "NVTX")
    - category: Category name or ID for grouping
    - payload: Numeric value associated with range
    
    Returns:
    - Tuple of (range_id, domain_handle) for use with end_range
    """

def end_range(range_id: Optional[Tuple[int, int]]):
    """
    Mark the end of a process range started with start_range.
    
    Parameters:
    - range_id: Tuple returned by start_range call
    """

Usage Example:

import nvtx
import time
import threading

# Cross-function ranges
def complex_pipeline():
    # Start range that spans multiple functions
    pipeline_range = nvtx.start_range("data_pipeline", color="blue")
    
    data = load_data()
    processed = process_data(data)
    results = analyze_data(processed)
    
    # End range after all processing
    nvtx.end_range(pipeline_range)
    return results

# Cross-thread ranges
class BackgroundProcessor:
    def __init__(self):
        self.active_ranges = {}
    
    def start_background_task(self, task_id):
        range_id = nvtx.start_range(f"background_task_{task_id}", 
                                   color="orange", payload=task_id)
        self.active_ranges[task_id] = range_id
        
        # Start background thread
        thread = threading.Thread(target=self._process_task, args=(task_id,))
        thread.start()
    
    def _process_task(self, task_id):
        # Long-running background work
        time.sleep(5)
        
        # End the range from background thread
        if task_id in self.active_ranges:
            nvtx.end_range(self.active_ranges[task_id])
            del self.active_ranges[task_id]

Overlapping and Nested Ranges

Push/pop ranges must be closed in last-in, first-out order on the thread that opened them. Process ranges have no such restriction: they can overlap and can be ended in any order.

Usage Example:

import nvtx
import asyncio

async def concurrent_processing():
    # Start multiple overlapping ranges
    range1 = nvtx.start_range("async_task_1", color="blue")
    range2 = nvtx.start_range("async_task_2", color="green")
    range3 = nvtx.start_range("async_task_3", color="red")
    
    # Tasks can complete in any order
    await asyncio.gather(
        process_async_task_1(),
        process_async_task_2(), 
        process_async_task_3()
    )
    
    # All three tasks have finished once gather() returns. The ranges can
    # then be ended in any order, which push/pop ranges do not allow.
    nvtx.end_range(range2)
    nvtx.end_range(range1)
    nvtx.end_range(range3)
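
To close each range at the moment its own task finishes, rather than after the whole group, one option is to start and end the range inside the coroutine itself. The sketch below is a minimal illustration, not part of the library: traced_task, the finished list, and the sleep-based workload are hypothetical stand-ins, and the small fallback class exists only so the snippet runs where nvtx is not installed.

```python
import asyncio

try:
    import nvtx
except ImportError:
    # Assumption: no-op stand-in so the sketch runs without nvtx installed.
    class nvtx:
        @staticmethod
        def start_range(message=None, color=None, domain=None,
                        category=None, payload=None):
            return None

        @staticmethod
        def end_range(range_id):
            pass

finished = []  # records completion order, for illustration only

async def traced_task(name, delay, color):
    """Hypothetical wrapper: the range covers exactly this task."""
    range_id = nvtx.start_range(name, color=color)
    try:
        await asyncio.sleep(delay)  # stand-in for real async work
        finished.append(name)
        return name
    finally:
        # Runs when this task finishes, independent of the other tasks
        nvtx.end_range(range_id)

async def concurrent_processing():
    # The three ranges overlap in time and end in completion order
    return await asyncio.gather(
        traced_task("async_task_1", 0.10, "blue"),
        traced_task("async_task_2", 0.05, "green"),
        traced_task("async_task_3", 0.15, "red"),
    )

results = asyncio.run(concurrent_processing())
```

With these delays the ranges end in the order task 2, task 1, task 3, while gather still returns the results in argument order.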

Parameter Details

Message Parameter

  • Type: Optional[str]
  • Events: Describes the specific event or milestone
  • Ranges: Describes the operation or phase
  • Caching: Messages are cached as Registered Strings for performance
  • Naming: Use consistent naming patterns for better visualization

Color Parameter

  • Type: Optional[Union[str, int]]
  • Events: Default "blue" if not specified
  • Ranges: No default, uses domain default
  • Visualization: Choose colors that create clear visual separation in timeline
  • Consistency: Use consistent colors for similar event types

Domain Parameter

  • Type: Optional[str]
  • Default: "NVTX"
  • Cross-Process: Same domain name enables correlation across processes
  • Organization: Group related events and ranges

Category Parameter

  • Type: Optional[Union[str, int]]
  • Grouping: Categories appear as separate tracks in profiler
  • Events: Useful for grouping similar event types
  • Ranges: Useful for distinguishing different types of operations

Payload Parameter

  • Type: Optional[Union[int, float]]
  • Events: Event-specific data (iteration number, size, count)
  • Ranges: Range-specific metrics (expected duration, priority, size)
  • Analysis: Enables data-driven performance analysis
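
The parameters described above can be combined in a single call. The following is a minimal sketch under stated assumptions: load_shard, the domain name "data_loader", and the category "io" are illustrative names rather than anything defined by the library, and the fallback class is only there so the snippet runs without nvtx installed.

```python
try:
    import nvtx
except ImportError:
    # Assumption: no-op stand-in so the sketch runs without nvtx installed.
    class nvtx:
        @staticmethod
        def mark(message=None, color="blue", domain=None,
                 category=None, payload=None):
            pass

        @staticmethod
        def start_range(message=None, color=None, domain=None,
                        category=None, payload=None):
            return None

        @staticmethod
        def end_range(range_id):
            pass

def load_shard(shard_index, num_bytes):
    """Hypothetical loader annotated with a domain, category, and payload."""
    range_id = nvtx.start_range(
        message="load_shard",
        color="blue",
        domain="data_loader",  # groups all loader annotations together
        category="io",         # separates I/O from other work in this domain
        payload=num_bytes,     # numeric value attached to the range
    )
    try:
        data = bytes(num_bytes)  # stand-in for actually reading the shard
    finally:
        nvtx.end_range(range_id)

    # Instantaneous event recording which shard just became available
    nvtx.mark(
        message="shard_loaded",
        color="green",
        domain="data_loader",
        category="io",
        payload=shard_index,
    )
    return data

shard = load_shard(shard_index=3, num_bytes=1024)
```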

Return Values

start_range Return Value

  • Type: Optional[Tuple[int, int]]
  • Contents: (range_id, domain_handle) when NVTX is enabled, None when disabled
  • Usage: Pass it unchanged to the corresponding end_range call
  • Storage: Can be stored in variables, data structures, or passed between functions/threads
  • Lifetime: Valid until end_range is called

Error Handling

  • Invalid Range ID: end_range with invalid or already-ended range ID is safely ignored
  • Missing end_range: Unclosed ranges may appear as ongoing in profiler visualization
  • Thread Safety: Range IDs can be safely passed between threads
  • Exception Safety: Range IDs remain valid even if exceptions occur
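
Because a range handle stays valid when an exception is raised, one way to avoid unclosed ranges is to end the range in a finally block. The helper below is a hypothetical sketch, not a library function: process_range and risky_step are names introduced here for illustration, and the fallback class only lets the snippet run without nvtx installed.

```python
from contextlib import contextmanager

try:
    import nvtx
except ImportError:
    # Assumption: no-op stand-in so the sketch runs without nvtx installed.
    class nvtx:
        @staticmethod
        def start_range(message=None, color=None, domain=None,
                        category=None, payload=None):
            return None

        @staticmethod
        def end_range(range_id):
            pass

@contextmanager
def process_range(message, **attrs):
    """Hypothetical helper: open a start/end range and always close it."""
    range_id = nvtx.start_range(message, **attrs)
    try:
        yield range_id
    finally:
        # Reached on normal exit and when the body raises
        nvtx.end_range(range_id)

def risky_step():
    raise ValueError("simulated failure")

closed_after_error = False
try:
    with process_range("risky_step", color="red"):
        risky_step()
except ValueError:
    # By the time the exception arrives here, the range has been ended
    closed_after_error = True
```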

Performance Considerations

  • Domain API: Use Domain.start_range and Domain.end_range for better performance
  • Range Tracking: Store range IDs efficiently for high-frequency operations
  • Cleanup: Always call end_range to avoid memory leaks in profiler
  • Overhead: Process ranges have slightly more overhead than push/pop ranges
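
The Domain API suggestion above can be sketched as follows. This assumes a recent nvtx release in which nvtx.get_domain returns a Domain object exposing get_event_attributes, start_range, and end_range; check the installed version before relying on these names. The domain name "hot_path", the run_steps function, and the fallback classes are hypothetical and exist only for illustration.

```python
try:
    import nvtx
except ImportError:
    # Assumption: no-op stand-ins so the sketch runs without nvtx installed.
    class _StubDomain:
        def get_event_attributes(self, message=None, color=None,
                                 category=None, payload=None):
            return None

        def start_range(self, attributes):
            return None

        def end_range(self, range_id):
            pass

    class nvtx:
        @staticmethod
        def get_domain(name=None):
            return _StubDomain()

# Look up the domain and build the attributes once, outside the hot loop,
# so each iteration avoids repeating the lookup and attribute construction.
domain = nvtx.get_domain("hot_path")
step_attrs = domain.get_event_attributes(message="step", color="blue")

def run_steps(num_steps):
    """Hypothetical high-frequency loop annotated through the Domain API."""
    total = 0
    for i in range(num_steps):
        range_id = domain.start_range(step_attrs)
        try:
            total += i * i  # stand-in for the real per-step work
        finally:
            domain.end_range(range_id)
    return total

total = run_steps(1000)
```

Reusing one attributes object suits loops where the message and color do not change between iterations; per-iteration values would need their own attributes.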

Use Cases

Event Marking

  • Checkpoints: Save points, validation milestones
  • State Changes: Mode switches, phase transitions
  • External Events: Network requests, file I/O completion
  • Debug Points: Variable state inspection points

Process Ranges

  • Async Operations: Operations spanning multiple async calls
  • Cross-Thread Work: Work distributed across thread pools
  • Pipeline Stages: Multi-stage processing pipelines
  • Resource Lifetimes: Database connections, file handles
