Device Discovery

Functions for discovering and identifying AMD processors (GPUs and CPUs). These functions allow you to enumerate devices, get device handles, query processor types, and retrieve unique identifiers like BDF addresses and UUIDs.

Capabilities

Get Processor Handles

Retrieve all processor handles across all sockets in the system.

def amdsmi_get_processor_handles() -> List[processor_handle]:
    """
    Get a list of all processor handles in the system.

    This function discovers all processors across all sockets and returns their handles.
    It internally calls amdsmi_get_socket_handles() to enumerate sockets, then retrieves
    all processors for each socket. The returned handles can represent GPUs, CPUs, or APUs
    depending on initialization flags.

    Returns:
    - List[processor_handle]: List of processor handles for all discovered devices.
      Empty list if no processors are found.

    Raises:
    - AmdSmiLibraryException: If the library is not initialized or discovery fails

    Example:
    ```python
    import amdsmi

    amdsmi.amdsmi_init()
    try:
        # Get all processor handles
        processors = amdsmi.amdsmi_get_processor_handles()
        print(f"Found {len(processors)} processors")

        for i, processor in enumerate(processors):
            proc_type = amdsmi.amdsmi_get_processor_type(processor)
            print(f"Processor {i}: {proc_type['processor_type']}")
    finally:
        amdsmi.amdsmi_shut_down()
    ```
    """

Get Socket Handles

Retrieve handles for all physical sockets in the system.

def amdsmi_get_socket_handles() -> List[processor_handle]:
    """
    Get a list of all socket handles in the system.

    A socket represents a physical CPU or GPU socket on the motherboard. This function
    returns handles for all detected sockets, which can then be used to query processors
    within each socket.

    Returns:
    - List[processor_handle]: List of socket handles. Empty list if no sockets are found.

    Raises:
    - AmdSmiLibraryException: If the library is not initialized or socket discovery fails

    Example:
    ```python
    import amdsmi

    amdsmi.amdsmi_init()
    try:
        # Get all socket handles
        sockets = amdsmi.amdsmi_get_socket_handles()
        print(f"Found {len(sockets)} sockets")

        for i, socket in enumerate(sockets):
            socket_info = amdsmi.amdsmi_get_socket_info(socket)
            print(f"Socket {i}: {socket_info}")
    finally:
        amdsmi.amdsmi_shut_down()
    ```
    """

Get Socket Information

Retrieve descriptive information about a socket.

def amdsmi_get_socket_info(socket_handle: processor_handle) -> str:
    """
    Get information string for a socket.

    Retrieves a descriptive string containing information about the specified socket,
    such as socket name or identifier.

    Parameters:
    - socket_handle (processor_handle): Handle for the socket to query

    Returns:
    - str: String containing socket information (e.g., socket name or identifier)

    Raises:
    - AmdSmiParameterException: If socket_handle is not valid
    - AmdSmiLibraryException: If unable to retrieve socket information

    Example:
    ```python
    import amdsmi

    amdsmi.amdsmi_init()
    try:
        sockets = amdsmi.amdsmi_get_socket_handles()
        for socket in sockets:
            info = amdsmi.amdsmi_get_socket_info(socket)
            print(f"Socket info: {info}")
    finally:
        amdsmi.amdsmi_shut_down()
    ```
    """

Get Processor Type

Determine the type of a processor (GPU, CPU, or APU).

def amdsmi_get_processor_type(processor_handle: processor_handle) -> Dict[str, str]:
    """
    Get the processor type for a given handle.

    Determines whether the processor is an AMD GPU, AMD CPU, AMD APU, or other type.

    Parameters:
    - processor_handle (processor_handle): Handle for the processor to query

    Returns:
    - Dict[str, str]: Dictionary containing:
        - "processor_type" (str): The processor type name, one of:
            - "AMDSMI_PROCESSOR_TYPE_AMD_GPU": AMD GPU
            - "AMDSMI_PROCESSOR_TYPE_AMD_CPU": AMD CPU
            - "AMDSMI_PROCESSOR_TYPE_NON_AMD_GPU": Non-AMD GPU
            - "AMDSMI_PROCESSOR_TYPE_NON_AMD_CPU": Non-AMD CPU
            - "UNKNOWN": Unknown processor type

    Raises:
    - AmdSmiParameterException: If processor_handle is not valid
    - AmdSmiLibraryException: If unable to determine processor type

    Example:
    ```python
    import amdsmi

    amdsmi.amdsmi_init()
    try:
        processors = amdsmi.amdsmi_get_processor_handles()
        for processor in processors:
            type_info = amdsmi.amdsmi_get_processor_type(processor)
            print(f"Processor type: {type_info['processor_type']}")
    finally:
        amdsmi.amdsmi_shut_down()
    ```
    """
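
The type names listed above are verbose for log output. The following is a minimal sketch of turning the returned dictionary into a short label, assuming the result has the shape documented above. `describe_processor_type` and its label table are hypothetical helpers, not part of amdsmi.

```python
# Hypothetical helper: short labels for the documented "processor_type" names.
_TYPE_LABELS = {
    "AMDSMI_PROCESSOR_TYPE_AMD_GPU": "AMD GPU",
    "AMDSMI_PROCESSOR_TYPE_AMD_CPU": "AMD CPU",
    "AMDSMI_PROCESSOR_TYPE_NON_AMD_GPU": "Non-AMD GPU",
    "AMDSMI_PROCESSOR_TYPE_NON_AMD_CPU": "Non-AMD CPU",
    "UNKNOWN": "Unknown",
}


def describe_processor_type(type_info):
    """Return a short label for a dict shaped like amdsmi_get_processor_type()'s result."""
    name = type_info.get("processor_type", "UNKNOWN")
    # Fall back to the raw name so unexpected values stay visible in logs.
    return _TYPE_LABELS.get(name, name)
```

With the real library, this could be called as `describe_processor_type(amdsmi.amdsmi_get_processor_type(processor))`.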

Get Processor Handle from BDF

Look up a processor handle using its BDF (Bus:Device.Function) identifier.

def amdsmi_get_processor_handle_from_bdf(bdf: str) -> processor_handle:
    """
    Get a processor handle from a BDF (Bus:Device.Function) identifier.

    Given a PCI BDF address, this function returns the corresponding processor handle.
    The BDF string can be given in any of these formats:
    - "XXXX:XX:XX.X" (domain:bus:device.function)
    - "XX:XX.X" (bus:device.function)
    - "XX:XX" (bus:device)

    Parameters:
    - bdf (str): BDF identifier string in one of the supported formats

    Returns:
    - processor_handle: Handle for the processor at the specified BDF address

    Raises:
    - AmdSmiBdfFormatException: If the BDF string format is invalid
    - AmdSmiLibraryException: If no processor is found at the specified BDF

    Example:
    ```python
    import amdsmi

    amdsmi.amdsmi_init()
    try:
        # Look up processor by BDF
        processor = amdsmi.amdsmi_get_processor_handle_from_bdf("0000:01:00.0")

        # Report the processor type found at this address
        type_info = amdsmi.amdsmi_get_processor_type(processor)
        print(f"Found {type_info['processor_type']} at BDF 0000:01:00.0")
    finally:
        amdsmi.amdsmi_shut_down()
    ```
    """
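
When BDF strings come from user input or configuration files, it can help to check and normalize them before calling the lookup. The sketch below is a standalone pre-check, not the library's own parser. `normalize_bdf` is a hypothetical helper. Its pattern treats the domain and function parts as independently optional, so it also admits "XXXX:XX:XX". Filling in domain 0000 and function 0 when they are omitted is an assumption of this sketch.

```python
import re

# Domain and function are each optional; bus and device are two hex digits.
_BDF_RE = re.compile(
    r"^(?:(?P<domain>[0-9a-fA-F]{4}):)?"
    r"(?P<bus>[0-9a-fA-F]{2}):"
    r"(?P<device>[0-9a-fA-F]{2})"
    r"(?:\.(?P<function>[0-7]))?$"
)


def normalize_bdf(bdf):
    """Return the BDF in full "XXXX:XX:XX.X" form (lowercase), or raise ValueError."""
    match = _BDF_RE.match(bdf.strip())
    if match is None:
        raise ValueError(f"unrecognized BDF string: {bdf!r}")
    # Assumption: a missing domain means 0000 and a missing function means 0.
    domain = match.group("domain") or "0000"
    function = match.group("function") or "0"
    return f"{domain}:{match.group('bus')}:{match.group('device')}.{function}".lower()
```

The normalized string can then be passed to `amdsmi_get_processor_handle_from_bdf()`, which still performs its own validation.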

Get GPU Device BDF

Retrieve the BDF (Bus:Device.Function) address of a GPU device.

def amdsmi_get_gpu_device_bdf(processor_handle: processor_handle) -> str:
    """
    Get the BDF (Bus:Device.Function) identifier for a GPU device.

    Returns the PCI BDF address as a formatted string. This is useful for identifying
    the physical location of the GPU on the system bus.

    Parameters:
    - processor_handle (processor_handle): Handle for the GPU processor

    Returns:
    - str: BDF identifier string in format "XXXX:XX:XX.X" (domain:bus:device.function)

    Raises:
    - AmdSmiParameterException: If processor_handle is not valid
    - AmdSmiLibraryException: If unable to retrieve BDF information

    Example:
    ```python
    import amdsmi

    amdsmi.amdsmi_init()
    try:
        processors = amdsmi.amdsmi_get_processor_handles()
        for processor in processors:
            bdf = amdsmi.amdsmi_get_gpu_device_bdf(processor)
            print(f"GPU BDF: {bdf}")
    finally:
        amdsmi.amdsmi_shut_down()
    ```
    """
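
Since the returned string encodes domain, bus, device, and function, it is sometimes convenient to split it into numeric fields, for example to sort GPUs by bus position. The following is a rough sketch that assumes the full "XXXX:XX:XX.X" form described above. `BdfAddress` and `split_bdf` are hypothetical names, not part of amdsmi.

```python
from typing import NamedTuple


class BdfAddress(NamedTuple):
    """Numeric parts of a PCI address (hypothetical helper type)."""
    domain: int
    bus: int
    device: int
    function: int


def split_bdf(bdf):
    """Split a full "XXXX:XX:XX.X" string into integer fields."""
    domain, bus, rest = bdf.split(":")
    device, function = rest.split(".")
    # Each field is written in hexadecimal in PCI notation.
    return BdfAddress(int(domain, 16), int(bus, 16), int(device, 16), int(function, 16))
```

Because `BdfAddress` is a tuple, it can be used directly as a sort key, as in `sorted(bdfs, key=split_bdf)`.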

Get GPU Device UUID

Retrieve the universally unique identifier (UUID) of a GPU device.

def amdsmi_get_gpu_device_uuid(processor_handle: processor_handle) -> str:
    """
    Get the UUID (Universally Unique Identifier) for a GPU device.

    Returns a unique identifier string for the GPU. The UUID persists across reboots
    and is useful for tracking specific GPU devices in multi-GPU systems.

    Parameters:
    - processor_handle (processor_handle): Handle for the GPU processor

    Returns:
    - str: UUID string with dashes (the 38-character figure counts the C buffer's
      null terminator, which a Python str does not carry)

    Raises:
    - AmdSmiParameterException: If processor_handle is not valid
    - AmdSmiLibraryException: If unable to retrieve UUID

    Example:
    ```python
    import amdsmi

    amdsmi.amdsmi_init()
    try:
        processors = amdsmi.amdsmi_get_processor_handles()
        for processor in processors:
            uuid = amdsmi.amdsmi_get_gpu_device_uuid(processor)
            bdf = amdsmi.amdsmi_get_gpu_device_bdf(processor)
            print(f"GPU at {bdf} has UUID: {uuid}")
    finally:
        amdsmi.amdsmi_shut_down()
    ```
    """
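
Because the UUID persists across reboots, it is a reasonable key for per-device settings saved to disk. The sketch below pairs such saved settings with the devices present in the current session. `match_saved_settings` is a hypothetical helper that works on plain data, so the (uuid, handle) pairs must be collected beforehand with `amdsmi_get_gpu_device_uuid()`.

```python
def match_saved_settings(saved, discovered):
    """Pair saved per-UUID settings with the devices present right now.

    saved:      dict mapping UUID string -> settings (any value)
    discovered: iterable of (uuid, handle) pairs for the current session
    Returns (matched, missing): matched maps UUID -> (handle, settings);
    missing is the sorted list of saved UUIDs with no device present.
    """
    present = dict(discovered)
    matched = {u: (present[u], s) for u, s in saved.items() if u in present}
    missing = sorted(u for u in saved if u not in present)
    return matched, missing
```

Reporting the `missing` list separately makes it easier to notice a GPU that was removed or failed to enumerate.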

Get GPU Enumeration Information

Retrieve comprehensive enumeration information for a GPU including DRM, HSA, and HIP IDs.

def amdsmi_get_gpu_enumeration_info(processor_handle: processor_handle) -> Dict[str, Any]:
    """
    Retrieve GPU enumeration information including DRM card ID, DRM render ID, HSA ID, HIP ID, and HIP UUID.

    This function provides comprehensive enumeration data that maps GPU handles to various
    subsystem identifiers used by DRM (Direct Rendering Manager), HSA (Heterogeneous System
    Architecture), and HIP (Heterogeneous-compute Interface for Portability).

    Parameters:
    - processor_handle (processor_handle): Handle for the GPU processor

    Returns:
    - Dict[str, Any]: Dictionary containing enumeration information:
        - "drm_card" (int or None): DRM card ID (e.g., corresponds to /dev/dri/cardX)
        - "drm_render" (int or None): DRM render node ID (e.g., corresponds to /dev/dri/renderDX)
        - "hsa_id" (int or None): HSA node ID
        - "hip_id" (int or None): HIP device ID (used in HIP applications)
        - "hip_uuid" (str): HIP UUID string
      Note: Fields may be None if the corresponding subsystem is not available.

    Raises:
    - AmdSmiParameterException: If processor_handle is not valid
    - AmdSmiLibraryException: If unable to retrieve enumeration information

    Example:
    ```python
    import amdsmi

    amdsmi.amdsmi_init()
    try:
        processors = amdsmi.amdsmi_get_processor_handles()
        for i, processor in enumerate(processors):
            enum_info = amdsmi.amdsmi_get_gpu_enumeration_info(processor)
            print(f"GPU {i} enumeration info:")
            print(f"  DRM card: {enum_info['drm_card']}")
            print(f"  DRM render: {enum_info['drm_render']}")
            print(f"  HSA ID: {enum_info['hsa_id']}")
            print(f"  HIP ID: {enum_info['hip_id']}")
            print(f"  HIP UUID: {enum_info['hip_uuid']}")
    finally:
        amdsmi.amdsmi_shut_down()
    ```
    """
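
The DRM fields above correspond to device nodes under /dev/dri, and either may be None. The following is a minimal sketch of building those paths while preserving missing values, assuming the dictionary shape documented above. `drm_device_paths` is a hypothetical helper.

```python
def drm_device_paths(enum_info):
    """Map "drm_card" / "drm_render" IDs to /dev/dri paths, keeping None as None."""
    card = enum_info.get("drm_card")
    render = enum_info.get("drm_render")
    return {
        # Compare against None explicitly: card ID 0 is valid but falsy.
        "card": f"/dev/dri/card{card}" if card is not None else None,
        "render": f"/dev/dri/renderD{render}" if render is not None else None,
    }
```

The explicit `is not None` test matters here because a plain truthiness check would wrongly drop card 0.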

Usage Pattern

Basic Device Discovery Workflow

The typical workflow for discovering and enumerating devices:

import amdsmi

# Initialize library for GPU monitoring
amdsmi.amdsmi_init(amdsmi.AmdSmiInitFlags.INIT_AMD_GPUS)

try:
    # Get all processor handles directly
    processors = amdsmi.amdsmi_get_processor_handles()
    print(f"Found {len(processors)} processors")

    # Query each processor
    for i, processor in enumerate(processors):
        # Get processor type
        proc_type = amdsmi.amdsmi_get_processor_type(processor)
        print(f"\nProcessor {i}:")
        print(f"  Type: {proc_type['processor_type']}")

        # Get BDF address
        bdf = amdsmi.amdsmi_get_gpu_device_bdf(processor)
        print(f"  BDF: {bdf}")

        # Get UUID
        uuid = amdsmi.amdsmi_get_gpu_device_uuid(processor)
        print(f"  UUID: {uuid}")

        # Get enumeration info
        enum_info = amdsmi.amdsmi_get_gpu_enumeration_info(processor)
        print(f"  DRM card: {enum_info['drm_card']}")
        print(f"  HIP ID: {enum_info['hip_id']}")

finally:
    amdsmi.amdsmi_shut_down()
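
The init / try / finally / shut-down pattern above recurs in every example, so it can be wrapped once in a context manager. This is a sketch only. `smi_session` is a hypothetical helper, and it takes the module as a parameter so that it can be exercised with a stand-in object on machines without AMD hardware.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def smi_session(smi, *init_args):
    """Initialize `smi`, yield its processor handles, and always shut down.

    `smi` is the imported amdsmi module, or any object exposing the same
    three functions.
    """
    smi.amdsmi_init(*init_args)
    try:
        yield smi.amdsmi_get_processor_handles()
    finally:
        smi.amdsmi_shut_down()


# Stand-in object so the sketch runs without AMD hardware.
calls = []
fake = SimpleNamespace(
    amdsmi_init=lambda *args: calls.append("init"),
    amdsmi_get_processor_handles=lambda: ["gpu0", "gpu1"],
    amdsmi_shut_down=lambda: calls.append("shut_down"),
)

with smi_session(fake) as processors:
    print(processors)
```

With the real library, the call would look like `with smi_session(amdsmi, amdsmi.AmdSmiInitFlags.INIT_AMD_GPUS) as processors:`. Handles obtained this way should not be used after the block exits, since they are only valid until shutdown.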

Socket-Based Discovery

Enumerate the physical sockets and print the information string for each one:

import amdsmi

amdsmi.amdsmi_init(amdsmi.AmdSmiInitFlags.INIT_AMD_GPUS)

try:
    # Get all sockets
    sockets = amdsmi.amdsmi_get_socket_handles()
    print(f"Found {len(sockets)} sockets")

    # Query each socket
    for socket_idx, socket in enumerate(sockets):
        socket_info = amdsmi.amdsmi_get_socket_info(socket)
        print(f"\nSocket {socket_idx}: {socket_info}")

        # Get processors for this socket
        # Note: amdsmi_get_processor_handles() returns all processors across all sockets
        # To get per-socket processors, you would need to use the C API directly
        # or filter based on NUMA node/topology information

finally:
    amdsmi.amdsmi_shut_down()

Look Up Device by BDF

Find a specific device by its PCI BDF address:

import amdsmi

amdsmi.amdsmi_init(amdsmi.AmdSmiInitFlags.INIT_AMD_GPUS)

try:
    # Look up a specific device by BDF
    target_bdf = "0000:03:00.0"

    try:
        processor = amdsmi.amdsmi_get_processor_handle_from_bdf(target_bdf)

        # Verify the device type
        proc_type = amdsmi.amdsmi_get_processor_type(processor)
        print(f"Found processor at {target_bdf}")
        print(f"Type: {proc_type['processor_type']}")

        # Get additional info
        uuid = amdsmi.amdsmi_get_gpu_device_uuid(processor)
        print(f"UUID: {uuid}")

    except amdsmi.AmdSmiBdfFormatException as e:
        print(f"Invalid BDF format: {target_bdf} ({e})")
    except amdsmi.AmdSmiLibraryException as e:
        print(f"No device found at BDF: {target_bdf} ({e})")

finally:
    amdsmi.amdsmi_shut_down()

Build Device Map

Create a comprehensive mapping of all devices with their identifiers:

import amdsmi

def build_device_map():
    """Build a comprehensive map of all discovered devices."""
    device_map = {}

    amdsmi.amdsmi_init(amdsmi.AmdSmiInitFlags.INIT_AMD_GPUS)

    try:
        processors = amdsmi.amdsmi_get_processor_handles()

        for processor in processors:
            # Get all identifiers
            proc_type = amdsmi.amdsmi_get_processor_type(processor)
            bdf = amdsmi.amdsmi_get_gpu_device_bdf(processor)
            uuid = amdsmi.amdsmi_get_gpu_device_uuid(processor)
            enum_info = amdsmi.amdsmi_get_gpu_enumeration_info(processor)

            # Store in map
            device_map[bdf] = {
                'handle': processor,
                'type': proc_type['processor_type'],
                'uuid': uuid,
                'bdf': bdf,
                'drm_card': enum_info['drm_card'],
                'drm_render': enum_info['drm_render'],
                'hsa_id': enum_info['hsa_id'],
                'hip_id': enum_info['hip_id'],
                'hip_uuid': enum_info['hip_uuid']
            }

        return device_map

    finally:
        amdsmi.amdsmi_shut_down()

# Usage
device_map = build_device_map()
for bdf, info in device_map.items():
    print(f"\n{bdf}:")
    print(f"  Type: {info['type']}")
    print(f"  UUID: {info['uuid']}")
    print(f"  HIP ID: {info['hip_id']}")
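
A device map like the one above is often logged or exported. The sketch below serializes it to JSON while leaving out the 'handle' entries. It assumes the handles are opaque objects that should not be serialized and that the remaining values are JSON-compatible. `device_map_to_json` is a hypothetical helper.

```python
import json


def device_map_to_json(device_map):
    """Serialize a device map to JSON, leaving out the opaque 'handle' entries."""
    exportable = {
        bdf: {key: value for key, value in info.items() if key != "handle"}
        for bdf, info in device_map.items()
    }
    return json.dumps(exportable, indent=2, sort_keys=True)
```

Dropping the handles also avoids persisting values that become invalid once `amdsmi_shut_down()` has been called.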

Filter Devices by Type

Separate GPUs from other processor types:

import amdsmi

amdsmi.amdsmi_init(amdsmi.AmdSmiInitFlags.INIT_ALL_PROCESSORS)

try:
    all_processors = amdsmi.amdsmi_get_processor_handles()

    gpus = []
    cpus = []
    other = []

    for processor in all_processors:
        proc_type = amdsmi.amdsmi_get_processor_type(processor)
        type_name = proc_type['processor_type']

        if 'GPU' in type_name:
            gpus.append(processor)
        elif 'CPU' in type_name:
            cpus.append(processor)
        else:
            other.append(processor)

    print(f"Found {len(gpus)} GPUs, {len(cpus)} CPUs, {len(other)} other processors")

    # Work with GPUs
    for gpu in gpus:
        bdf = amdsmi.amdsmi_get_gpu_device_bdf(gpu)
        print(f"GPU at {bdf}")

finally:
    amdsmi.amdsmi_shut_down()
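
Note that the substring test `'GPU' in type_name` above also matches "AMDSMI_PROCESSOR_TYPE_NON_AMD_GPU". Where only AMD GPUs are wanted, a stricter variant can compare against the exact type name. This is a sketch operating on plain (handle, type name) pairs, and `select_amd_gpus` is a hypothetical helper.

```python
AMD_GPU = "AMDSMI_PROCESSOR_TYPE_AMD_GPU"


def select_amd_gpus(typed_handles):
    """Keep handles whose type name is exactly the AMD GPU name.

    typed_handles: iterable of (handle, type_name) pairs, where type_name is
    the "processor_type" value returned by amdsmi_get_processor_type().
    """
    return [handle for handle, type_name in typed_handles if type_name == AMD_GPU]
```

The pairs could be built with a comprehension such as `[(p, amdsmi.amdsmi_get_processor_type(p)['processor_type']) for p in all_processors]`.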

Processor Type Enumeration

class AmdSmiProcessorType(IntEnum):
    """
    Processor type identifiers.
    """
    UNKNOWN = ...                          # Unknown processor type
    AMDSMI_PROCESSOR_TYPE_AMD_GPU = ...    # AMD GPU processor
    AMDSMI_PROCESSOR_TYPE_AMD_CPU = ...    # AMD CPU processor
    AMDSMI_PROCESSOR_TYPE_NON_AMD_GPU = ... # Non-AMD GPU processor
    AMDSMI_PROCESSOR_TYPE_NON_AMD_CPU = ... # Non-AMD CPU processor

Notes

  • The library must be initialized with amdsmi_init() before discovering devices
  • Use appropriate initialization flags to discover the desired device types:
    • INIT_AMD_GPUS: Discover AMD GPUs only (most common)
    • INIT_AMD_CPUS: Discover AMD CPUs (requires ESMI)
    • INIT_ALL_PROCESSORS: Discover all supported processor types
  • Processor handles remain valid until amdsmi_shut_down() is called
  • BDF addresses uniquely identify devices on the PCI bus
  • UUIDs provide persistent unique identifiers that survive reboots
  • The amdsmi_get_processor_handles() function returns processors across all sockets
  • For GPU-only systems, all processor handles will be GPU handles
  • The enumeration info maps device handles to subsystem IDs (DRM, HSA, HIP)
  • Some enumeration fields may be None if the corresponding subsystem is unavailable
  • amdsmi_get_processor_handle_from_bdf() accepts the BDF as "XXXX:XX:XX.X", "XX:XX.X", or "XX:XX"