Python library for monitoring and managing AMD GPUs and CPUs with programmatic hardware metrics access
npx @tessl/cli install tessl/pypi-amdsmi@7.0.0

The AMD System Management Interface (AMDSMI) Python library provides monitoring and management capabilities for AMD GPUs and CPUs. It offers programmatic access to hardware metrics, including GPU utilization, temperature, power consumption, memory usage, and clock frequencies, as well as extensive CPU monitoring, through Python bindings that wrap the native AMD SMI C library.

pip install amdsmi

import amdsmi
# Initialize library for GPU monitoring
amdsmi.amdsmi_init(amdsmi.AmdSmiInitFlags.INIT_AMD_GPUS)
try:
    # Get list of GPU devices
    devices = amdsmi.amdsmi_get_processor_handles()
    # Query GPU information
    for device in devices:
        # Get ASIC information
        asic_info = amdsmi.amdsmi_get_gpu_asic_info(device)
        print(f"GPU: {asic_info['market_name']}")
        # Get temperature
        temp = amdsmi.amdsmi_get_temp_metric(
            device,
            amdsmi.AmdSmiTemperatureType.EDGE,
            amdsmi.AmdSmiTemperatureMetric.CURRENT
        )
        print(f"Temperature: {temp} C")
        # Get GPU activity
        activity = amdsmi.amdsmi_get_gpu_activity(device)
        print(f"GFX Activity: {activity['gfx_activity']}%")
        # Get power consumption
        power = amdsmi.amdsmi_get_power_info(device)
        print(f"Power: {power['current_socket_power']} W")
        # Get VRAM usage
        vram = amdsmi.amdsmi_get_gpu_vram_usage(device)
        print(f"VRAM Used: {vram['vram_used']} bytes")
finally:
    # Always clean up
    amdsmi.amdsmi_shut_down()

For detailed getting started instructions, see the Quick Start Guide.
The AMDSMI library is organized around these key components:
Initialize and shut down the AMDSMI library. Must be called before using any other functions.
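Because every other call depends on a successful `amdsmi_init` and a matching `amdsmi_shut_down`, one way to keep the pair together is a small context manager. The sketch below is illustrative only: `amdsmi_session` is a hypothetical helper name, not part of the library, and it takes the imported `amdsmi` module as an argument so it can be exercised with a stand-in object on machines without AMD hardware.

```python
from contextlib import contextmanager


@contextmanager
def amdsmi_session(smi, flag=None):
    """Initialize the library on entry and always shut it down on exit.

    `smi` is the imported `amdsmi` module (hypothetical helper; passing the
    module in keeps the function testable without AMD hardware).
    """
    # Assumption: default to GPU-only initialization, as in the quick start.
    if flag is None:
        flag = smi.AmdSmiInitFlags.INIT_AMD_GPUS
    smi.amdsmi_init(flag)
    try:
        yield smi
    finally:
        # Runs even if the body raised, mirroring the try/finally pattern.
        smi.amdsmi_shut_down()


# Usage on a machine with the library installed:
#   import amdsmi
#   with amdsmi_session(amdsmi) as smi:
#       handles = smi.amdsmi_get_processor_handles()
```

The `finally` clause gives the same guarantee as the explicit `try`/`finally` in the quick start, without repeating it at every call site.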
Key Functions:
amdsmi_init(flag) - Initialize the library
amdsmi_shut_down() - Shut down and release resources

Discover and identify available processors (GPUs and CPUs), retrieve device handles, and query device enumeration information.
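As a rough illustration of device discovery, the sketch below builds a small inventory from the handle list. It assumes the library was initialized with `INIT_AMD_GPUS` (so every handle is a GPU) and that the BDF and UUID calls return plain strings; `list_gpus` is a hypothetical helper that receives the imported `amdsmi` module as `smi`.

```python
def list_gpus(smi):
    """Return one {"index", "bdf", "uuid"} record per processor handle.

    Hypothetical helper: `smi` is the imported `amdsmi` module, already
    initialized for GPUs by the caller.
    """
    inventory = []
    for index, handle in enumerate(smi.amdsmi_get_processor_handles()):
        inventory.append({
            "index": index,
            # Assumption: both calls return strings (PCI BDF and device UUID).
            "bdf": smi.amdsmi_get_gpu_device_bdf(handle),
            "uuid": smi.amdsmi_get_gpu_device_uuid(handle),
        })
    return inventory


# Usage (requires AMD hardware and an initialized library):
#   for gpu in list_gpus(amdsmi):
#       print(gpu["index"], gpu["bdf"], gpu["uuid"])
```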
Key Functions:
amdsmi_get_processor_handles() - Get all processor handles
amdsmi_get_processor_type(handle) - Get processor type
amdsmi_get_gpu_device_bdf(handle) - Get BDF identifier
amdsmi_get_gpu_device_uuid(handle) - Get UUID

Real-time monitoring of GPU activity, utilization, VRAM usage, power consumption, clock frequencies, and PCIe information.
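Real-time monitoring usually means sampling the same metrics at a fixed interval. The following is a minimal polling sketch, not a prescribed pattern: `sample_gpu` and `poll_gpu` are hypothetical helpers, the dictionary keys are the ones used in the quick start, and the `sleep` parameter exists only so the loop can be tested without waiting.

```python
import time


def sample_gpu(smi, handle):
    """Take one snapshot of activity, VRAM use, and socket power."""
    activity = smi.amdsmi_get_gpu_activity(handle)
    vram = smi.amdsmi_get_gpu_vram_usage(handle)
    power = smi.amdsmi_get_power_info(handle)
    return {
        "gfx_activity": activity["gfx_activity"],
        "vram_used": vram["vram_used"],
        "socket_power": power["current_socket_power"],
    }


def poll_gpu(smi, handle, samples=5, interval_s=1.0, sleep=time.sleep):
    """Collect `samples` snapshots, pausing `interval_s` between them."""
    readings = []
    for i in range(samples):
        readings.append(sample_gpu(smi, handle))
        if i < samples - 1:  # no need to wait after the last sample
            sleep(interval_s)
    return readings


# Usage (requires AMD hardware and an initialized library):
#   handle = amdsmi.amdsmi_get_processor_handles()[0]
#   for reading in poll_gpu(amdsmi, handle, samples=3):
#       print(reading)
```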
Key Functions:
amdsmi_get_gpu_activity(handle) - Get utilization metrics
amdsmi_get_gpu_vram_usage(handle) - Get VRAM statistics
amdsmi_get_power_info(handle) - Get power consumption
amdsmi_get_clock_info(handle, clock_type) - Get clock frequencies

Monitor GPU temperatures across various sensors including edge, hotspot, junction, VRAM, and HBM temperatures.
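Not every GPU exposes every sensor, so a reading loop probably needs to tolerate failures. The sketch below queries several sensors by name and records `None` when the library raises. It assumes `AmdSmiTemperatureType` has a `HOTSPOT` member alongside the `EDGE` member shown in the quick start, and that an unsupported sensor surfaces as `AmdSmiLibraryException`; `read_temperatures` is a hypothetical helper.

```python
def read_temperatures(smi, handle, sensor_names=("EDGE", "HOTSPOT")):
    """Return {sensor_name: current temperature or None if unavailable}.

    Hypothetical helper: `smi` is the imported `amdsmi` module.
    """
    readings = {}
    for name in sensor_names:
        # Assumption: each name is a member of AmdSmiTemperatureType.
        sensor = getattr(smi.AmdSmiTemperatureType, name)
        try:
            readings[name] = smi.amdsmi_get_temp_metric(
                handle, sensor, smi.AmdSmiTemperatureMetric.CURRENT
            )
        except smi.AmdSmiLibraryException:
            # Assumption: unsupported sensors raise a library exception.
            readings[name] = None
    return readings


# Usage (requires AMD hardware and an initialized library):
#   print(read_temperatures(amdsmi, handle))
```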
Key Functions:
amdsmi_get_temp_metric(handle, sensor_type, metric) - Get temperature metric

Query static GPU device information including ASIC details, VBIOS, firmware versions, driver information, and board information.
Key Functions:
amdsmi_get_gpu_asic_info(handle) - Get ASIC information
amdsmi_get_gpu_driver_info(handle) - Get driver information
amdsmi_get_gpu_vbios_info(handle) - Get VBIOS information

Query GPU memory information including total memory, usage, memory types, and reserved pages.
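Total and used memory can be combined into a utilization figure. This is a sketch under stated assumptions: that the memory-type enum is named `AmdSmiMemoryType` with a `VRAM` member, and that both calls return integers in the same unit. `vram_utilization` is a hypothetical helper, not a library function.

```python
def vram_utilization(smi, handle):
    """Return total, used, and percent-used VRAM for one GPU.

    Hypothetical helper: `smi` is the imported `amdsmi` module.
    """
    # Assumption: the enum is AmdSmiMemoryType and has a VRAM member.
    mem_type = smi.AmdSmiMemoryType.VRAM
    total = smi.amdsmi_get_gpu_memory_total(handle, mem_type)
    used = smi.amdsmi_get_gpu_memory_usage(handle, mem_type)
    # Guard against a zero total so the helper never divides by zero.
    percent = 100.0 * used / total if total else 0.0
    return {"total": total, "used": used, "percent": percent}


# Usage (requires AMD hardware and an initialized library):
#   info = vram_utilization(amdsmi, handle)
#   print(f"{info['percent']:.1f}% of VRAM in use")
```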
Key Functions:
amdsmi_get_gpu_memory_total(handle, mem_type) - Get total memory
amdsmi_get_gpu_memory_usage(handle, mem_type) - Get memory usage
amdsmi_get_gpu_vram_info(handle) - Get detailed VRAM information

Configure and control GPU settings including power caps, clock frequencies, fan speeds, performance levels, and overdrive settings.
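Control calls change hardware state, so it may be worth validating inputs before handing them to the library. The sketch below wraps `amdsmi_set_power_cap` with a range check. It assumes the `cap` argument is expressed in microwatts and that sensor index 0 is the usual choice; the `min_watts` and `max_watts` bounds are supplied by the caller and `set_power_cap_watts` is a hypothetical helper. Setting a power cap typically requires elevated privileges.

```python
def set_power_cap_watts(smi, handle, watts, min_watts, max_watts, sensor_ind=0):
    """Validate a power cap given in watts, then apply it.

    Hypothetical helper: `smi` is the imported `amdsmi` module. Returns the
    value actually passed to the library.
    """
    if not min_watts <= watts <= max_watts:
        raise ValueError(
            f"power cap {watts} W is outside [{min_watts}, {max_watts}] W"
        )
    # Assumption: the library expects the cap in microwatts.
    cap_uw = int(round(watts * 1_000_000))
    smi.amdsmi_set_power_cap(handle, sensor_ind, cap_uw)
    return cap_uw


# Usage (requires AMD hardware, an initialized library, and root access):
#   set_power_cap_watts(amdsmi, handle, watts=250, min_watts=100, max_watts=300)
```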
Key Functions:
amdsmi_set_power_cap(handle, sensor_ind, cap) - Set power cap
amdsmi_set_gpu_perf_level(handle, perf_level) - Set performance level
amdsmi_set_gpu_fan_speed(handle, sensor_ind, speed) - Set fan speed

Create, control, and read GPU performance counters for detailed performance analysis.
Key Functions:
amdsmi_gpu_create_counter(handle, event_type) - Create counter
amdsmi_gpu_read_counter(handle, counter_handle) - Read counter value

Query GPU error information including ECC errors, RAS features, and error counts for various GPU blocks.
Key Functions:
amdsmi_get_gpu_ecc_count(handle, block) - Get ECC error count
amdsmi_get_gpu_ecc_enabled(handle) - Check if ECC is enabled

Monitor GPU processes, including process list and per-process resource usage.
Key Functions:
amdsmi_get_gpu_process_list(handle) - Get process list
amdsmi_get_gpu_compute_process_info(handle) - Get compute process info

Query hardware topology including NUMA affinity, P2P connectivity, link metrics, and XGMI information.
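P2P connectivity is naturally viewed as a matrix over all device pairs. The sketch below builds one from `amdsmi_is_P2P_accessible`, assuming the call returns a truthy value when the destination is reachable from the source. `p2p_matrix` is a hypothetical helper, and the diagonal is left as `None` rather than guessing what the library reports for a device paired with itself.

```python
def p2p_matrix(smi, handles):
    """Return matrix[i][j] = True/False for P2P access from GPU i to GPU j.

    Hypothetical helper: `smi` is the imported `amdsmi` module. Diagonal
    entries are None because a device is not queried against itself.
    """
    matrix = []
    for i, src in enumerate(handles):
        row = []
        for j, dst in enumerate(handles):
            if i == j:
                row.append(None)
            else:
                # Assumption: the call returns a truthy value when accessible.
                row.append(bool(smi.amdsmi_is_P2P_accessible(src, dst)))
        matrix.append(row)
    return matrix


# Usage (requires AMD hardware and an initialized library):
#   handles = amdsmi.amdsmi_get_processor_handles()
#   for row in p2p_matrix(amdsmi, handles):
#       print(row)
```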
Key Functions:
amdsmi_topo_get_numa_node_number(handle) - Get NUMA node
amdsmi_is_P2P_accessible(handle_src, handle_dst) - Check P2P accessibility
amdsmi_get_link_metrics(handle) - Get link metrics

Configure and query GPU compute and memory partitioning for multi-instance GPU (MIG) support.
Key Functions:
amdsmi_get_gpu_compute_partition(handle) - Get compute partition
amdsmi_set_gpu_compute_partition(handle, partition_type) - Set compute partition
amdsmi_get_gpu_memory_partition(handle) - Get memory partition
amdsmi_set_gpu_memory_partition(handle, memory_partition) - Set memory partition

Monitor AMD CPUs including energy consumption, power, temperature, frequencies, and HSMP metrics (requires ESMI library support).
Key Functions:
amdsmi_get_cpusocket_handles() - Get CPU socket handles
amdsmi_get_cpu_socket_power(socket_handle) - Get CPU power
amdsmi_get_cpu_socket_temperature(socket_handle) - Get CPU temperature

Read and monitor asynchronous GPU events including thermal throttling, VM faults, and GPU resets.
Key Classes:
AmdSmiEventReader - Event reader class

Query library version, ROCm version, and convert status codes to strings.
Key Functions:
amdsmi_get_lib_version() - Get library version
amdsmi_get_rocm_version() - Get ROCm version
amdsmi_status_code_to_string(status) - Convert status code to string

import amdsmi

from amdsmi import (
    amdsmi_init,
    amdsmi_shut_down,
    amdsmi_get_processor_handles,
    amdsmi_get_gpu_activity,
    AmdSmiInitFlags,
    AmdSmiTemperatureType,
    AmdSmiTemperatureMetric
)

AmdSmiInitFlags.INIT_ALL_PROCESSORS - Initialize all processor types
AmdSmiInitFlags.INIT_AMD_CPUS - Initialize AMD CPUs only
AmdSmiInitFlags.INIT_AMD_GPUS - Initialize AMD GPUs only (default)
AmdSmiInitFlags.INIT_AMD_APUS - Initialize AMD APUs only

AmdSmiException (base)
AmdSmiLibraryException (with error code mapping)
AmdSmiRetryException
AmdSmiTimeoutException
AmdSmiParameterException
AmdSmiKeyException
AmdSmiBdfFormatException

For complete type definitions and constants, see the Type Reference.
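The exception classes listed above suggest that some failures are transient. One possible way to handle `AmdSmiRetryException` is a small retry wrapper, sketched below. The helper name `call_with_retry`, the attempt count, and the delay are illustrative choices rather than library behavior, and the `sleep` parameter exists so the wrapper can be tested without waiting.

```python
import time


def call_with_retry(smi, func, *args, attempts=3, delay_s=0.5, sleep=time.sleep):
    """Call func(*args), retrying when the library asks for a retry.

    Hypothetical helper: `smi` is the imported `amdsmi` module. Any other
    exception propagates immediately; the last retry failure is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func(*args)
        except smi.AmdSmiRetryException:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            sleep(delay_s)


# Usage (requires AMD hardware and an initialized library):
#   activity = call_with_retry(amdsmi, amdsmi.amdsmi_get_gpu_activity, handle)
```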
Step-by-step instructions for common workflows:
Real-world usage scenarios:
Complete API documentation: