
tessl/go-amdsmi

AMD System Management Interface (AMD SMI) Go library for unified GPU and CPU management and monitoring


CPU Management (ESMI)

This section covers CPU power and energy monitoring, frequency information, temperature readings, and thermal status for AMD EPYC processors. It requires ESMI (EPYC System Management Interface) support.

Capabilities

CPU Socket Energy Monitoring

Monitor energy consumption at the CPU socket level.

amdsmi_status_t amdsmi_get_cpu_socket_energy(amdsmi_processor_handle processor_handle,
                                            uint64_t* penergy);

Usage Example:

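#include <inttypes.h>  /* PRIu64 */
#include <stdio.h>

uint64_t energy = 0;
amdsmi_status_t status = amdsmi_get_cpu_socket_energy(cpu_handle, &energy);

if (status == AMDSMI_STATUS_SUCCESS) {
    /* PRIu64 is the portable printf format for uint64_t */
    printf("CPU Socket Energy: %" PRIu64 " microjoules\n", energy);
}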
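A single energy reading is most useful when compared with a later one: the difference divided by the elapsed time gives average power. The sketch below is a minimal Python illustration, assuming the value is a cumulative counter in microjoules (as printed in the example above) that wraps at 2^64. `average_power_watts` is a hypothetical helper, not part of AMD SMI.

```python
def average_power_watts(energy_start_uj, energy_end_uj, elapsed_s, counter_bits=64):
    """Average power in watts between two cumulative energy readings.

    Assumes readings are in microjoules and that the counter wraps at
    2**counter_bits (an assumption; check the behaviour on your platform).
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    # Modular subtraction yields the right delta even after one wraparound.
    delta_uj = (energy_end_uj - energy_start_uj) % (1 << counter_bits)
    return delta_uj / 1_000_000 / elapsed_s


# 150 J consumed over 2 s.
print(average_power_watts(1_000_000, 151_000_000, 2.0))
```

In practice the two readings would come from successive calls to the socket energy function, with the elapsed time taken from a monotonic clock.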

CPU Socket Power Monitoring

Get current power consumption at the CPU socket level.

amdsmi_status_t amdsmi_get_cpu_socket_power(amdsmi_processor_handle processor_handle,
                                           uint32_t* ppower);

CPU Core Energy Monitoring

Monitor energy consumption at individual CPU core level.

amdsmi_status_t amdsmi_get_cpu_core_energy(amdsmi_processor_handle processor_handle,
                                          uint64_t* penergy);
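Per-core counters make it possible to see which cores account for most of the energy used over an interval. The following Python sketch ranks cores by the difference between two snapshots. It works on plain lists of readings (one entry per core, in microjoules) and does not call AMD SMI; `top_cores_by_energy` is a hypothetical name, and counter wraparound is ignored for brevity.

```python
def top_cores_by_energy(before_uj, after_uj, n=3):
    """Return [(core_index, delta_uj)] for the n cores that used the most energy."""
    # Pair up the two snapshots core by core and compute each delta.
    deltas = [(i, after - before)
              for i, (before, after) in enumerate(zip(before_uj, after_uj))]
    # Largest consumers first.
    return sorted(deltas, key=lambda item: item[1], reverse=True)[:n]


before = [100, 100, 100, 100]
after = [105, 109, 101, 107]
print(top_cores_by_energy(before, after, n=2))
```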

CPU Frequency Information

Get the current frequency limit of a CPU core, the active frequency limit of a socket, and the socket's supported frequency range (maximum and minimum).

amdsmi_status_t amdsmi_get_cpu_core_current_freq_limit(amdsmi_processor_handle processor_handle,
                                                      uint32_t* freq);
amdsmi_status_t amdsmi_get_cpu_socket_current_active_freq_limit(amdsmi_processor_handle processor_handle,
                                                               uint16_t* freq);
amdsmi_status_t amdsmi_get_cpu_socket_freq_range(amdsmi_processor_handle processor_handle,
                                                uint16_t* fmax,
                                                uint16_t* fmin);
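To illustrate how a frequency limit relates to the socket's range, the Python sketch below reports where a limit sits between the minimum and maximum. It is a hypothetical helper that takes plain integers (assumed to be in MHz, the unit printed by the language examples in this document) and does not call AMD SMI itself.

```python
def freq_limit_position(limit_mhz, fmin_mhz, fmax_mhz):
    """Return where the limit sits in [fmin, fmax] as a fraction from 0.0 to 1.0."""
    if fmax_mhz <= fmin_mhz:
        raise ValueError("fmax_mhz must be greater than fmin_mhz")
    # Clamp first so that an out-of-range reading still maps into 0..1.
    clamped = max(fmin_mhz, min(limit_mhz, fmax_mhz))
    return (clamped - fmin_mhz) / (fmax_mhz - fmin_mhz)


print(freq_limit_position(2400, 1200, 3600))
```

A value near 0.0 means the socket is currently limited close to its minimum frequency; a value near 1.0 means the limit is at or near the maximum.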

CPU Temperature Monitoring

Read the CPU temperature at the socket level. The language examples in this document divide the returned value by 1000 to report degrees Celsius.

amdsmi_status_t amdsmi_get_cpu_socket_temperature(amdsmi_processor_handle processor_handle,
                                                 uint32_t* ptmon);

CPU System Information

Get the number of threads per core and the current fabric clock (FCLK) and memory clock (MCLK) frequencies.

amdsmi_status_t amdsmi_get_threads_per_core(amdsmi_processor_handle processor_handle,
                                           uint32_t* threads_per_core);
amdsmi_status_t amdsmi_get_cpu_fclk_mclk(amdsmi_processor_handle processor_handle,
                                        uint32_t* fclk,
                                        uint32_t* mclk);

CPU Handle Discovery

Get CPU-specific processor handles for management operations.

amdsmi_status_t amdsmi_get_cpusocket_handles(uint32_t* num_sockets,
                                           amdsmi_cpusocket_handle* socket_handles);
amdsmi_status_t amdsmi_get_cpucore_handles(amdsmi_cpusocket_handle socket_handle,
                                         uint32_t* num_cores,
                                         amdsmi_cpucore_handle* core_handles);
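The two discovery calls compose into a socket-then-core walk. The Python sketch below shows that traversal with the discovery functions passed in as parameters, so it can run without EPYC hardware. `iter_core_handles` and the stand-in topology are hypothetical; in real use the two callables would be the socket and core handle functions shown in the Python example later in this document.

```python
def iter_core_handles(get_sockets, get_cores):
    """Yield (socket_index, core_index, core_handle) for every CPU core."""
    for s_idx, socket in enumerate(get_sockets()):
        for c_idx, core in enumerate(get_cores(socket)):
            yield s_idx, c_idx, core


# Stand-in discovery functions so the sketch runs without hardware.
fake_topology = {"socket0": ["c0", "c1"], "socket1": ["c2"]}
for s_idx, c_idx, core in iter_core_handles(lambda: list(fake_topology),
                                             fake_topology.__getitem__):
    print(s_idx, c_idx, core)
```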

CPU Driver Information

Get HSMP (Host System Management Port) driver version and protocol information.

amdsmi_status_t amdsmi_get_cpu_hsmp_driver_version(amdsmi_processor_handle processor_handle,
                                                  uint32_t* version);
amdsmi_status_t amdsmi_get_cpu_hsmp_proto_ver(amdsmi_processor_handle processor_handle,
                                             uint32_t* proto_ver);
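One use for the protocol version is to decide which operations to attempt. The Python sketch below gates features on a minimum version. The feature names and version numbers are placeholders chosen for illustration only, not values from AMD documentation, and `supported_features` is a hypothetical helper.

```python
# Placeholder minimum protocol versions; real values must come from AMD's
# HSMP documentation for your platform.
MIN_PROTO_VER = {"feature_a": 2, "feature_b": 5}


def supported_features(proto_ver, requirements=MIN_PROTO_VER):
    """Return the sorted feature names whose minimum version is satisfied."""
    return sorted(name for name, minimum in requirements.items()
                  if proto_ver >= minimum)


print(supported_features(3))
```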

CPU Thermal Status

Monitor CPU thermal status and PROCHOT (processor hot) conditions.

amdsmi_status_t amdsmi_get_cpu_prochot_status(amdsmi_processor_handle processor_handle,
                                             uint32_t* prochot);
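PROCHOT and the socket temperature can be combined into a simple health classification. The Python sketch below is one way to do that, assuming the temperature is in millidegrees Celsius (as the language examples in this document treat it) and that a nonzero PROCHOT value means the signal is asserted. The 85 °C warning threshold is an arbitrary illustrative default, and `thermal_state` is a hypothetical helper.

```python
def thermal_state(temp_millideg_c, prochot, warn_c=85.0):
    """Classify a socket reading as 'throttling', 'warm', or 'ok'."""
    # An asserted PROCHOT takes priority over any temperature reading.
    if prochot:
        return "throttling"
    if temp_millideg_c / 1000 >= warn_c:
        return "warm"
    return "ok"


print(thermal_state(72_500, 0))
```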

Language Interface Examples

Python

import amdsmi

# Initialize the library for AMD CPUs, then get CPU socket handles
try:
    amdsmi.amdsmi_init(amdsmi.AmdSmiInitFlags.INIT_AMD_CPUS)
    cpu_socket_handles = amdsmi.amdsmi_get_cpusocket_handles()
    print(f"Found {len(cpu_socket_handles)} CPU socket(s)")
    
    for i, socket_handle in enumerate(cpu_socket_handles):
        print(f"CPU Socket {i}:")
        
        # Get socket power
        power = amdsmi.amdsmi_get_cpu_socket_power(socket_handle)
        print(f"  Power: {power}W")
        
        # Get socket energy
        energy = amdsmi.amdsmi_get_cpu_socket_energy(socket_handle)
        print(f"  Energy: {energy} microjoules")
        
        # Get socket temperature
        temp = amdsmi.amdsmi_get_cpu_socket_temperature(socket_handle)
        print(f"  Temperature: {temp / 1000}°C")
        
        # Get frequency information
        freq_limit = amdsmi.amdsmi_get_cpu_socket_current_active_freq_limit(socket_handle)
        print(f"  Current Frequency Limit: {freq_limit} MHz")
        
        # Get frequency range
        freq_range = amdsmi.amdsmi_get_cpu_socket_freq_range(socket_handle)
        print(f"  Frequency Range: {freq_range['fmin']} - {freq_range['fmax']} MHz")
        
        # Get PROCHOT status
        prochot = amdsmi.amdsmi_get_cpu_prochot_status(socket_handle)
        print(f"  PROCHOT Status: {'Active' if prochot else 'Inactive'}")
        
        # Get core handles for this socket
        core_handles = amdsmi.amdsmi_get_cpucore_handles(socket_handle)
        print(f"  CPU Cores: {len(core_handles)}")
        
        # Get per-core information
        for j, core_handle in enumerate(core_handles[:4]):  # Show first 4 cores
            core_energy = amdsmi.amdsmi_get_cpu_core_energy(core_handle)
            core_freq = amdsmi.amdsmi_get_cpu_core_current_freq_limit(core_handle)
            print(f"    Core {j}: {core_freq} MHz, {core_energy} μJ")

except amdsmi.AmdSmiException as e:
    print(f"CPU monitoring not available: {e}")

Rust

use amdsmi::{get_cpusocket_handles, get_cpu_socket_power, get_cpu_socket_energy};
use amdsmi::{get_cpu_socket_temperature, get_cpucore_handles};

// Monitor CPU sockets
match get_cpusocket_handles() {
    Ok(socket_handles) => {
        println!("Found {} CPU socket(s)", socket_handles.len());
        
        for (i, socket_handle) in socket_handles.iter().enumerate() {
            println!("CPU Socket {}:", i);
            
            // Get power consumption
            if let Ok(power) = get_cpu_socket_power(*socket_handle) {
                println!("  Power: {}W", power);
            }
            
            // Get energy consumption
            if let Ok(energy) = get_cpu_socket_energy(*socket_handle) {
                println!("  Energy: {} μJ", energy);
            }
            
            // Get temperature
            if let Ok(temp) = get_cpu_socket_temperature(*socket_handle) {
                // Convert to floating point so the fractional part is kept
                println!("  Temperature: {}°C", temp as f64 / 1000.0);
            }
            
            // Get core handles
            if let Ok(core_handles) = get_cpucore_handles(*socket_handle) {
                println!("  CPU Cores: {}", core_handles.len());
            }
        }
    },
    Err(e) => println!("CPU monitoring not available: {:?}", e),
}

CPU Management Requirements

  1. ESMI Support: Requires AMD EPYC processors with ESMI support
  2. Driver Requirements: Needs HSMP driver loaded and accessible
  3. Permissions: May require elevated privileges for some operations
  4. Platform Support: Currently limited to AMD EPYC server platforms
  5. Initialization: Must initialize with AMDSMI_INIT_AMD_CPUS flag
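The driver and permission requirements above can be checked before initializing the library. The Python sketch below is a minimal preflight check, assuming the HSMP driver exposes its character device at `/dev/hsmp` (as the upstream Linux `amd_hsmp` driver does). The path may differ on your system, and `hsmp_device_status` is a hypothetical helper.

```python
import os


def hsmp_device_status(path="/dev/hsmp"):
    """Return 'missing', 'no-permission', or 'ok' for the HSMP device node."""
    if not os.path.exists(path):
        return "missing"
    # Read and write access are both checked as a conservative assumption.
    if not os.access(path, os.R_OK | os.W_OK):
        return "no-permission"
    return "ok"


print(hsmp_device_status())
```

A result of "missing" suggests the driver is not loaded or the platform lacks HSMP support, while "no-permission" suggests the process needs elevated privileges.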

CPU Management Best Practices

  1. Energy Monitoring: Use energy counters for accurate power profiling
  2. Thermal Monitoring: Monitor PROCHOT status to detect thermal throttling
  3. Core-Level Granularity: Use per-core monitoring for detailed analysis
  4. Driver Compatibility: Check HSMP driver version for feature availability
  5. Platform Validation: Verify ESMI support before attempting CPU operations

Install with Tessl CLI

npx tessl i tessl/go-amdsmi
