AMD System Management Interface Go library for monitoring and managing AMD GPUs and EPYC CPUs
The AMD System Management Interface (AMD SMI) Go library provides a Go interface for monitoring and managing AMD GPUs and EPYC CPUs in high-performance computing environments. This library wraps the native AMD SMI C library through CGo, offering access to hardware telemetry data including temperature, power consumption, utilization metrics, clock frequencies, and energy measurements.
Import path: github.com/ROCm/amdsmi

import "github.com/ROCm/amdsmi"

package main
import "C" // needed for C.GoString and C.char

import (
	"fmt"
	"unsafe"

	goamdsmi "github.com/ROCm/amdsmi"
)

func main() {
	// Initialize the GPU subsystem
	if !goamdsmi.GO_gpu_init() {
		fmt.Println("Failed to initialize GPU")
		return
	}
	defer goamdsmi.GO_gpu_shutdown()

	// Get the number of GPUs
	numGPUs := goamdsmi.GO_gpu_num_monitor_devices()
	fmt.Printf("Found %d GPU(s)\n", numGPUs)

	// Query GPU metrics
	for i := 0; i < int(numGPUs); i++ {
		// cgo types are package-local, so re-cast the returned *C.char
		// to this package's *C.char before converting it to a Go string.
		name := C.GoString((*C.char)(unsafe.Pointer(goamdsmi.GO_gpu_dev_name_get(i))))
		power := goamdsmi.GO_gpu_dev_power_get(i)
		temp := goamdsmi.GO_gpu_dev_temp_metric_get(i, 1, 0)
		fmt.Printf("GPU %d: %s, Power: %d, Temp: %d\n", i, name, power, temp)
	}
}

The goamdsmi package is a single-file Go library that uses CGo to interface with the AMD SMI C library (libgoamdsmi_shim64). All functions follow a consistent pattern:

- Initialization: call GO_gpu_init() or GO_cpu_init() before using the respective functions
- Return types: functions return CGo types (C.uint32_t, C.uint64_t, *C.char) that may need conversion
- Errors: failures are reported through sentinel values (0xFFFFFFFF, false, "NA")

Monitor and query AMD GPU devices, including initialization, device enumeration, hardware information, power metrics, temperature monitoring, performance levels, clock frequencies, utilization, and memory usage.
// Initialization
func GO_gpu_init() bool
func GO_gpu_shutdown() bool
// Device Enumeration
func GO_gpu_num_monitor_devices() uint
// Device Information
func GO_gpu_dev_name_get(i int) *C.char
func GO_gpu_dev_id_get(i int) C.uint16_t
func GO_gpu_dev_pci_id_get(i int) C.uint64_t
func GO_gpu_dev_vbios_version_get(i int) *C.char
func GO_gpu_dev_vendor_name_get(i int) *C.char

Monitor AMD EPYC CPU metrics, including initialization, topology information, core energy, boost limits, socket energy, power, and PROCHOT status.
// Initialization
func GO_cpu_init() bool
// CPU Topology
func GO_cpu_number_of_sockets_get() uint
func GO_cpu_number_of_threads_get() uint
func GO_cpu_threads_per_core_get() uint
// Core Metrics
func GO_cpu_core_energy_get(i int) C.uint64_t
func GO_cpu_core_boostlimit_get(i int) C.uint32_t

Functions return sentinel values to indicate errors:

- bool functions return false on error
- uint functions return 0 on error
- C.uint16_t functions return 0xFFFF on error
- C.uint32_t functions return 0xFFFFFFFF on error
- C.uint64_t functions return 0xFFFFFFFFFFFFFFFF on error
- *C.char functions return the string "NA" on error

Always check return values against these sentinel values to detect errors.
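The sentinel checks above can be wrapped in small helpers that return an explicit ok flag. The following is a minimal sketch and not part of goamdsmi: the helper names (checkU16, checkU32, checkU64, checkString) are hypothetical, and they take plain Go values, so CGo results would first need converting (for example uint32(...) on a C.uint32_t).

```go
package main

import (
	"fmt"
	"math"
)

// Hypothetical helpers (not part of goamdsmi). Each reports ok=false when
// the value equals the documented error sentinel for its width.

func checkU16(v uint16) (uint16, bool) { return v, v != math.MaxUint16 }

func checkU32(v uint32) (uint32, bool) { return v, v != math.MaxUint32 }

func checkU64(v uint64) (uint64, bool) { return v, v != math.MaxUint64 }

// checkString treats the documented "NA" string as an error.
func checkString(s string) (string, bool) { return s, s != "NA" }

func main() {
	// Stand-in values; in real use these would be converted goamdsmi
	// results, e.g. uint32(goamdsmi.GO_cpu_core_boostlimit_get(0)).
	if v, ok := checkU32(0xFFFFFFFF); ok {
		fmt.Println("boost limit:", v)
	} else {
		fmt.Println("boost limit unavailable")
	}

	if name, ok := checkString("NA"); ok {
		fmt.Println("device:", name)
	} else {
		fmt.Println("device name unavailable")
	}
}
```

The uint case is left out on purpose: 0 can also be a plausible real reading (for example, no GPUs present), so a zero from those functions may need to be interpreted in context rather than treated as an error outright.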
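The package returns CGo types that often need conversion to native Go types:

// Converting a C string to a Go string (the calling file needs import "C" and "unsafe").
// cgo types are package-local, so the *C.char returned by goamdsmi must be
// re-cast to the caller's own *C.char before C.GoString accepts it.
name := C.GoString((*C.char)(unsafe.Pointer(goamdsmi.GO_gpu_dev_name_get(0))))

// CGo numeric types can be converted directly to Go types
deviceID := uint16(goamdsmi.GO_gpu_dev_id_get(0))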
power := uint64(goamdsmi.GO_gpu_dev_power_get(0))

This library requires the native AMD SMI shared libraries to be installed under /opt/rocm/lib or /opt/rocm/lib64.

Setup:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/rocm/lib:/opt/rocm/lib64
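Before calling the init functions, it can help to confirm that the shim library is actually present in one of those directories. The sketch below is only an illustration under stated assumptions: findShim is a hypothetical helper (not part of goamdsmi), and the prefix libgoamdsmi_shim64 assumes the installed file name starts with the library name mentioned earlier. Adjust it to whatever your installation ships.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// findShim is a hypothetical helper (not part of goamdsmi). It returns the
// first file found in dirs whose name starts with prefix, or "" if no
// directory contains a match.
func findShim(dirs []string, prefix string) string {
	for _, dir := range dirs {
		matches, err := filepath.Glob(filepath.Join(dir, prefix+"*"))
		if err != nil || len(matches) == 0 {
			continue // unreadable or empty: try the next directory
		}
		return matches[0]
	}
	return ""
}

func main() {
	dirs := []string{"/opt/rocm/lib", "/opt/rocm/lib64"}
	if path := findShim(dirs, "libgoamdsmi_shim64"); path != "" {
		fmt.Println("found shim library:", path)
	} else {
		fmt.Println("shim library not found in", dirs)
	}
}
```

A check like this only confirms the file exists. The dynamic loader still needs LD_LIBRARY_PATH (or an equivalent loader configuration) to point at the directory.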