tessl/golang-github-com-rocm-amdsmi

AMD System Management Interface Go library for monitoring and managing AMD GPUs and EPYC CPUs

Workspace: tessl
Visibility: Public
Describes: pkg:golang/github.com/ROCm/amdsmi@v7.1.1

To install, run

npx @tessl/cli install tessl/golang-github-com-rocm-amdsmi@7.1.0


AMD SMI Go Interface (goamdsmi)

The AMD System Management Interface (AMD SMI) Go library provides a Go interface for monitoring and managing AMD GPUs and EPYC CPUs in high-performance computing environments. This library wraps the native AMD SMI C library through CGo, offering access to hardware telemetry data including temperature, power consumption, utilization metrics, clock frequencies, and energy measurements.

Package Information

  • Package Name: goamdsmi
  • Package Type: Go library
  • Language: Go (CGo)
  • Import Path: github.com/ROCm/amdsmi
  • Version: 7.1.1
  • Requirements: Go 1.20+, amdgpu driver, ROCm installation
  • Documentation: https://rocm.docs.amd.com/projects/amdsmi/en/latest/how-to/amdsmi-go-lib.html

Core Import

import goamdsmi "github.com/ROCm/amdsmi"

Basic Usage

package main

import "C" // needed for C.GoString below

import (
    "fmt"
    "unsafe"

    goamdsmi "github.com/ROCm/amdsmi"
)

func main() {
    // Initialize GPU subsystem
    if !goamdsmi.GO_gpu_init() {
        fmt.Println("Failed to initialize GPU")
        return
    }
    defer goamdsmi.GO_gpu_shutdown()

    // Get number of GPUs
    numGPUs := goamdsmi.GO_gpu_num_monitor_devices()
    fmt.Printf("Found %d GPU(s)\n", numGPUs)

    // Query GPU metrics
    for i := 0; i < int(numGPUs); i++ {
        // The name is returned as a C string. cgo types are local to each
        // package, so re-type the pointer before converting it to a Go string.
        name := C.GoString((*C.char)(unsafe.Pointer(goamdsmi.GO_gpu_dev_name_get(i))))
        power := goamdsmi.GO_gpu_dev_power_get(i)
        temp := goamdsmi.GO_gpu_dev_temp_metric_get(i, 1, 0)

        fmt.Printf("GPU %d: %s, Power: %d, Temp: %d\n", i, name, power, temp)
    }
}

Architecture

The goamdsmi package is a single-file Go library that uses CGo to interface with the AMD SMI C library (libgoamdsmi_shim64). All functions follow a consistent pattern:

  • Initialization required: Call GO_gpu_init() or GO_cpu_init() before using respective functions
  • Index-based access: Functions take integer indices to specify which device to query
  • CGo return types: Many functions return CGo types (C.uint32_t, C.uint64_t, *C.char) that need conversion before they can be used as native Go values
  • Error signaling: Functions return sentinel values on error (e.g., 0xFFFFFFFF, false, "NA")
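The four points above can be sketched as one collection loop. This is a minimal illustration, not the library's own code: the backend struct and collectPower function are hypothetical names, and the function fields stand in for the package-level calls so the sketch runs without a GPU. It also assumes the power reading has already been converted to uint64.

```go
package main

import (
	"errors"
	"fmt"
)

// backend is a hypothetical stand-in for the package-level functions
// (GO_gpu_init, GO_gpu_shutdown, GO_gpu_num_monitor_devices, and a power
// getter whose result has been converted to uint64).
type backend struct {
	init     func() bool
	shutdown func() bool
	count    func() uint
	power    func(i int) uint64
}

// errU64 is the 64-bit error sentinel described in this document.
const errU64 = uint64(0xFFFFFFFFFFFFFFFF)

// collectPower initializes the backend, walks every device index, and keeps
// only the readings that are not the error sentinel.
func collectPower(b backend) (map[int]uint64, error) {
	if !b.init() {
		return nil, errors.New("GPU initialization failed")
	}
	defer b.shutdown()

	readings := make(map[int]uint64)
	for i := 0; i < int(b.count()); i++ {
		if p := b.power(i); p != errU64 {
			readings[i] = p
		}
	}
	return readings, nil
}

func main() {
	// Fake backend: two devices, the second one reports an error.
	fake := backend{
		init:     func() bool { return true },
		shutdown: func() bool { return true },
		count:    func() uint { return 2 },
		power: func(i int) uint64 {
			if i == 1 {
				return errU64
			}
			return 150
		},
	}
	readings, err := collectPower(fake)
	fmt.Println(readings, err)
}
```

With the real package, the fields would presumably be wired to the corresponding GO_gpu_* functions, with the power getter wrapped in a small closure that performs the numeric conversion.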

Capabilities

GPU Management

Monitor and query AMD GPU devices including initialization, device enumeration, hardware information, power metrics, temperature monitoring, performance levels, clock frequencies, utilization, and memory usage.

// Initialization
func GO_gpu_init() bool
func GO_gpu_shutdown() bool

// Device Enumeration
func GO_gpu_num_monitor_devices() uint

// Device Information
func GO_gpu_dev_name_get(i int) *C.char
func GO_gpu_dev_id_get(i int) C.uint16_t
func GO_gpu_dev_pci_id_get(i int) C.uint64_t
func GO_gpu_dev_vbios_version_get(i int) *C.char
func GO_gpu_dev_vendor_name_get(i int) *C.char


CPU Management

Monitor AMD EPYC CPU metrics including initialization, topology information, core energy, boost limits, socket energy, power, and PROCHOT status.

// Initialization
func GO_cpu_init() bool

// CPU Topology
func GO_cpu_number_of_sockets_get() uint
func GO_cpu_number_of_threads_get() uint
func GO_cpu_threads_per_core_get() uint

// Core Metrics
func GO_cpu_core_energy_get(i int) C.uint64_t
func GO_cpu_core_boostlimit_get(i int) C.uint32_t


Error Handling

Functions return sentinel values to indicate errors:

  • bool functions return false on error
  • uint functions return 0 on error
  • C.uint16_t functions return 0xFFFF on error
  • C.uint32_t functions return 0xFFFFFFFF on error
  • C.uint64_t functions return 0xFFFFFFFFFFFFFFFF on error
  • *C.char functions return the string "NA" on error

Always check return values against these sentinel values to detect errors.
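One way to apply the sentinel table consistently is to wrap each width in a small helper that returns the value together with an ok flag. The helper names below are hypothetical and are not part of goamdsmi. They operate on native Go types, so values returned by the library would be converted first.

```go
package main

import (
	"fmt"
	"math"
)

// Each helper returns the value unchanged plus ok=false when the value is
// the error sentinel for its width.
func checkU16(v uint16) (uint16, bool) { return v, v != math.MaxUint16 }
func checkU32(v uint32) (uint32, bool) { return v, v != math.MaxUint32 }
func checkU64(v uint64) (uint64, bool) { return v, v != math.MaxUint64 }

// checkString treats the literal "NA" as the error marker for strings.
func checkString(s string) (string, bool) { return s, s != "NA" }

func main() {
	if _, ok := checkU32(0xFFFFFFFF); !ok {
		fmt.Println("32-bit query failed")
	}
	if v, ok := checkU64(41000000); ok {
		fmt.Println("64-bit value:", v)
	}
	if _, ok := checkString("NA"); !ok {
		fmt.Println("string query failed")
	}
}
```

Returning (value, ok) mirrors the comma-ok idiom used by Go maps and type assertions, which keeps the sentinel comparison in one place.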

CGo Type Conversions

The package returns CGo types that often need conversion to native Go types:

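// Converting C string to Go string. cgo types are local to the package that
// declares them, so the returned pointer is re-typed through unsafe.Pointer
// (the calling file needs import "C" and "unsafe").
name := C.GoString((*C.char)(unsafe.Pointer(goamdsmi.GO_gpu_dev_name_get(0))))

// CGo numeric types can be cast to Go types
deviceID := uint16(goamdsmi.GO_gpu_dev_id_get(0))
power := uint64(goamdsmi.GO_gpu_dev_power_get(0))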
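The order of conversion and error checking matters when a value is widened. The sketch below uses plain uint32 as a stand-in for C.uint32_t and a hypothetical helper, widen32. It shows that a 32-bit sentinel converted to uint64 no longer matches the 64-bit sentinel, so the check should happen at the original width.

```go
package main

import "fmt"

const (
	sentinel32 = uint32(0xFFFFFFFF)
	sentinel64 = uint64(0xFFFFFFFFFFFFFFFF)
)

// widen32 checks the 32-bit sentinel first and only then converts to uint64.
func widen32(raw uint32) (uint64, bool) {
	if raw == sentinel32 {
		return 0, false
	}
	return uint64(raw), true
}

func main() {
	raw := sentinel32 // pretend a 32-bit getter reported an error

	// Widening zero-extends, so the result is 0x00000000FFFFFFFF.
	wide := uint64(raw)
	fmt.Println(wide == sentinel64) // prints false: the error would go unnoticed

	_, ok := widen32(raw)
	fmt.Println(ok) // prints false: the error is detected before widening
}
```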

Dependencies

This library requires:

  • amdgpu driver: Must be loaded for initialization to succeed
  • ROCm installation: Libraries must be available in /opt/rocm/lib or /opt/rocm/lib64
  • LD_LIBRARY_PATH: Must include ROCm library paths

Setup:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/rocm/lib:/opt/rocm/lib64
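A program can verify this setup before calling any initialization function. The following is a minimal sketch for Linux: missingLibDirs is a hypothetical helper, and the two directories are the ones named above. It only inspects the environment variable and does not confirm that the shared libraries actually exist on disk.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// rocmLibDirs lists the library directories named in this document.
var rocmLibDirs = []string{"/opt/rocm/lib", "/opt/rocm/lib64"}

// missingLibDirs returns the ROCm directories that do not appear in the
// given LD_LIBRARY_PATH value.
func missingLibDirs(ldLibraryPath string) []string {
	present := make(map[string]bool)
	for _, dir := range filepath.SplitList(ldLibraryPath) {
		if dir != "" { // skip empty entries such as a leading ':'
			present[filepath.Clean(dir)] = true
		}
	}
	var missing []string
	for _, want := range rocmLibDirs {
		if !present[want] {
			missing = append(missing, want)
		}
	}
	return missing
}

func main() {
	if missing := missingLibDirs(os.Getenv("LD_LIBRARY_PATH")); len(missing) > 0 {
		fmt.Println("LD_LIBRARY_PATH is missing:", missing)
		return
	}
	fmt.Println("LD_LIBRARY_PATH includes the ROCm library directories")
}
```

Running a check like this first can turn a bare false from an init call into a more actionable message.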