# Process and System Information

Process monitoring, system-level GPU usage information, and multi-process GPU utilization tracking for comprehensive system management.

## Capabilities

### GPU Process List

Get a list of processes currently running on a specific GPU, including process handles for detailed information retrieval.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_process_list(amdsmi_processor_handle processor_handle, uint32_t *max_processes, amdsmi_process_handle_t *list);
```

**Parameters:**

- `processor_handle`: Handle to the GPU processor
- `max_processes`: As input, maximum number of process handles. As output, actual number available or written.
- `list`: Pointer to array of process handles, or NULL to query count only

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

**Usage Example:**

```c
// Assumes <stdio.h>, <stdlib.h>, and the AMD SMI header are included,
// and that `processor` is a valid GPU processor handle.

// First get the count of processes
uint32_t num_processes = 0;
amdsmi_status_t ret = amdsmi_get_gpu_process_list(processor, &num_processes, NULL);
if (ret == AMDSMI_STATUS_SUCCESS && num_processes > 0) {
    // Allocate memory and get process handles
    amdsmi_process_handle_t *processes =
        malloc(num_processes * sizeof(amdsmi_process_handle_t));

    if (processes != NULL) {
        ret = amdsmi_get_gpu_process_list(processor, &num_processes, processes);
        if (ret == AMDSMI_STATUS_SUCCESS) {
            printf("Found %u processes using GPU\n", num_processes);
            // Use process handles to get detailed information
        }
        free(processes);
    }
}
```

### GPU Process Information

Get detailed information about a specific process running on a GPU.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_process_info(amdsmi_processor_handle processor_handle, amdsmi_process_handle_t process, amdsmi_proc_info_t *info);
```

**Parameters:**

- `processor_handle`: Handle to the GPU processor
- `process`: Handle to the process to query
- `info`: Pointer to receive process information

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

**Usage Example:**

```c
// uint64_t fields are cast to unsigned long long to match the %llu specifier
amdsmi_proc_info_t proc_info;
amdsmi_status_t ret = amdsmi_get_gpu_process_info(processor, process_handle, &proc_info);
if (ret == AMDSMI_STATUS_SUCCESS) {
    printf("Process Information:\n");
    printf(" Name: %s\n", proc_info.name);
    printf(" PID: %u\n", proc_info.pid);
    printf(" Memory Usage: %llu MB\n", (unsigned long long)(proc_info.mem / (1024*1024)));
    printf(" Container: %s\n", proc_info.container_name);

    // Engine usage in nanoseconds
    printf(" GFX Engine Time: %llu ns\n", (unsigned long long)proc_info.engine_usage.gfx);
    printf(" Encoder Engine Time: %llu ns\n", (unsigned long long)proc_info.engine_usage.enc);

    // Memory usage by type
    printf(" GTT Memory: %llu MB\n", (unsigned long long)(proc_info.memory_usage.gtt_mem / (1024*1024)));
    printf(" CPU Memory: %llu MB\n", (unsigned long long)(proc_info.memory_usage.cpu_mem / (1024*1024)));
    printf(" VRAM Memory: %llu MB\n", (unsigned long long)(proc_info.memory_usage.vram_mem / (1024*1024)));
}
```

### System-Wide Compute Process Information

Get information about all compute processes currently using any GPU in the system.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_compute_process_info(amdsmi_process_info_t *procs, uint32_t *num_items);
```

**Parameters:**

- `procs`: Pointer to array of process info structures, or NULL to query count only
- `num_items`: As input, maximum number of process info structures. As output, actual number available or written.

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

**Usage Example:**

```c
// Get count of compute processes
uint32_t num_compute_procs = 0;
amdsmi_status_t ret = amdsmi_get_gpu_compute_process_info(NULL, &num_compute_procs);
if (ret == AMDSMI_STATUS_SUCCESS && num_compute_procs > 0) {
    // Allocate and get process information
    amdsmi_process_info_t *compute_procs =
        malloc(num_compute_procs * sizeof(amdsmi_process_info_t));

    if (compute_procs != NULL) {
        ret = amdsmi_get_gpu_compute_process_info(compute_procs, &num_compute_procs);
        if (ret == AMDSMI_STATUS_SUCCESS) {
            printf("System-wide compute processes (%u):\n", num_compute_procs);
            for (uint32_t i = 0; i < num_compute_procs; i++) {
                // uint64_t fields are cast to match the %llu specifier
                printf(" PID %u: VRAM %llu MB, SDMA %llu μs, CU %u%%\n",
                       compute_procs[i].process_id,
                       (unsigned long long)(compute_procs[i].vram_usage / (1024*1024)),
                       (unsigned long long)compute_procs[i].sdma_usage,
                       compute_procs[i].cu_occupancy);
            }
        }
        free(compute_procs);
    }
}
```

### Process Information by PID

Get compute process information for a specific process ID.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_compute_process_info_by_pid(uint32_t pid, amdsmi_process_info_t *proc);
```

**Parameters:**

- `pid`: Process ID to query
- `proc`: Pointer to receive process information

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

### GPUs Used by Process

Get the list of GPU device indices that a specific process is currently using.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_compute_process_gpus(uint32_t pid, uint32_t *dv_indices, uint32_t *num_devices);
```

**Parameters:**

- `pid`: Process ID to query
- `dv_indices`: Pointer to array of device indices, or NULL to query count only
- `num_devices`: As input, maximum number of device indices. As output, actual number available or written.

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

**Usage Example:**

```c
uint32_t pid = 12345; // Example PID
uint32_t num_devices = 0;

// Get count of GPUs used by process
amdsmi_status_t ret = amdsmi_get_gpu_compute_process_gpus(pid, NULL, &num_devices);
if (ret == AMDSMI_STATUS_SUCCESS && num_devices > 0) {
    // Allocate and get device indices
    uint32_t *device_indices = malloc(num_devices * sizeof(uint32_t));
    if (device_indices != NULL) {
        ret = amdsmi_get_gpu_compute_process_gpus(pid, device_indices, &num_devices);
        if (ret == AMDSMI_STATUS_SUCCESS) {
            printf("Process %u is using %u GPUs: ", pid, num_devices);
            for (uint32_t i = 0; i < num_devices; i++) {
                printf("GPU%u ", device_indices[i]);
            }
            printf("\n");
        }
        free(device_indices);
    }
}
```

## Python API

### GPU Process List

```python { .api }
def amdsmi_get_gpu_process_list(processor_handle):
    """
    Get list of processes running on a GPU.

    Args:
        processor_handle: GPU processor handle

    Returns:
        list: List of process handle objects

    Raises:
        AmdSmiException: If process list query fails
    """
```

### GPU Process Information

```python { .api }
def amdsmi_get_gpu_process_info(processor_handle, process_handle):
    """
    Get detailed information about a GPU process.

    Args:
        processor_handle: GPU processor handle
        process_handle: Process handle object

    Returns:
        dict: Process info with keys 'name', 'pid', 'mem', 'container_name',
              'engine_usage' (dict with 'gfx', 'enc'),
              'memory_usage' (dict with 'gtt_mem', 'cpu_mem', 'vram_mem')

    Raises:
        AmdSmiException: If process info query fails
    """
```

### System Compute Processes

```python { .api }
def amdsmi_get_gpu_compute_process_info():
    """
    Get information about all compute processes using GPUs.

    Returns:
        list: List of process info dicts with keys 'process_id', 'pasid',
              'vram_usage', 'sdma_usage', 'cu_occupancy'

    Raises:
        AmdSmiException: If compute process query fails
    """
```
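
As a rough illustration of how the returned list might be consumed, the sketch below ranks compute processes by VRAM usage. The helper name `top_vram_consumers` and the sample values are hypothetical; only the dict keys come from the docstring above. The list is passed in as an argument so the helper can be tried without a GPU.

```python
def top_vram_consumers(compute_processes, limit=3):
    """Return up to `limit` (pid, vram_mib) pairs, largest VRAM usage first."""
    ranked = sorted(compute_processes, key=lambda p: p["vram_usage"], reverse=True)
    return [(p["process_id"], p["vram_usage"] // (1024 * 1024)) for p in ranked[:limit]]


# Made-up sample shaped like the documented return value (not real measurements)
sample = [
    {"process_id": 101, "pasid": 1, "vram_usage": 512 * 1024 * 1024,
     "sdma_usage": 0, "cu_occupancy": 10},
    {"process_id": 202, "pasid": 2, "vram_usage": 2048 * 1024 * 1024,
     "sdma_usage": 5, "cu_occupancy": 60},
]
print(top_vram_consumers(sample))
```

With the real library, the argument would presumably be the result of `amdsmi.amdsmi_get_gpu_compute_process_info()`.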

### Process Information by PID

```python { .api }
def amdsmi_get_gpu_compute_process_info_by_pid(pid):
    """
    Get compute process information for a specific PID.

    Args:
        pid (int): Process ID to query

    Returns:
        dict: Process info with keys 'process_id', 'pasid', 'vram_usage',
              'sdma_usage', 'cu_occupancy'

    Raises:
        AmdSmiException: If process query fails
    """
```

### GPUs Used by Process

```python { .api }
def amdsmi_get_gpu_compute_process_gpus(pid):
    """
    Get list of GPU indices used by a process.

    Args:
        pid (int): Process ID to query

    Returns:
        list: List of GPU device indices

    Raises:
        AmdSmiException: If GPU query fails
    """
```

**Python Usage Example:**

```python
import amdsmi

# Initialize and get GPU handle
amdsmi.amdsmi_init()

try:
    sockets = amdsmi.amdsmi_get_socket_handles()
    processors = amdsmi.amdsmi_get_processor_handles(sockets[0])
    gpu = processors[0]

    # Get processes running on specific GPU
    process_handles = amdsmi.amdsmi_get_gpu_process_list(gpu)
    print(f"Found {len(process_handles)} processes on GPU")

    for i, proc_handle in enumerate(process_handles):
        proc_info = amdsmi.amdsmi_get_gpu_process_info(gpu, proc_handle)
        print(f"Process {i+1}: {proc_info['name']} (PID: {proc_info['pid']})")
        print(f" Memory: {proc_info['mem'] // (1024*1024)} MB")
        print(f" VRAM: {proc_info['memory_usage']['vram_mem'] // (1024*1024)} MB")
        print(f" Container: {proc_info['container_name']}")

    # Get system-wide compute processes
    compute_processes = amdsmi.amdsmi_get_gpu_compute_process_info()
    print(f"\nSystem-wide compute processes: {len(compute_processes)}")

    for proc in compute_processes:
        print(f"PID {proc['process_id']}: "
              f"VRAM {proc['vram_usage'] // (1024*1024)} MB, "
              f"CU {proc['cu_occupancy']}%")

        # Get which GPUs this process is using
        gpu_indices = amdsmi.amdsmi_get_gpu_compute_process_gpus(proc['process_id'])
        print(f" Using GPUs: {gpu_indices}")

finally:
    amdsmi.amdsmi_shut_down()
```
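
The per-process dict carries raw byte and nanosecond values, so display code tends to repeat the same conversions. The following is a minimal sketch of a helper that normalizes one such dict; `summarize_process`, the `"(none)"` placeholder for an empty container name, and the sample values are assumptions made for illustration, not part of the library.

```python
MIB = 1024 * 1024


def summarize_process(proc_info):
    """Convert a process-info dict to MiB and seconds for display."""
    mem = proc_info["memory_usage"]
    eng = proc_info["engine_usage"]
    return {
        "name": proc_info["name"],
        "pid": proc_info["pid"],
        # Assumption: an empty string means the process is not containerized
        "container": proc_info["container_name"] or "(none)",
        "mem_mib": proc_info["mem"] // MIB,
        "vram_mib": mem["vram_mem"] // MIB,
        "gtt_mib": mem["gtt_mem"] // MIB,
        "cpu_mib": mem["cpu_mem"] // MIB,
        "gfx_seconds": eng["gfx"] / 1e9,  # engine times are in nanoseconds
        "enc_seconds": eng["enc"] / 1e9,
    }


# Made-up sample shaped like the dict described in the Python API section
sample_info = {
    "name": "trainer",
    "pid": 4242,
    "mem": 3 * MIB,
    "container_name": "",
    "engine_usage": {"gfx": 2_500_000_000, "enc": 0},
    "memory_usage": {"gtt_mem": 1 * MIB, "cpu_mem": 0, "vram_mem": 2 * MIB},
}
summary = summarize_process(sample_info)
print(summary)
```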

## Types

### Process Information Structure (Detailed)

```c { .api }
typedef struct {
    char name[AMDSMI_NORMAL_STRING_LENGTH];           // Process name
    amdsmi_process_handle_t pid;                      // Process ID
    uint64_t mem;                                     // Memory usage in bytes
    struct {
        uint64_t gfx;                                 // GFX engine time (ns)
        uint64_t enc;                                 // Encoder engine time (ns)
        uint32_t reserved[12];                        // Reserved
    } engine_usage;
    struct {
        uint64_t gtt_mem;                             // GTT memory usage (bytes)
        uint64_t cpu_mem;                             // CPU memory usage (bytes)
        uint64_t vram_mem;                            // VRAM usage (bytes)
        uint32_t reserved[10];                        // Reserved
    } memory_usage;
    char container_name[AMDSMI_NORMAL_STRING_LENGTH]; // Container name
    uint32_t reserved[4];                             // Reserved
} amdsmi_proc_info_t;
```

### Process Information Structure (System-wide)

```c { .api }
typedef struct {
    uint32_t process_id;    // Process ID
    uint32_t pasid;         // Process Address Space ID
    uint64_t vram_usage;    // VRAM usage in bytes
    uint64_t sdma_usage;    // SDMA usage in microseconds
    uint32_t cu_occupancy;  // Compute Unit usage percentage
} amdsmi_process_info_t;
```

### Process Handle Type

```c { .api }
typedef uint32_t amdsmi_process_handle_t; // Process handle type
```

## Process Monitoring Workflow

A comprehensive process monitoring workflow includes:

1. **System Overview**: Use `amdsmi_get_gpu_compute_process_info()` to get system-wide GPU usage
2. **GPU-Specific Processes**: Use `amdsmi_get_gpu_process_list()` for specific GPU monitoring
3. **Detailed Process Info**: Use `amdsmi_get_gpu_process_info()` for in-depth process analysis
4. **Cross-Reference**: Use `amdsmi_get_gpu_compute_process_gpus()` to map processes to GPUs
5. **Targeted Monitoring**: Use `amdsmi_get_gpu_compute_process_info_by_pid()` for specific processes
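
The cross-reference step can be sketched as an inversion of the per-PID GPU lists into a per-GPU list of PIDs. This is only an illustration: `map_gpus_to_pids` is a hypothetical helper, the per-PID lookup is injected as a callable so the example runs without hardware, and the broad `except Exception` stands in for whatever error the real call raises when a process has already exited.

```python
from collections import defaultdict


def map_gpus_to_pids(compute_processes, gpus_for_pid):
    """Return {gpu_index: sorted list of PIDs} for the given compute processes.

    gpus_for_pid is a callable taking a PID and returning GPU indices; with the
    real library it would presumably be amdsmi.amdsmi_get_gpu_compute_process_gpus.
    """
    gpu_to_pids = defaultdict(list)
    for proc in compute_processes:
        pid = proc["process_id"]
        try:
            indices = gpus_for_pid(pid)
        except Exception:
            # The process may have exited between the two queries; skip it.
            continue
        for index in indices:
            gpu_to_pids[index].append(pid)
    return {index: sorted(pids) for index, pids in sorted(gpu_to_pids.items())}


# Stand-in data: PID 303 is listed system-wide but has no entry in the lookup,
# which mimics a process that exited before it could be queried.
fake_gpu_table = {101: [0], 202: [0, 1]}
fake_processes = [{"process_id": 202}, {"process_id": 101}, {"process_id": 303}]
gpu_map = map_gpus_to_pids(fake_processes, lambda pid: fake_gpu_table[pid])
print(gpu_map)
```

Injecting the lookup keeps the inversion logic separate from the library call, which also makes it easy to swap in a narrower exception type once the real failure mode is known.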

## Important Notes

1. **Process Lifetime**: A process handle is valid only while the process is running, so handles should not be cached across queries.

2. **Units**:
   - Memory usage values are in bytes
   - Engine usage times are in nanoseconds
   - SDMA usage is in microseconds

3. **Multi-GPU Processes**: A single process can use multiple GPUs simultaneously.

4. **Container Support**: The library provides container name information for containerized workloads.

5. **Real-time Data**: Process information reflects the state at the time of the call and can change rapidly between calls.

6. **Permission Requirements**: Some process information may require elevated privileges to access.

7. **Engine Usage**: Engine usage is the cumulative time a process has spent on a specific GPU engine, which is useful for understanding workload patterns.

8. **Memory Types**: The memory types differ in purpose and performance: VRAM is on-device memory, GTT is system memory mapped for GPU access, and CPU memory is ordinary system memory.

9. **System Impact**: Process monitoring functions are lightweight and suitable for frequent polling in monitoring applications.
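
Because engine usage is cumulative (note 7), a utilization figure has to be derived from the difference between two samples. The sketch below shows one way to do that; `engine_utilization` is a hypothetical helper, and clamping at 100% and returning `None` when the counter goes backwards are assumptions of this example rather than documented library behavior.

```python
def engine_utilization(prev_ns, curr_ns, interval_s):
    """Estimate engine busy percentage from two cumulative nanosecond readings."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta_ns = curr_ns - prev_ns
    if delta_ns < 0:
        # Counter went backwards (for example, the PID was reused); no estimate.
        return None
    # Clamp in case the reported engine time exceeds the wall-clock interval
    return min(100.0, 100.0 * delta_ns / (interval_s * 1e9))


# Two made-up 'gfx' readings taken one second apart
busy = engine_utilization(1_000_000_000, 1_500_000_000, 1.0)
print(f"GFX busy: {busy:.1f}%")
```

In a polling loop, `prev_ns` and `curr_ns` would come from successive reads of `engine_usage['gfx']` (or `'enc'`) for the same process, with `interval_s` measured by a monotonic clock.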