# Process and System Information

APIs for monitoring GPU processes: per-GPU process lists, detailed per-process information, and system-wide compute process usage across multiple GPUs.

## Capabilities

### GPU Process List

Get a list of processes currently running on a specific GPU. The returned process handles can be passed to `amdsmi_get_gpu_process_info()` for details.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_process_list(amdsmi_processor_handle processor_handle, uint32_t *max_processes, amdsmi_process_handle_t *list);
```

**Parameters:**
- `processor_handle`: Handle to the GPU processor
- `max_processes`: On input, the capacity of `list` in process handles. On output, the number of handles available (when `list` is NULL) or written.
- `list`: Pointer to array of process handles, or NULL to query count only

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

**Usage Example:**

```c
// Assumes <stdio.h> and <stdlib.h> are included and `processor` is a valid handle.
// First get the count of processes
uint32_t num_processes = 0;
amdsmi_status_t ret = amdsmi_get_gpu_process_list(processor, &num_processes, NULL);
if (ret == AMDSMI_STATUS_SUCCESS && num_processes > 0) {
    // Allocate memory and get process handles
    amdsmi_process_handle_t *processes =
        malloc(num_processes * sizeof(amdsmi_process_handle_t));
    if (processes != NULL) {
        ret = amdsmi_get_gpu_process_list(processor, &num_processes, processes);
        if (ret == AMDSMI_STATUS_SUCCESS) {
            printf("Found %u processes using GPU\n", num_processes);
            // Use process handles to get detailed information
        }
        free(processes);
    }
}
```

### GPU Process Information

Get detailed information about a specific process running on a GPU.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_process_info(amdsmi_processor_handle processor_handle, amdsmi_process_handle_t process, amdsmi_proc_info_t *info);
```

**Parameters:**
- `processor_handle`: Handle to the GPU processor
- `process`: Handle to the process to query
- `info`: Pointer to receive process information

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

**Usage Example:**

```c
// `process_handle` is one of the handles returned by amdsmi_get_gpu_process_list().
// uint64_t fields are cast so that %llu is correct on every platform.
amdsmi_proc_info_t proc_info;
amdsmi_status_t ret = amdsmi_get_gpu_process_info(processor, process_handle, &proc_info);
if (ret == AMDSMI_STATUS_SUCCESS) {
    printf("Process Information:\n");
    printf("  Name: %s\n", proc_info.name);
    printf("  PID: %u\n", proc_info.pid);
    printf("  Memory Usage: %llu MB\n", (unsigned long long)(proc_info.mem / (1024*1024)));
    printf("  Container: %s\n", proc_info.container_name);

    // Engine usage in nanoseconds
    printf("  GFX Engine Time: %llu ns\n", (unsigned long long)proc_info.engine_usage.gfx);
    printf("  Encoder Engine Time: %llu ns\n", (unsigned long long)proc_info.engine_usage.enc);

    // Memory usage by type
    printf("  GTT Memory: %llu MB\n", (unsigned long long)(proc_info.memory_usage.gtt_mem / (1024*1024)));
    printf("  CPU Memory: %llu MB\n", (unsigned long long)(proc_info.memory_usage.cpu_mem / (1024*1024)));
    printf("  VRAM Memory: %llu MB\n", (unsigned long long)(proc_info.memory_usage.vram_mem / (1024*1024)));
}
```

### System-Wide Compute Process Information

Get information about all compute processes currently using any GPU in the system.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_compute_process_info(amdsmi_process_info_t *procs, uint32_t *num_items);
```

**Parameters:**
- `procs`: Pointer to array of process info structures, or NULL to query count only
- `num_items`: On input, the capacity of `procs` in process info structures. On output, the number of structures available (when `procs` is NULL) or written.

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

**Usage Example:**

```c
// Get count of compute processes
uint32_t num_compute_procs = 0;
amdsmi_status_t ret = amdsmi_get_gpu_compute_process_info(NULL, &num_compute_procs);
if (ret == AMDSMI_STATUS_SUCCESS && num_compute_procs > 0) {
    // Allocate and get process information
    amdsmi_process_info_t *compute_procs =
        malloc(num_compute_procs * sizeof(amdsmi_process_info_t));
    if (compute_procs != NULL) {
        ret = amdsmi_get_gpu_compute_process_info(compute_procs, &num_compute_procs);
        if (ret == AMDSMI_STATUS_SUCCESS) {
            printf("System-wide compute processes (%u):\n", num_compute_procs);
            for (uint32_t i = 0; i < num_compute_procs; i++) {
                printf("  PID %u: VRAM %llu MB, SDMA %llu μs, CU %u%%\n",
                       compute_procs[i].process_id,
                       (unsigned long long)(compute_procs[i].vram_usage / (1024*1024)),
                       (unsigned long long)compute_procs[i].sdma_usage,
                       compute_procs[i].cu_occupancy);
            }
        }
        free(compute_procs);
    }
}
```

### Process Information by PID

Get compute process information for a specific process ID.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_compute_process_info_by_pid(uint32_t pid, amdsmi_process_info_t *proc);
```

**Parameters:**
- `pid`: Process ID to query
- `proc`: Pointer to receive process information

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

### GPUs Used by Process

Get the list of GPU device indices that a specific process is currently using.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_compute_process_gpus(uint32_t pid, uint32_t *dv_indices, uint32_t *num_devices);
```

**Parameters:**
- `pid`: Process ID to query
- `dv_indices`: Pointer to array of device indices, or NULL to query count only
- `num_devices`: On input, the capacity of `dv_indices` in device indices. On output, the number of indices available (when `dv_indices` is NULL) or written.

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

**Usage Example:**

```c
uint32_t pid = 12345; // Example PID
uint32_t num_devices = 0;

// Get count of GPUs used by process
amdsmi_status_t ret = amdsmi_get_gpu_compute_process_gpus(pid, NULL, &num_devices);
if (ret == AMDSMI_STATUS_SUCCESS && num_devices > 0) {
    // Allocate and get device indices
    uint32_t *device_indices = malloc(num_devices * sizeof(uint32_t));
    if (device_indices != NULL) {
        ret = amdsmi_get_gpu_compute_process_gpus(pid, device_indices, &num_devices);
        if (ret == AMDSMI_STATUS_SUCCESS) {
            printf("Process %u is using %u GPUs: ", pid, num_devices);
            for (uint32_t i = 0; i < num_devices; i++) {
                printf("GPU%u ", device_indices[i]);
            }
            printf("\n");
        }
        free(device_indices);
    }
}
```

## Python API

### GPU Process List

```python { .api }
def amdsmi_get_gpu_process_list(processor_handle):
    """
    Get list of processes running on a GPU.

    Args:
        processor_handle: GPU processor handle

    Returns:
        list: List of process handle objects

    Raises:
        AmdSmiException: If process list query fails
    """
```

### GPU Process Information

```python { .api }
def amdsmi_get_gpu_process_info(processor_handle, process_handle):
    """
    Get detailed information about a GPU process.

    Args:
        processor_handle: GPU processor handle
        process_handle: Process handle object

    Returns:
        dict: Process info with keys 'name', 'pid', 'mem', 'container_name',
              'engine_usage' (dict with 'gfx', 'enc'),
              'memory_usage' (dict with 'gtt_mem', 'cpu_mem', 'vram_mem')

    Raises:
        AmdSmiException: If process info query fails
    """
```

### System Compute Processes

```python { .api }
def amdsmi_get_gpu_compute_process_info():
    """
    Get information about all compute processes using GPUs.

    Returns:
        list: List of process info dicts with keys 'process_id', 'pasid',
              'vram_usage', 'sdma_usage', 'cu_occupancy'

    Raises:
        AmdSmiException: If compute process query fails
    """
```

### Process Information by PID

```python { .api }
def amdsmi_get_gpu_compute_process_info_by_pid(pid):
    """
    Get compute process information for a specific PID.

    Args:
        pid (int): Process ID to query

    Returns:
        dict: Process info with keys 'process_id', 'pasid', 'vram_usage',
              'sdma_usage', 'cu_occupancy'

    Raises:
        AmdSmiException: If process query fails
    """
```

### GPUs Used by Process

```python { .api }
def amdsmi_get_gpu_compute_process_gpus(pid):
    """
    Get list of GPU indices used by a process.

    Args:
        pid (int): Process ID to query

    Returns:
        list: List of GPU device indices

    Raises:
        AmdSmiException: If GPU query fails
    """
```

**Python Usage Example:**

```python
import amdsmi

# Initialize and get GPU handle
amdsmi.amdsmi_init()

try:
    sockets = amdsmi.amdsmi_get_socket_handles()
    processors = amdsmi.amdsmi_get_processor_handles(sockets[0])
    gpu = processors[0]

    # Get processes running on specific GPU
    process_handles = amdsmi.amdsmi_get_gpu_process_list(gpu)
    print(f"Found {len(process_handles)} processes on GPU")

    for i, proc_handle in enumerate(process_handles):
        proc_info = amdsmi.amdsmi_get_gpu_process_info(gpu, proc_handle)
        print(f"Process {i+1}: {proc_info['name']} (PID: {proc_info['pid']})")
        print(f"  Memory: {proc_info['mem'] // (1024*1024)} MB")
        print(f"  VRAM: {proc_info['memory_usage']['vram_mem'] // (1024*1024)} MB")
        print(f"  Container: {proc_info['container_name']}")

    # Get system-wide compute processes
    compute_processes = amdsmi.amdsmi_get_gpu_compute_process_info()
    print(f"\nSystem-wide compute processes: {len(compute_processes)}")

    for proc in compute_processes:
        print(f"PID {proc['process_id']}: "
              f"VRAM {proc['vram_usage'] // (1024*1024)} MB, "
              f"CU {proc['cu_occupancy']}%")

        # Get which GPUs this process is using
        gpu_indices = amdsmi.amdsmi_get_gpu_compute_process_gpus(proc['process_id'])
        print(f"  Using GPUs: {gpu_indices}")

finally:
    amdsmi.amdsmi_shut_down()
```
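
The dictionaries returned by the compute process query are straightforward to post-process. The following is a minimal sketch, assuming records shaped like the documented return value of `amdsmi_get_gpu_compute_process_info()`. The helper name `summarize_vram` and the sample records are hypothetical, and no AMD SMI call is made, so it runs without a GPU.

```python
def summarize_vram(compute_processes):
    """Return (total VRAM in bytes, records sorted by VRAM usage, largest first)."""
    ordered = sorted(compute_processes, key=lambda p: p["vram_usage"], reverse=True)
    total = sum(p["vram_usage"] for p in ordered)
    return total, ordered


MIB = 1024 * 1024

# Hypothetical sample records using the documented keys.
sample = [
    {"process_id": 101, "pasid": 1, "vram_usage": 512 * MIB,
     "sdma_usage": 10, "cu_occupancy": 20},
    {"process_id": 202, "pasid": 2, "vram_usage": 2048 * MIB,
     "sdma_usage": 5, "cu_occupancy": 60},
]

total_bytes, ranked = summarize_vram(sample)
print(f"Total VRAM: {total_bytes // MIB} MiB")
for proc in ranked:
    print(f"  PID {proc['process_id']}: {proc['vram_usage'] // MIB} MiB")
```

In real use, the list passed to `summarize_vram` would be the result of `amdsmi.amdsmi_get_gpu_compute_process_info()`.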
## Types

### Process Information Structure (Detailed)

```c { .api }
typedef struct {
    char name[AMDSMI_NORMAL_STRING_LENGTH];           // Process name
    amdsmi_process_handle_t pid;                      // Process ID
    uint64_t mem;                                     // Memory usage in bytes
    struct {
        uint64_t gfx;                                 // GFX engine time (ns)
        uint64_t enc;                                 // Encoder engine time (ns)
        uint32_t reserved[12];                        // Reserved
    } engine_usage;
    struct {
        uint64_t gtt_mem;                             // GTT memory usage (bytes)
        uint64_t cpu_mem;                             // CPU memory usage (bytes)
        uint64_t vram_mem;                            // VRAM usage (bytes)
        uint32_t reserved[10];                        // Reserved
    } memory_usage;
    char container_name[AMDSMI_NORMAL_STRING_LENGTH]; // Container name
    uint32_t reserved[4];                             // Reserved
} amdsmi_proc_info_t;
```

### Process Information Structure (System-wide)

```c { .api }
typedef struct {
    uint32_t process_id;   // Process ID
    uint32_t pasid;        // Process Address Space ID
    uint64_t vram_usage;   // VRAM usage in bytes
    uint64_t sdma_usage;   // SDMA usage in microseconds
    uint32_t cu_occupancy; // Compute Unit usage percentage
} amdsmi_process_info_t;
```

### Process Handle Type

```c { .api }
typedef uint32_t amdsmi_process_handle_t; // Process handle type
```

## Process Monitoring Workflow

A typical process monitoring workflow combines these calls:

1. **System Overview**: Use `amdsmi_get_gpu_compute_process_info()` to get system-wide GPU usage
2. **GPU-Specific Processes**: Use `amdsmi_get_gpu_process_list()` for specific GPU monitoring
3. **Detailed Process Info**: Use `amdsmi_get_gpu_process_info()` for in-depth process analysis
4. **Cross-Reference**: Use `amdsmi_get_gpu_compute_process_gpus()` to map processes to GPUs
5. **Targeted Monitoring**: Use `amdsmi_get_gpu_compute_process_info_by_pid()` for specific processes
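
Steps 1 and 4 can be combined into a per-GPU view of compute processes. The sketch below is hypothetical glue code: `map_gpus_to_pids` is not part of AMD SMI, and the query is passed in as a callable so that the example runs without a GPU. In real use the callable would presumably be `amdsmi.amdsmi_get_gpu_compute_process_gpus`, with the PIDs taken from `amdsmi_get_gpu_compute_process_info()`.

```python
from collections import defaultdict


def map_gpus_to_pids(pids, gpus_for_pid):
    """Invert a PID -> GPU-indices lookup into {gpu_index: [pids]}.

    gpus_for_pid is any callable that takes a PID and returns a list of
    GPU indices. PIDs whose lookup fails are skipped.
    """
    gpu_to_pids = defaultdict(list)
    for pid in pids:
        try:
            indices = gpus_for_pid(pid)
        except Exception:
            # Simplification: a process may exit between the two queries.
            # Real code would catch the library's specific exception type.
            continue
        for index in indices:
            gpu_to_pids[index].append(pid)
    return dict(gpu_to_pids)


# Stand-in for the real query, backed by made-up data.
fake_usage = {101: [0], 202: [0, 1]}


def fake_gpus_for_pid(pid):
    return fake_usage[pid]  # raises KeyError for unknown PIDs


print(map_gpus_to_pids([101, 202, 303], fake_gpus_for_pid))
# prints {0: [101, 202], 1: [202]}
```

Passing the query as a parameter keeps the mapping logic testable on machines without AMD hardware.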

## Important Notes

1. **Process Lifetime**: Process handles are valid only for the duration of the process and should not be cached.

2. **Units**:
   - Memory usage values are in bytes
   - Engine usage times are in nanoseconds
   - SDMA usage is in microseconds

3. **Multi-GPU Processes**: A single process can use multiple GPUs simultaneously.

4. **Container Support**: The library provides container name information for containerized workloads.

5. **Real-time Data**: Process information reflects the current state and can change rapidly, so a process count obtained from one call may be stale by the next.

6. **Permission Requirements**: Some process information may require elevated privileges to access.

7. **Engine Usage**: Engine usage is the cumulative time a process has spent on a specific GPU engine, which is useful for understanding workload patterns.

8. **Memory Types**: The memory types (GTT, CPU, VRAM) serve different purposes and have different performance characteristics.

9. **System Impact**: Process monitoring functions are lightweight and suitable for frequent polling in monitoring applications.
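
Because engine usage is cumulative (note 7), utilization over an interval has to be derived from two samples. The following is a minimal sketch of that arithmetic, assuming the counters are in nanoseconds as documented. The helper `engine_utilization_percent` and the sample values are hypothetical and not part of AMD SMI.

```python
def engine_utilization_percent(busy_ns_before, busy_ns_after, interval_ns):
    """Percentage of the interval an engine was busy, from two cumulative samples."""
    if interval_ns <= 0:
        raise ValueError("interval_ns must be positive")
    delta = busy_ns_after - busy_ns_before
    if delta < 0:
        # Counter went backwards (for example, the process restarted): no data.
        return 0.0
    # Clamp to 100 to absorb sampling jitter between the two reads.
    return min(100.0, 100.0 * delta / interval_ns)


# Two hypothetical 'gfx' readings taken one second apart.
before_ns = 4_000_000_000
after_ns = 4_250_000_000
print(engine_utilization_percent(before_ns, after_ns, 1_000_000_000))  # 25.0
```

In practice the two readings would come from successive `amdsmi_get_gpu_process_info()` calls for the same process, with the interval measured by a monotonic clock.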