# Device Discovery

Functions for discovering and identifying AMD processors, sockets, and their properties in the system. The AMD SMI library uses a hierarchical model where sockets contain processors, enabling proper representation of multi-processor systems.

## Capabilities

### Socket Discovery

Get the list of socket handles available in the system. Sockets represent physical hardware locations that can contain multiple processors.

```c { .api }
amdsmi_status_t amdsmi_get_socket_handles(uint32_t *socket_count, amdsmi_socket_handle *socket_handles);
```

**Parameters:**
- `socket_count`: As input, the maximum number of socket handles that can be written to `socket_handles`. As output, the actual number of socket handles available or written.
- `socket_handles`: Pointer to array of socket handles, or NULL to query count only.

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

**Usage Example:**

```c
// First, get the count of available sockets
uint32_t socket_count = 0;
amdsmi_status_t ret = amdsmi_get_socket_handles(&socket_count, NULL);
if (ret != AMDSMI_STATUS_SUCCESS) {
    return ret;
}

// Allocate memory and get the actual socket handles
amdsmi_socket_handle *sockets = malloc(socket_count * sizeof(*sockets));
if (sockets == NULL) {
    return AMDSMI_STATUS_OUT_OF_RESOURCES;
}
ret = amdsmi_get_socket_handles(&socket_count, sockets);
if (ret == AMDSMI_STATUS_SUCCESS) {
    printf("Found %u sockets\n", socket_count);  // socket_count is unsigned
}
free(sockets);  // release the array once the handles are no longer needed
```

### Socket Information

Get information about a specific socket, including its identifier.

```c { .api }
amdsmi_status_t amdsmi_get_socket_info(amdsmi_socket_handle socket_handle, size_t len, char *name);
```

**Parameters:**
- `socket_handle`: Handle to the socket to query
- `len`: Length of the caller-provided buffer `name`
- `name`: Buffer to receive the socket identifier string

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

### Processor Discovery

Get the list of processor handles associated with a specific socket.

```c { .api }
amdsmi_status_t amdsmi_get_processor_handles(amdsmi_socket_handle socket_handle, uint32_t *processor_count, amdsmi_processor_handle *processor_handles);
```

**Parameters:**
- `socket_handle`: Socket handle to query for processors
- `processor_count`: As input, maximum number of processor handles that can be written. As output, actual number available or written.
- `processor_handles`: Pointer to array of processor handles, or NULL to query count only.

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

**Usage Example:**

```c
// Get processor count for a socket
uint32_t processor_count = 0;
amdsmi_status_t ret = amdsmi_get_processor_handles(socket, &processor_count, NULL);
if (ret != AMDSMI_STATUS_SUCCESS) {
    return ret;
}

// Allocate and get processor handles
amdsmi_processor_handle *processors = malloc(processor_count * sizeof(*processors));
if (processors == NULL) {
    return AMDSMI_STATUS_OUT_OF_RESOURCES;
}
ret = amdsmi_get_processor_handles(socket, &processor_count, processors);
// ... use the handles, then free(processors) when done
```

### Processor Type Identification

Determine the type of a specific processor (GPU, CPU, etc.).

```c { .api }
amdsmi_status_t amdsmi_get_processor_type(amdsmi_processor_handle processor_handle, processor_type_t *processor_type);
```

**Parameters:**
- `processor_handle`: Handle to the processor to query
- `processor_type`: Pointer to receive the processor type

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

### BDF-Based Discovery

Get a processor handle using Bus/Device/Function (BDF) information.

```c { .api }
amdsmi_status_t amdsmi_get_processor_handle_from_bdf(amdsmi_bdf_t bdf, amdsmi_processor_handle *processor_handle);
```

**Parameters:**
- `bdf`: BDF identifier structure
- `processor_handle`: Pointer to receive the matching processor handle

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure
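
Callers often start from a textual PCI address such as `0000:03:00.0` (the `domain:bus:device.function` notation printed by `lspci -D`) rather than from a populated `amdsmi_bdf_t`. The following is a minimal, standalone Python sketch for splitting such a string into its four fields and range-checking them against the field widths of `amdsmi_bdf_t`. The names `Bdf` and `parse_bdf` are hypothetical helpers for illustration and are not part of AMD SMI.

```python
import re
from typing import NamedTuple


class Bdf(NamedTuple):
    """Hypothetical container for a parsed PCI address (not an AMD SMI type)."""
    domain: int
    bus: int
    device: int
    function: int


# domain:bus:device.function, every field written in hexadecimal
_BDF_RE = re.compile(
    r"([0-9a-fA-F]{1,8}):([0-9a-fA-F]{1,2}):([0-9a-fA-F]{1,2})\.([0-7])"
)


def parse_bdf(text: str) -> Bdf:
    """Parse 'dddd:bb:dd.f' into integer fields, raising ValueError if malformed."""
    match = _BDF_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a PCI address: {text!r}")
    domain, bus, device, function = (int(part, 16) for part in match.groups())
    if device > 0x1F:  # device_number is a 5-bit field
        raise ValueError(f"device number out of range: {device:#x}")
    return Bdf(domain, bus, device, function)
```

The parsed fields can then be copied into the corresponding members of an `amdsmi_bdf_t` before calling `amdsmi_get_processor_handle_from_bdf()`.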

### Device Identification

Get the device ID for a GPU processor.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_id(amdsmi_processor_handle processor_handle, uint16_t *id);
```

**Parameters:**
- `processor_handle`: Handle to the GPU processor
- `id`: Pointer to receive the device ID

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

### Vendor Information

Get vendor name string for a GPU processor.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_vendor_name(amdsmi_processor_handle processor_handle, char *name, size_t len);
```

**Parameters:**
- `processor_handle`: Handle to the GPU processor
- `name`: Buffer to receive vendor name string
- `len`: Length of the caller-provided buffer

**Returns:** `amdsmi_status_t` - AMDSMI_STATUS_SUCCESS on success, error code on failure

### Subsystem Information

Get subsystem device ID and name for a GPU processor.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_subsystem_id(amdsmi_processor_handle processor_handle, uint16_t *id);
amdsmi_status_t amdsmi_get_gpu_subsystem_name(amdsmi_processor_handle processor_handle, char *name, size_t len);
```

### VRAM Vendor Information

Get VRAM vendor information for a GPU processor.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_vram_vendor(amdsmi_processor_handle processor_handle, char *brand, uint32_t len);
```

### Device BDF and UUID

Get Bus/Device/Function identifier and UUID for a GPU processor.

```c { .api }
amdsmi_status_t amdsmi_get_gpu_device_bdf(amdsmi_processor_handle processor_handle, amdsmi_bdf_t *bdf);
amdsmi_status_t amdsmi_get_gpu_device_uuid(amdsmi_processor_handle processor_handle, unsigned int *uuid_length, char *uuid);
amdsmi_status_t amdsmi_get_gpu_bdf_id(amdsmi_processor_handle processor_handle, uint64_t *bdfid);
```

**Parameters for UUID:**
- `processor_handle`: Handle to the GPU processor
- `uuid_length`: As input, length of UUID buffer. As output, actual UUID string length.
- `uuid`: Buffer to receive UUID string (must be at least AMDSMI_GPU_UUID_SIZE)

**Parameters for BDF ID:**
- `processor_handle`: Handle to the GPU processor
- `bdfid`: Pointer to receive BDF identifier as a single 64-bit integer

## Python API

### Socket Discovery

```python { .api }
def amdsmi_get_socket_handles():
    """
    Get list of socket handles in the system.

    Returns:
        list: List of socket handle objects

    Raises:
        AmdSmiException: If socket discovery fails
    """
```

### Socket Information

```python { .api }
def amdsmi_get_socket_info(socket_handle):
    """
    Get information about a socket.

    Args:
        socket_handle: Socket handle object

    Returns:
        str: Socket identifier string

    Raises:
        AmdSmiException: If socket info retrieval fails
    """
```

### Processor Discovery

```python { .api }
def amdsmi_get_processor_handles(socket_handle):
    """
    Get list of processor handles for a socket.

    Args:
        socket_handle: Socket handle object

    Returns:
        list: List of processor handle objects

    Raises:
        AmdSmiException: If processor discovery fails
    """
```

### Processor Type

```python { .api }
def amdsmi_get_processor_type(processor_handle):
    """
    Get the type of a processor.

    Args:
        processor_handle: Processor handle object

    Returns:
        AmdSmiProcessorType: Processor type enum value

    Raises:
        AmdSmiException: If processor type query fails
    """
```

**Python Usage Example:**

```python
import amdsmi

# Initialize library
amdsmi.amdsmi_init()

try:
    # Get all sockets
    sockets = amdsmi.amdsmi_get_socket_handles()
    print(f"Found {len(sockets)} sockets")

    for i, socket in enumerate(sockets):
        # Get socket info
        socket_info = amdsmi.amdsmi_get_socket_info(socket)
        print(f"Socket {i}: {socket_info}")

        # Get processors for this socket
        processors = amdsmi.amdsmi_get_processor_handles(socket)
        print(f"  Found {len(processors)} processors")

        for j, processor in enumerate(processors):
            # Get processor type
            proc_type = amdsmi.amdsmi_get_processor_type(processor)
            print(f"    Processor {j}: {proc_type}")

            # If it's a GPU, get more details
            if proc_type == amdsmi.AmdSmiProcessorType.AMD_GPU:
                device_id = amdsmi.amdsmi_get_gpu_id(processor)
                vendor_name = amdsmi.amdsmi_get_gpu_vendor_name(processor)
                uuid = amdsmi.amdsmi_get_gpu_device_uuid(processor)
                print(f"      GPU ID: 0x{device_id:04x}")
                print(f"      Vendor: {vendor_name}")
                print(f"      UUID: {uuid}")

finally:
    amdsmi.amdsmi_shut_down()
```

## Types

### Handle Types

```c { .api }
typedef void *amdsmi_socket_handle;     // Opaque socket handle
typedef void *amdsmi_processor_handle;  // Opaque processor handle
```

### Processor Types

```c { .api }
typedef enum {
    UNKNOWN = 0,    // Unknown processor type
    AMD_GPU,        // AMD GPU processor
    AMD_CPU,        // AMD CPU processor
    NON_AMD_GPU,    // Non-AMD GPU processor
    NON_AMD_CPU     // Non-AMD CPU processor
} processor_type_t;
```
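
Because C enumerators without an explicit value count up from the previous one, the listing above implies `AMD_GPU` is 1, `AMD_CPU` is 2, and so on. The sketch below mirrors those values in a Python `IntEnum` so that a raw integer (for example, one read from a log) can be turned back into a name. `ProcessorType` and `describe_processor_type` are hypothetical local helpers, not the library's own `AmdSmiProcessorType`.

```python
from enum import IntEnum


class ProcessorType(IntEnum):
    """Local mirror of processor_type_t (illustrative, not the library's enum)."""
    UNKNOWN = 0
    AMD_GPU = 1
    AMD_CPU = 2
    NON_AMD_GPU = 3
    NON_AMD_CPU = 4


def describe_processor_type(raw: int) -> str:
    """Return the enumerator name for a raw value, or 'UNKNOWN' if unrecognized."""
    try:
        return ProcessorType(raw).name
    except ValueError:
        return ProcessorType.UNKNOWN.name
```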

### BDF Structure

```c { .api }
typedef union {
    struct {
        uint64_t function_number : 3;   // PCI function number
        uint64_t device_number : 5;     // PCI device number
        uint64_t bus_number : 8;        // PCI bus number
        uint64_t domain_number : 48;    // PCI domain number
    };
    uint64_t as_uint;                   // BDF as single integer
} amdsmi_bdf_t;
```
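
C leaves the allocation order of bit-fields to the implementation. On the common little-endian ABIs (for example GCC or Clang on x86-64), the first declared field occupies the least significant bits, which would place `function_number` in bits 0-2, `device_number` in bits 3-7, `bus_number` in bits 8-15, and `domain_number` in bits 16-63 of `as_uint`. Under that assumption, the packing can be sketched in Python as follows; `pack_bdf` and `unpack_bdf` are hypothetical helper names.

```python
from typing import Tuple


def pack_bdf(domain: int, bus: int, device: int, function: int) -> int:
    """Combine the four fields into one integer (assumes LSB-first bit-fields)."""
    if not (0 <= function < 1 << 3 and 0 <= device < 1 << 5
            and 0 <= bus < 1 << 8 and 0 <= domain < 1 << 48):
        raise ValueError("BDF field out of range")
    return function | (device << 3) | (bus << 8) | (domain << 16)


def unpack_bdf(as_uint: int) -> Tuple[int, int, int, int]:
    """Split a packed value back into (domain, bus, device, function)."""
    function = as_uint & 0x7            # 3 bits
    device = (as_uint >> 3) & 0x1F      # 5 bits
    bus = (as_uint >> 8) & 0xFF         # 8 bits
    domain = (as_uint >> 16) & 0xFFFF_FFFF_FFFF  # 48 bits
    return domain, bus, device, function
```

This describes only the union's `as_uint` member. The 64-bit value returned by `amdsmi_get_gpu_bdf_id()` should not be assumed to use the same layout.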

## Constants

```c { .api }
#define AMDSMI_GPU_UUID_SIZE 38           // GPU UUID string size
#define AMDSMI_MAX_STRING_LENGTH 64       // Maximum string length
#define AMDSMI_NORMAL_STRING_LENGTH 32    // Normal string length
#define AMDSMI_MAX_DEVICES 32             // Maximum number of devices
```

## Discovery Workflow

The typical device discovery workflow follows this pattern:

1. **Initialize Library**: Call `amdsmi_init()` with appropriate flags
2. **Discover Sockets**: Use `amdsmi_get_socket_handles()` to find available sockets
3. **Get Socket Info**: Optionally get socket identifier with `amdsmi_get_socket_info()`
4. **Discover Processors**: For each socket, use `amdsmi_get_processor_handles()`
5. **Identify Processors**: Use `amdsmi_get_processor_type()` to determine processor types
6. **Get Device Details**: For GPUs, get additional identification information

This hierarchical approach allows the library to properly represent complex multi-processor systems while providing efficient access to individual devices.
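
Steps 2 through 5 can be folded into one reusable function. The sketch below takes the module as a parameter (a hypothetical design choice, so the traversal can be exercised with a stand-in object on machines without AMD hardware) and calls only the four Python functions documented above. The function name `discover` and the record layout are illustrative assumptions.

```python
from typing import Any, Dict, List


def discover(smi: Any) -> List[Dict[str, Any]]:
    """Walk sockets, then processors, and return one record per processor.

    `smi` is any object exposing the discovery functions documented above,
    normally the imported and already initialized `amdsmi` module.
    """
    inventory = []
    for socket in smi.amdsmi_get_socket_handles():                  # step 2
        socket_name = smi.amdsmi_get_socket_info(socket)            # step 3
        for processor in smi.amdsmi_get_processor_handles(socket):  # step 4
            inventory.append({
                "socket": socket_name,
                "handle": processor,
                "type": smi.amdsmi_get_processor_type(processor),   # step 5
            })
    return inventory
```

With the real library this would be called as `discover(amdsmi)` between `amdsmi.amdsmi_init()` and `amdsmi.amdsmi_shut_down()`. Initialization and shutdown are deliberately left to the caller.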

## Important Notes

1. **Handle Lifetime**: Socket and processor handles remain valid until `amdsmi_shut_down()` is called.

2. **Handle Uniqueness**: Handles may change between application runs, so they should not be cached persistently.

3. **Initialization Dependency**: The types of processors discovered depend on the flags passed to `amdsmi_init()`.

4. **Error Handling**: Always check return values, as device discovery can fail due to driver issues or hardware problems.

5. **Memory Management**: After querying a count, allocate a handle array with room for at least that many entries before making the second call, and free it once the handles are no longer needed.
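
Since handles should not be persisted (note 2), one option is to store a stable identifier such as the GPU UUID and resolve it to a fresh handle on every run. The sketch below rebuilds such a lookup table using the functions shown in the Python usage example. The function name `index_gpus_by_uuid` is hypothetical, and the module is passed in as a parameter so the logic can be tried with a stand-in object.

```python
from typing import Any, Dict


def index_gpus_by_uuid(smi: Any) -> Dict[str, Any]:
    """Map each GPU's UUID string to its handle for the current run only."""
    by_uuid = {}
    for socket in smi.amdsmi_get_socket_handles():
        for processor in smi.amdsmi_get_processor_handles(socket):
            proc_type = smi.amdsmi_get_processor_type(processor)
            if proc_type != smi.AmdSmiProcessorType.AMD_GPU:
                continue  # the UUID query documented here is GPU-only
            by_uuid[smi.amdsmi_get_gpu_device_uuid(processor)] = processor
    return by_uuid
```

A configuration file would then record only UUID strings, and the returned dictionary is discarded at `amdsmi_shut_down()` along with the handles it holds.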