
# Database Metadata and Introspection

Database schema introspection capabilities for retrieving metadata about tables, columns, procedures, and other database objects. The redshift_connector package provides metadata operations that let applications dynamically discover and work with database schemas.

## Capabilities

### Table Metadata

Retrieve information about tables, views, and other relations in the database, with optional filtering by catalog, schema, and table name patterns.

```python { .api }
class Cursor:
    def get_tables(
        self,
        catalog: str = None,
        schema: str = None,
        table: str = None,
        types: list[str] = None
    ) -> tuple:
        """
        Retrieve table metadata from the database.

        Parameters:
        - catalog: Catalog name pattern (None for all catalogs)
        - schema: Schema name pattern (None for all schemas)
        - table: Table name pattern (None for all tables)
        - types: List of table types to include (e.g., ['TABLE', 'VIEW'])

        Returns:
        Tuple of tuples, each containing:
        (table_cat, table_schem, table_name, table_type, remarks,
         type_cat, type_schem, type_name, self_referencing_col_name,
         ref_generation)

        Common table types: 'TABLE', 'VIEW', 'SYSTEM TABLE', 'GLOBAL TEMPORARY',
        'LOCAL TEMPORARY', 'ALIAS', 'SYNONYM'
        """
```
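Result rows are plain tuples, so downstream code ends up indexing by position. A small helper like the one below (illustrative only, not part of redshift_connector) can map a row onto the documented field names; the sample row is hand-written in the documented shape, so the sketch runs without a live connection:

```python
# Field names in the order documented for get_tables() result rows
TABLE_FIELDS = (
    "table_cat", "table_schem", "table_name", "table_type", "remarks",
    "type_cat", "type_schem", "type_name", "self_referencing_col_name",
    "ref_generation",
)

def table_row_to_dict(row):
    """Map one get_tables() tuple onto the documented field names."""
    return dict(zip(TABLE_FIELDS, row))

# Sample row in the documented shape; no live connection needed
row = ("dev", "sales", "orders", "TABLE", None, None, None, None, None, None)
info = table_row_to_dict(row)
print(info["table_name"], info["table_type"])  # orders TABLE
```

Named access keeps callers readable when only a few of the ten fields matter.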

### Column Metadata

Retrieve detailed information about table columns, including data types, nullability, defaults, and constraints.

```python { .api }
class Cursor:
    def get_columns(
        self,
        catalog: str = None,
        schema: str = None,
        table: str = None,
        column: str = None
    ) -> tuple:
        """
        Retrieve column metadata from database tables.

        Parameters:
        - catalog: Catalog name pattern (None for all catalogs)
        - schema: Schema name pattern (None for all schemas)
        - table: Table name pattern (None for all tables)
        - column: Column name pattern (None for all columns)

        Returns:
        Tuple of tuples, each containing:
        (table_cat, table_schem, table_name, column_name, data_type,
         type_name, column_size, buffer_length, decimal_digits, num_prec_radix,
         nullable, remarks, column_def, sql_data_type, sql_datetime_sub,
         char_octet_length, ordinal_position, is_nullable, scope_catalog,
         scope_schema, scope_table, source_data_type, is_autoincrement,
         is_generatedcolumn)
        """
```
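The row exposes nullability twice: index 10 (`nullable`) as an integer flag and index 17 (`is_nullable`) as a `'YES'`/`'NO'` string. A minimal sketch of a helper reconciling the two, assuming the documented 24-field layout; the helper and sample values are illustrative, not part of the library:

```python
def column_is_nullable(row):
    """Interpret nullability from one get_columns() row.

    Prefers the 'YES'/'NO' string at index 17 (is_nullable) and falls
    back to the integer flag at index 10 (nullable).
    """
    if row[17] in ("YES", "NO"):
        return row[17] == "YES"
    return row[10] == 1

# Minimal 24-field sample row in the documented layout (illustrative values)
sample = [None] * 24
sample[3], sample[5], sample[10], sample[17] = "order_id", "integer", 0, "NO"
print(column_is_nullable(sample))  # False
```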

### Primary Key Information

Retrieve primary key constraints and their constituent columns for tables.

```python { .api }
class Cursor:
    def get_primary_keys(
        self,
        catalog: str = None,
        schema: str = None,
        table: str = None
    ) -> tuple:
        """
        Retrieve primary key information for tables.

        Parameters:
        - catalog: Catalog name (None for current catalog)
        - schema: Schema name (None for current schema)
        - table: Table name (None for all tables)

        Returns:
        Tuple of tuples, each containing:
        (table_cat, table_schem, table_name, column_name, key_seq, pk_name)

        Results are ordered by table_cat, table_schem, table_name, key_seq
        """
```
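A composite key spans several result rows, so reassembling the column order from `key_seq` is the usual first step. A minimal helper operating on rows in the documented shape (the function name and sample data are illustrative):

```python
def pk_column_names(rows):
    """Return primary-key column names ordered by key_seq (index 4)."""
    return [row[3] for row in sorted(rows, key=lambda r: r[4])]

# Sample composite-key rows in the documented layout
rows = (
    ("dev", "sales", "orders", "line_no", 2, "orders_pkey"),
    ("dev", "sales", "orders", "order_id", 1, "orders_pkey"),
)
print(pk_column_names(rows))  # ['order_id', 'line_no']
```

The sort is defensive: results are documented as already ordered by key_seq, but the helper does not depend on that.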

### Stored Procedure Metadata

Retrieve information about stored procedures and functions available in the database.

```python { .api }
class Cursor:
    def get_procedures(
        self,
        catalog: str = None,
        schema: str = None,
        procedure: str = None
    ) -> tuple:
        """
        Retrieve stored procedure metadata.

        Parameters:
        - catalog: Catalog name pattern (None for all catalogs)
        - schema: Schema name pattern (None for all schemas)
        - procedure: Procedure name pattern (None for all procedures)

        Returns:
        Tuple of tuples, each containing:
        (procedure_cat, procedure_schem, procedure_name, num_input_params,
         num_output_params, num_result_sets, remarks, procedure_type,
         specific_name)
        """
```

### Schema Information

Retrieve information about available schemas (namespaces) in the database.

```python { .api }
class Cursor:
    def get_schemas(
        self,
        catalog: str = None,
        schema: str = None
    ) -> tuple:
        """
        Retrieve schema metadata.

        Parameters:
        - catalog: Catalog name (None for current catalog)
        - schema: Schema name pattern (None for all schemas)

        Returns:
        Tuple of tuples, each containing:
        (table_schem, table_catalog)
        """
```

### Catalog Information

Retrieve information about available catalogs (databases) accessible to the current connection.

```python { .api }
class Cursor:
    def get_catalogs(self) -> tuple:
        """
        Retrieve catalog (database) metadata.

        Returns:
        Tuple of tuples, each containing:
        (table_cat,)

        Lists all catalogs (databases) available to the current user.
        """
```

### Metadata Usage Examples

Practical examples demonstrating metadata introspection for common use cases.

```python
import redshift_connector

conn = redshift_connector.connect(
    host='examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com',
    database='dev',
    user='awsuser',
    password='password'
)

cursor = conn.cursor()

# List all tables in the current database
tables = cursor.get_tables()
print("Available tables:")
for table in tables:
    catalog, schema, table_name, table_type = table[:4]
    print(f"  {schema}.{table_name} ({table_type})")

# List tables in a specific schema
sales_tables = cursor.get_tables(schema='sales')
print("\nTables in 'sales' schema:")
for table in sales_tables:
    print(f"  {table[2]} ({table[3]})")  # table_name, table_type

# Get detailed column information for a specific table
columns = cursor.get_columns(schema='sales', table='orders')
print("\nColumns in 'sales.orders' table:")
for col in columns:
    col_name, type_name, col_size, nullable = col[3], col[5], col[6], col[10]
    null_str = "NULL" if nullable else "NOT NULL"
    print(f"  {col_name}: {type_name}({col_size}) {null_str}")

# Find primary keys for a table
pk_info = cursor.get_primary_keys(schema='sales', table='orders')
if pk_info:
    print("\nPrimary key for 'sales.orders':")
    pk_columns = [col[3] for col in sorted(pk_info, key=lambda x: x[4])]  # Sort by key_seq
    print(f"  Columns: {', '.join(pk_columns)}")
    print(f"  Constraint name: {pk_info[0][5]}")

# List all schemas
schemas = cursor.get_schemas()
print("\nAvailable schemas:")
for schema in schemas:
    print(f"  {schema[0]}")

# List all catalogs (databases)
catalogs = cursor.get_catalogs()
print("\nAvailable catalogs:")
for catalog in catalogs:
    print(f"  {catalog[0]}")

# Find tables matching a pattern
user_tables = cursor.get_tables(table='user%')  # Tables starting with 'user'
print("\nTables matching 'user%' pattern:")
for table in user_tables:
    print(f"  {table[1]}.{table[2]}")

# Get stored procedures
procedures = cursor.get_procedures()
if procedures:
    print("\nAvailable procedures:")
    for proc in procedures:
        schema, proc_name, proc_type = proc[1], proc[2], proc[7]
        print(f"  {schema}.{proc_name} (type: {proc_type})")
else:
    print("\nNo stored procedures found")

cursor.close()
conn.close()
```

### Dynamic Schema Discovery

Advanced patterns for dynamically discovering and working with database schemas.

```python
def discover_table_structure(cursor, schema_name, table_name):
    """
    Discover complete table structure including columns, primary keys, and metadata.
    """
    # Get table information
    tables = cursor.get_tables(schema=schema_name, table=table_name)
    if not tables:
        raise ValueError(f"Table {schema_name}.{table_name} not found")

    table_info = tables[0]
    table_type = table_info[3]

    # Get column information
    columns = cursor.get_columns(schema=schema_name, table=table_name)
    column_info = []

    for col in columns:
        column_info.append({
            'name': col[3],
            'data_type': col[4],
            'type_name': col[5],
            'column_size': col[6],
            'decimal_digits': col[8],
            'nullable': col[10] == 1,
            'default': col[12],
            'ordinal_position': col[16],
            'is_autoincrement': col[22] == 'YES',
        })

    # Get primary key information
    pk_cols = cursor.get_primary_keys(schema=schema_name, table=table_name)
    primary_key = []
    if pk_cols:
        primary_key = [col[3] for col in sorted(pk_cols, key=lambda x: x[4])]  # Sort by key_seq

    return {
        'schema': schema_name,
        'table': table_name,
        'table_type': table_type,
        'columns': column_info,
        'primary_key': primary_key
    }

def generate_create_table_ddl(cursor, schema_name, table_name):
    """
    Generate a CREATE TABLE DDL statement based on table metadata.
    """
    structure = discover_table_structure(cursor, schema_name, table_name)

    ddl = f"CREATE TABLE {schema_name}.{table_name} (\n"

    # Only character and numeric types take a (size[, scale]) suffix;
    # emitting one for e.g. integer would produce invalid DDL
    sized_types = {'char', 'character', 'varchar', 'character varying',
                   'numeric', 'decimal'}

    column_defs = []
    for col in structure['columns']:
        col_def = f"    {col['name']} {col['type_name']}"

        if col['column_size'] and col['type_name'].lower() in sized_types:
            col_def += f"({col['column_size']}"
            if col['decimal_digits']:
                col_def += f", {col['decimal_digits']}"
            col_def += ")"

        if not col['nullable']:
            col_def += " NOT NULL"

        if col['default']:
            col_def += f" DEFAULT {col['default']}"

        column_defs.append(col_def)

    ddl += ",\n".join(column_defs)

    if structure['primary_key']:
        ddl += f",\n    PRIMARY KEY ({', '.join(structure['primary_key'])})"

    ddl += "\n);"

    return ddl

# Usage example
cursor = conn.cursor()
try:
    # Discover table structure
    structure = discover_table_structure(cursor, 'public', 'users')
    print(f"Table: {structure['schema']}.{structure['table']}")
    print(f"Type: {structure['table_type']}")
    print(f"Columns: {len(structure['columns'])}")
    print(f"Primary Key: {structure['primary_key']}")

    # Generate DDL
    ddl = generate_create_table_ddl(cursor, 'public', 'users')
    print(f"\nGenerated DDL:\n{ddl}")

except ValueError as e:
    print(f"Error: {e}")
```

347

348

### Cross-Database Metadata Operations

349

350

Working with metadata across multiple databases and schemas in Redshift.

351

352

```python { .api }

353

# Database metadata scope configuration

354

conn = redshift_connector.connect(

355

# ... connection parameters

356

database_metadata_current_db_only=True # Limit metadata to current database (default)

357

# database_metadata_current_db_only=False # Include cross-database metadata (datashare support)

358

)

359

360

def list_all_databases_and_schemas(cursor):

361

"""

362

List all accessible databases and their schemas.

363

"""

364

catalogs = cursor.get_catalogs()

365

366

for catalog in catalogs:

367

catalog_name = catalog[0]

368

print(f"Database: {catalog_name}")

369

370

# Get schemas in this catalog

371

schemas = cursor.get_schemas(catalog=catalog_name)

372

for schema in schemas:

373

schema_name = schema[0]

374

print(f" Schema: {schema_name}")

375

376

# Get table count in schema

377

tables = cursor.get_tables(catalog=catalog_name, schema=schema_name)

378

print(f" Tables: {len(tables)}")

379

380

def find_tables_by_column(cursor, column_name):

381

"""

382

Find all tables that contain a specific column name.

383

"""

384

columns = cursor.get_columns(column=column_name)

385

386

tables_with_column = set()

387

for col in columns:

388

table_key = (col[0], col[1], col[2]) # catalog, schema, table

389

tables_with_column.add(table_key)

390

391

return list(tables_with_column)

392

393

# Example usage

394

cursor = conn.cursor()

395

396

# Find all tables with a 'user_id' column

397

tables_with_user_id = find_tables_by_column(cursor, 'user_id')

398

print("Tables containing 'user_id' column:")

399

for catalog, schema, table in tables_with_user_id:

400

print(f" {catalog}.{schema}.{table}")

401

```

### Metadata Caching and Performance

Strategies for efficient metadata operations and caching for better performance.

```python
class MetadataCache:
    """Simple metadata cache for improved performance."""

    def __init__(self, cursor):
        self.cursor = cursor
        self._table_cache = {}
        self._column_cache = {}

    def get_table_info(self, schema, table):
        """Get cached table information."""
        key = (schema, table)
        if key not in self._table_cache:
            tables = self.cursor.get_tables(schema=schema, table=table)
            self._table_cache[key] = tables[0] if tables else None
        return self._table_cache[key]

    def get_column_info(self, schema, table):
        """Get cached column information."""
        key = (schema, table)
        if key not in self._column_cache:
            columns = self.cursor.get_columns(schema=schema, table=table)
            self._column_cache[key] = columns
        return self._column_cache[key]

    def clear_cache(self):
        """Clear the metadata cache."""
        self._table_cache.clear()
        self._column_cache.clear()

# Usage example
cursor = conn.cursor()
metadata_cache = MetadataCache(cursor)

# First access - queries the database
table_info = metadata_cache.get_table_info('public', 'users')
columns = metadata_cache.get_column_info('public', 'users')

# Subsequent accesses - served from the cache
table_info2 = metadata_cache.get_table_info('public', 'users')
columns2 = metadata_cache.get_column_info('public', 'users')
```
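A lighter-weight variant of the cache above can lean on `functools.lru_cache`, which adds bounded eviction for free. This is a sketch, not a redshift_connector feature; the stub cursor exists only so the example runs without a cluster:

```python
from functools import lru_cache

def make_cached_lookups(cursor, maxsize=128):
    """Build memoized metadata lookups bound to an existing cursor."""

    @lru_cache(maxsize=maxsize)
    def table_info(schema, table):
        tables = cursor.get_tables(schema=schema, table=table)
        return tables[0] if tables else None

    @lru_cache(maxsize=maxsize)
    def column_info(schema, table):
        return cursor.get_columns(schema=schema, table=table)

    return table_info, column_info

class _StubCursor:
    """Illustrative stand-in for a redshift_connector cursor."""
    def __init__(self):
        self.calls = 0

    def get_tables(self, schema=None, table=None):
        self.calls += 1
        return ((None, schema, table, "TABLE", None, None, None, None, None, None),)

    def get_columns(self, schema=None, table=None):
        self.calls += 1
        return ()

stub = _StubCursor()
table_info, column_info = make_cached_lookups(stub)
table_info("public", "users")
table_info("public", "users")  # second call served from the cache
print(stub.calls)  # 1
```

Binding the cursor in a closure keeps it out of the cache key, so only the `(schema, table)` pair is memoized; call `table_info.cache_clear()` to invalidate after DDL changes.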