# Amazon Redshift Python Connector

A pure-Python connector for Amazon Redshift that implements the Python Database API Specification 2.0. It integrates with pandas and NumPy and supports Redshift-specific features, including IAM authentication, identity provider (IdP) authentication, and Redshift-specific data types. The library supports Python 3.6 through 3.11 and suits data analytics workflows, ETL processes, business intelligence applications, and any other Python application that needs direct access to Amazon Redshift.

## Package Information

- **Package Name**: redshift_connector
- **Language**: Python
- **Installation**: `pip install redshift_connector`
- **Optional Dependencies**: `pip install redshift_connector[full]` (includes pandas and numpy support)

## Core Imports

```python
import redshift_connector
```

Common patterns for working with the connector:

```python
from redshift_connector import connect, Connection, Cursor
from redshift_connector import Error, InterfaceError, ProgrammingError
```

## Basic Usage

```python
import redshift_connector

# Basic connection with username/password
conn = redshift_connector.connect(
    host='examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com',
    database='dev',
    user='awsuser',
    password='my_password'
)

cursor = conn.cursor()
cursor.execute("CREATE TEMP TABLE book(bookname varchar, author varchar)")
cursor.executemany("INSERT INTO book (bookname, author) VALUES (%s, %s)",
                   [('One Hundred Years of Solitude', 'Gabriel García Márquez'),
                    ('A Brief History of Time', 'Stephen Hawking')])

cursor.execute("SELECT * FROM book")
result = cursor.fetchall()
print(result)

# Clean up
cursor.close()
conn.close()
```
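
The explicit clean-up above is skipped if a query raises. A minimal sketch of one way to guarantee it, assuming only the DB-API 2.0 `close()` methods: the helper name `fetch_rows` is hypothetical, and the standard-library `sqlite3` module stands in for `redshift_connector` so the example runs without a cluster.

```python
import sqlite3
from contextlib import closing


def fetch_rows(connect, sql, params=()):
    """Run one query and always close the cursor and the connection.

    `connect` is any zero-argument callable returning a DB-API 2.0
    connection (hypothetical helper, not part of redshift_connector).
    """
    with closing(connect()) as conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute(sql, params)
            return cursor.fetchall()


# sqlite3 is a stand-in here; it uses "?" placeholders instead of "%s".
rows = fetch_rows(
    lambda: sqlite3.connect(":memory:"),
    "SELECT ? AS bookname",
    ("A Brief History of Time",),
)
print(rows)
```

With `redshift_connector`, the factory would wrap a `redshift_connector.connect(...)` call and the placeholder would be `%s`, matching the `executemany` call above.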

## Architecture

`redshift_connector` follows the Python Database API Specification 2.0 architecture:

- **Connection Factory**: The `connect()` function creates `Connection` instances and accepts the connector's authentication options
- **Connection**: Database connection management with transaction control, prepared statements, and two-phase commit support
- **Cursor**: Query execution interface with result fetching, metadata operations, and data science integration
- **Authentication Layer**: Pluggable authentication system supporting 18+ identity providers and authentication methods
- **Type System**: PostgreSQL/Redshift data type support with Python object mapping
- **Error Hierarchy**: Complete DB-API 2.0 exception hierarchy for structured error handling

Together, these components form the database access layer that Python applications use to connect to Redshift.

## Capabilities

### Core Database Operations

Connection establishment, query execution, result fetching, and transaction management. These operations form the foundation of the DB-API 2.0 interface.

```python { .api }
def connect(
    user: str = None,
    database: str = None,
    password: str = None,
    host: str = None,
    port: int = None,
    # ... 60+ additional parameters
) -> Connection: ...

class Connection:
    def cursor(self) -> Cursor: ...
    def commit(self) -> None: ...
    def rollback(self) -> None: ...
    def close(self) -> None: ...

class Cursor:
    def execute(self, operation: str, args=None) -> 'Cursor': ...
    def fetchone(self) -> list | None: ...
    def fetchmany(self, num: int = None) -> tuple: ...
    def fetchall(self) -> tuple: ...
```

[Core Database Operations](./core-database.md)
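
The `commit()` and `rollback()` methods listed above can be wrapped so that a unit of work either commits as a whole or is rolled back. The following is a sketch under stated assumptions, not the connector's own API: the `transaction` context manager is a hypothetical helper, and `sqlite3` stands in for `redshift_connector` so the example runs locally.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Yield a cursor; commit on success, roll back on any exception.

    Hypothetical helper that relies only on DB-API 2.0 methods.
    """
    cursor = conn.cursor()
    try:
        yield cursor
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cursor.close()


# sqlite3 is a stand-in connection; it uses "?" placeholders instead of "%s".
conn = sqlite3.connect(":memory:")

with transaction(conn) as cur:
    cur.execute("CREATE TABLE book (bookname TEXT, author TEXT)")
    cur.execute("INSERT INTO book VALUES (?, ?)",
                ("A Brief History of Time", "Stephen Hawking"))

try:
    with transaction(conn) as cur:
        cur.execute("INSERT INTO book VALUES (?, ?)", ("Draft", "Nobody"))
        raise RuntimeError("simulated failure")  # triggers the rollback
except RuntimeError:
    pass

with transaction(conn) as cur:
    cur.execute("SELECT COUNT(*) FROM book")
    row_count = cur.fetchone()[0]

conn.close()
print(row_count)
```

Only the first insert survives, because the second block raised before `commit()` was reached.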

### Authentication and Security

Authentication through multiple identity providers, IAM roles, and security protocols, including SAML, OAuth2, JWT, and browser-based authentication flows.

```python { .api }
# IAM Authentication
conn = redshift_connector.connect(
    iam=True,
    cluster_identifier='my-cluster',
    db_user='myuser',
    # AWS credentials via profile, keys, or instance roles
)

# Identity Provider Authentication
conn = redshift_connector.connect(
    credentials_provider='AdfsCredentialsProvider',
    idp_host='example.com',
    # Additional IdP-specific parameters
)
```

[Authentication and Security](./authentication.md)
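
Connection settings for the IAM example are often kept out of source code. A minimal sketch of collecting them from the environment follows. The helper `iam_connect_kwargs` and the `REDSHIFT_*` variable names are assumptions made for illustration. Only the keyword names `iam`, `cluster_identifier`, `db_user`, and `database` come from the examples in this document.

```python
import os

# Hypothetical environment variable names mapped to connect() keywords.
ENV_TO_PARAM = {
    "REDSHIFT_CLUSTER_ID": "cluster_identifier",
    "REDSHIFT_DB_USER": "db_user",
    "REDSHIFT_DATABASE": "database",
}


def iam_connect_kwargs(env=None):
    """Build keyword arguments for an IAM connection from a mapping.

    Defaults to os.environ; raises KeyError listing any missing variables.
    """
    env = os.environ if env is None else env
    missing = sorted(name for name in ENV_TO_PARAM if not env.get(name))
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    kwargs = {param: env[name] for name, param in ENV_TO_PARAM.items()}
    kwargs["iam"] = True  # enable IAM authentication
    return kwargs


example_kwargs = iam_connect_kwargs({
    "REDSHIFT_CLUSTER_ID": "my-cluster",
    "REDSHIFT_DB_USER": "myuser",
    "REDSHIFT_DATABASE": "dev",
})
print(example_kwargs)
```

The resulting dictionary would be passed as `redshift_connector.connect(**iam_connect_kwargs())`, with AWS credentials still resolved through a profile, keys, or an instance role.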

### Data Science Integration

Integration with pandas and NumPy for moving data between Redshift and Python data science workflows. Supports reading query results into a DataFrame or NumPy array and writing a DataFrame to a table.

```python { .api }
class Cursor:
    def fetch_dataframe(self, num: int = None) -> 'pandas.DataFrame': ...
    def write_dataframe(self, df: 'pandas.DataFrame', table: str) -> None: ...
    def fetch_numpy_array(self, num: int = None) -> 'numpy.ndarray': ...
```

[Data Science Integration](./data-science.md)

### Error Handling and Exceptions

Complete DB-API 2.0 exception hierarchy providing structured error handling for different types of database and interface errors.

```python { .api }
class Error(Exception): ...
class InterfaceError(Error): ...
class DatabaseError(Error): ...
class ProgrammingError(DatabaseError): ...
class OperationalError(DatabaseError): ...
# Additional exception classes...
```

[Error Handling and Exceptions](./error-handling.md)
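
Because `ProgrammingError` and `OperationalError` derive from `DatabaseError`, handlers should test the most specific class first. The sketch below illustrates that ordering. The function `classify_error` is hypothetical, and `sqlite3` is used as a stand-in because it exposes the same DB-API 2.0 exception names. Which class a given failure raises depends on the driver.

```python
import sqlite3

# Most specific classes first, so subclasses are not masked by their parents.
CHECK_ORDER = (
    "ProgrammingError",
    "OperationalError",
    "InterfaceError",
    "DatabaseError",
    "Error",
)


def classify_error(exc, dbapi):
    """Return the name of the most specific DB-API class matching `exc`.

    `dbapi` is any module exposing the DB-API 2.0 exception hierarchy.
    """
    for name in CHECK_ORDER:
        if isinstance(exc, getattr(dbapi, name)):
            return name
    return "unknown"


conn = sqlite3.connect(":memory:")
try:
    conn.execute("SELECT * FROM missing_table")
except sqlite3.Error as exc:
    label = classify_error(exc, sqlite3)
finally:
    conn.close()

print(label)
```

Passing `redshift_connector` as the `dbapi` argument should work the same way, assuming the module exposes all five names at its top level as the listing above suggests.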

### Data Types and Type Conversion

Comprehensive support for PostgreSQL and Redshift data types with Python object mapping, including arrays, JSON, geometric types, and date/time handling.

```python { .api }
# DB-API 2.0 Type Constructors
def Date(year: int, month: int, day: int) -> date: ...
def Time(hour: int, minute: int, second: int) -> time: ...
def Timestamp(year: int, month: int, day: int, hour: int, minute: int, second: int) -> datetime: ...
def Binary(value: bytes) -> bytes: ...

# PostgreSQL Type Classes
class PGJson: ...
class PGJsonb: ...
class PGEnum: ...
```

[Data Types and Type Conversion](./data-types.md)

### Database Metadata and Introspection

Schema introspection for retrieving metadata about tables, columns, primary keys, procedures, and other database objects.

```python { .api }
class Cursor:
    def get_tables(self, catalog: str = None, schema: str = None, table: str = None, types: list = None) -> tuple: ...
    def get_columns(self, catalog: str = None, schema: str = None, table: str = None, column: str = None) -> tuple: ...
    def get_primary_keys(self, catalog: str = None, schema: str = None, table: str = None) -> tuple: ...
    def get_procedures(self, catalog: str = None, schema: str = None, procedure: str = None) -> tuple: ...
```

[Database Metadata and Introspection](./metadata.md)

## Module Constants

```python { .api }
# DB-API 2.0 Constants
apilevel: str = "2.0"
threadsafety: int = 1
paramstyle: str = "format"

# Protocol and Configuration
DEFAULT_PROTOCOL_VERSION: int = 2

class ClientProtocolVersion(IntEnum):
    BASE_SERVER = 0
    EXTENDED_RESULT_METADATA = 1
    BINARY = 2

class DbApiParamstyle(Enum):
    QMARK = "qmark"
    NUMERIC = "numeric"
    NAMED = "named"
    FORMAT = "format"
    PYFORMAT = "pyformat"
```
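
The `paramstyle` value determines how placeholders are written in SQL text. As a rough illustration of the three positional styles defined by DB-API 2.0, the hypothetical helper below builds a placeholder list for a given style. It is a sketch for comparison only and is not part of the connector.

```python
def positional_placeholders(paramstyle, count):
    """Return a comma-separated placeholder list for a positional paramstyle.

    Hypothetical helper; "named" and "pyformat" need parameter names and
    are deliberately not handled here.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if paramstyle == "format":
        marks = ["%s"] * count
    elif paramstyle == "qmark":
        marks = ["?"] * count
    elif paramstyle == "numeric":
        marks = [f":{index}" for index in range(1, count + 1)]
    else:
        raise ValueError(f"unsupported positional paramstyle: {paramstyle!r}")
    return ", ".join(marks)


# With the module default ("format"), this reproduces the INSERT statement
# used in the Basic Usage section.
insert_sql = (
    "INSERT INTO book (bookname, author) "
    f"VALUES ({positional_placeholders('format', 2)})"
)
print(insert_sql)
```

The same call with `"qmark"` yields `?, ?` and with `"numeric"` yields `:1, :2`.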