# Amazon Redshift Python Connector

`redshift_connector` is a pure Python connector for Amazon Redshift that implements the Python Database API Specification 2.0 (PEP 249). It integrates with pandas and numpy and supports Redshift-specific features, including IAM authentication, identity provider (IdP) authentication, and Redshift-specific data types. The library supports Python 3.6 through 3.11 and is suited to data analytics workflows, ETL processes, business intelligence applications, and other Python applications that need direct access to Amazon Redshift databases.

## Package Information

- **Package Name**: redshift_connector
- **Language**: Python
- **Installation**: `pip install redshift_connector`
- **Optional Dependencies**: `pip install redshift_connector[full]` (includes pandas and numpy support)

## Core Imports

```python
import redshift_connector
```

Common patterns for working with the connector:

```python
from redshift_connector import connect, Connection, Cursor
from redshift_connector import Error, InterfaceError, ProgrammingError
```

## Basic Usage

```python
import redshift_connector

# Basic connection with username/password
conn = redshift_connector.connect(
    host='examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com',
    database='dev',
    user='awsuser',
    password='my_password'
)

cursor = conn.cursor()
cursor.execute("CREATE TEMP TABLE book(bookname varchar, author varchar)")
cursor.executemany("INSERT INTO book (bookname, author) VALUES (%s, %s)",
                   [('One Hundred Years of Solitude', 'Gabriel García Márquez'),
                    ('A Brief History of Time', 'Stephen Hawking')])

cursor.execute("SELECT * FROM book")
result = cursor.fetchall()
print(result)

# Clean up
cursor.close()
conn.close()
```

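In the example above, the cleanup calls are skipped if any statement raises. The following is a minimal sketch of one way to guarantee cleanup, using `contextlib.closing`, which only requires the `close()` methods shown above. The helper name `fetch_all_rows` is hypothetical, and the standard library's `sqlite3` module stands in for `redshift_connector` so the sketch runs without a cluster.

```python
import sqlite3
from contextlib import closing


def fetch_all_rows(connect, query):
    """Run `query` on a fresh connection; always close the cursor and connection.

    `connect` is any zero-argument callable returning a DB-API 2.0 connection.
    """
    with closing(connect()) as conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute(query)
            # Copy the rows out before the cursor is closed.
            return list(cursor.fetchall())


# Stand-in connection factory (sqlite3) so the example is runnable offline.
rows = fetch_all_rows(lambda: sqlite3.connect(":memory:"), "SELECT 1, 'ok'")
print(rows)
```

With Redshift, the factory would instead wrap a call such as `redshift_connector.connect(host=..., database=..., user=..., password=...)`.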
## Architecture

`redshift_connector` is organized around the components defined by the Python Database API Specification 2.0:

- **Connection Factory**: The `connect()` function creates `Connection` instances and accepts the full set of authentication options
- **Connection**: Database connection management with transaction control, prepared statements, and two-phase commit support
- **Cursor**: Query execution interface with result fetching, metadata operations, and data science integration
- **Authentication Layer**: Pluggable authentication system supporting 18+ identity providers and authentication methods
- **Type System**: PostgreSQL/Redshift data type support with Python object mapping
- **Error Hierarchy**: Complete DB-API 2.0 exception hierarchy for structured error handling

Together, these components give Python applications a single database access layer for Redshift, covering authentication, query execution, and type conversion.

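The transaction control mentioned for **Connection** can be sketched as a small helper that commits a batch of statements or rolls all of them back. This is an illustration under stated assumptions, not the connector's own code: `run_in_transaction` is a hypothetical name, it assumes the connection is not in autocommit mode, and `sqlite3` is used as a stand-in DB-API 2.0 driver. Note that `sqlite3` uses `?` placeholders, whereas the examples in this document use `%s`.

```python
import sqlite3


def run_in_transaction(conn, statements):
    """Execute (sql, params) pairs as one unit: commit on success, roll back on error."""
    cursor = conn.cursor()
    try:
        for sql, params in statements:
            cursor.execute(sql, params)
        conn.commit()
    except Exception:
        # Undo every statement executed since the last commit, then re-raise.
        conn.rollback()
        raise
    finally:
        cursor.close()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book(bookname TEXT, author TEXT)")
conn.commit()

try:
    run_in_transaction(conn, [
        ("INSERT INTO book VALUES (?, ?)", ("A Brief History of Time", "Stephen Hawking")),
        ("INSERT INTO missing_table VALUES (?)", ("x",)),  # fails: table does not exist
    ])
except sqlite3.Error:
    pass

# The first insert was rolled back together with the failing statement.
remaining = conn.execute("SELECT COUNT(*) FROM book").fetchone()[0]
print(remaining)
```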
## Capabilities

### Core Database Operations

Essential database connectivity functionality including connection establishment, query execution, result fetching, and transaction management. This forms the foundation of the DB-API 2.0 interface.

```python { .api }
def connect(
    user: str = None,
    database: str = None,
    password: str = None,
    host: str = None,
    port: int = None,
    # ... 60+ additional parameters
) -> Connection: ...

class Connection:
    def cursor(self) -> Cursor: ...
    def commit(self) -> None: ...
    def rollback(self) -> None: ...
    def close(self) -> None: ...

class Cursor:
    def execute(self, operation: str, args=None) -> 'Cursor': ...
    def fetchone(self) -> list | None: ...
    def fetchmany(self, num: int = None) -> tuple: ...
    def fetchall(self) -> tuple: ...
```

[Core Database Operations](./core-database.md)

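For large result sets, `fetchmany` lets a caller process rows in bounded batches instead of loading everything with `fetchall`. The generator below is a minimal sketch, assuming `fetchmany` returns an empty sequence once the result set is exhausted, as DB-API 2.0 specifies. The name `iter_batches` is hypothetical, and `sqlite3` stands in for `redshift_connector` so the example runs locally.

```python
import sqlite3


def iter_batches(cursor, batch_size=1000):
    """Yield successive batches of rows from an already-executed cursor."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:  # empty sequence: no more rows
            break
        yield batch


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE n(i INTEGER)")
cur.executemany("INSERT INTO n VALUES (?)", [(i,) for i in range(5)])
cur.execute("SELECT i FROM n ORDER BY i")

# Five rows read two at a time arrive as batches of 2, 2, and 1.
sizes = [len(batch) for batch in iter_batches(cur, batch_size=2)]
print(sizes)
```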
### Authentication and Security

Comprehensive authentication system supporting multiple identity providers, IAM roles, and security protocols. Includes support for SAML, OAuth2, JWT, and browser-based authentication flows.

```python { .api }
# IAM Authentication
conn = redshift_connector.connect(
    iam=True,
    cluster_identifier='my-cluster',
    db_user='myuser',
    # AWS credentials via profile, keys, or instance roles
)

# Identity Provider Authentication
conn = redshift_connector.connect(
    credentials_provider='AdfsCredentialsProvider',
    idp_host='example.com',
    # Additional IdP-specific parameters
)
```

[Authentication and Security](./authentication.md)

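Connection settings such as the cluster identifier are often supplied by the environment rather than hard-coded. The sketch below shows one possible way to assemble the IAM keyword arguments from the example above. The helper `iam_connect_kwargs` and the environment variable names are hypothetical; only the `iam`, `cluster_identifier`, `database`, and `db_user` parameters come from this document.

```python
import os

# Hypothetical environment variable names; rename to match your deployment.
ENV_TO_PARAM = {
    "REDSHIFT_CLUSTER_ID": "cluster_identifier",
    "REDSHIFT_DATABASE": "database",
    "REDSHIFT_DB_USER": "db_user",
}


def iam_connect_kwargs(env=None):
    """Build keyword arguments for an IAM connection from a mapping of settings."""
    env = os.environ if env is None else env
    missing = sorted(name for name in ENV_TO_PARAM if not env.get(name))
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    kwargs = {param: env[name] for name, param in ENV_TO_PARAM.items()}
    kwargs["iam"] = True
    return kwargs


example_env = {
    "REDSHIFT_CLUSTER_ID": "my-cluster",
    "REDSHIFT_DATABASE": "dev",
    "REDSHIFT_DB_USER": "myuser",
}
kwargs = iam_connect_kwargs(example_env)
print(kwargs)
```

The resulting dictionary would then be passed as `redshift_connector.connect(**kwargs)`, with AWS credentials resolved separately via profile, keys, or instance roles.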
### Data Science Integration

Native integration with pandas and numpy for efficient data transfer between Redshift and Python data science workflows. Supports DataFrame I/O and numpy array operations.

```python { .api }
class Cursor:
    def fetch_dataframe(self, num: int = None) -> 'pandas.DataFrame': ...
    def write_dataframe(self, df: 'pandas.DataFrame', table: str) -> None: ...
    def fetch_numpy_array(self, num: int = None) -> 'numpy.ndarray': ...
```

[Data Science Integration](./data-science.md)

### Error Handling and Exceptions

Complete DB-API 2.0 exception hierarchy providing structured error handling for different types of database and interface errors.

```python { .api }
class Error(Exception): ...
class InterfaceError(Error): ...
class DatabaseError(Error): ...
class ProgrammingError(DatabaseError): ...
class OperationalError(DatabaseError): ...
# Additional exception classes...
```

[Error Handling and Exceptions](./error-handling.md)

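Because the hierarchy is nested, handlers should test the most specific classes first; otherwise a `DatabaseError` check would also match a `ProgrammingError`. The following is a rough sketch of that ordering. The function `describe_failure` is hypothetical, and it works with any module that exposes the DB-API 2.0 exception names listed above; `sqlite3` is used here as a stand-in, and which class a given failure maps to can differ between drivers.

```python
import sqlite3


def describe_failure(dbapi, exc):
    """Classify `exc` against a DB-API 2.0 module's exception hierarchy."""
    # Subclasses of DatabaseError come first, then the broader classes.
    if isinstance(exc, dbapi.ProgrammingError):
        return "programming"
    if isinstance(exc, dbapi.OperationalError):
        return "operational"
    if isinstance(exc, dbapi.DatabaseError):
        return "database"
    if isinstance(exc, dbapi.InterfaceError):
        return "interface"
    if isinstance(exc, dbapi.Error):
        return "error"
    return "non-dbapi"


conn = sqlite3.connect(":memory:")
kind = None
try:
    conn.execute("SELECT * FROM missing_table")
except sqlite3.Error as exc:
    # sqlite3 reports a missing table as OperationalError.
    kind = describe_failure(sqlite3, exc)
print(kind)
```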
### Data Types and Type Conversion

Comprehensive support for PostgreSQL and Redshift data types with Python object mapping, including arrays, JSON, geometric types, and date/time handling.

```python { .api }
# DB-API 2.0 Type Constructors
def Date(year: int, month: int, day: int) -> date: ...
def Time(hour: int, minute: int, second: int) -> time: ...
def Timestamp(year: int, month: int, day: int, hour: int, minute: int, second: int) -> datetime: ...
def Binary(value: bytes) -> bytes: ...

# PostgreSQL Type Classes
class PGJson: ...
class PGJsonb: ...
class PGEnum: ...
```

[Data Types and Type Conversion](./data-types.md)

### Database Metadata and Introspection

Database schema introspection capabilities for retrieving metadata about tables, columns, procedures, and other database objects.

```python { .api }
class Cursor:
    def get_tables(self, catalog: str = None, schema: str = None, table: str = None, types: list = None) -> tuple: ...
    def get_columns(self, catalog: str = None, schema: str = None, table: str = None, column: str = None) -> tuple: ...
    def get_primary_keys(self, catalog: str = None, schema: str = None, table: str = None) -> tuple: ...
    def get_procedures(self, catalog: str = None, schema: str = None, procedure: str = None) -> tuple: ...
```

[Database Metadata and Introspection](./metadata.md)

## Module Constants

```python { .api }
# DB-API 2.0 Constants
apilevel: str = "2.0"
threadsafety: int = 1
paramstyle: str = "format"

# Protocol and Configuration
DEFAULT_PROTOCOL_VERSION: int = 2

class ClientProtocolVersion(IntEnum):
    BASE_SERVER = 0
    EXTENDED_RESULT_METADATA = 1
    BINARY = 2

class DbApiParamstyle(Enum):
    QMARK = "qmark"
    NUMERIC = "numeric"
    NAMED = "named"
    FORMAT = "format"
    PYFORMAT = "pyformat"
```
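
The five `DbApiParamstyle` values correspond to the placeholder syntaxes defined by PEP 249. As an illustration only, the sketch below renders the same `INSERT` statement in each style. `ParamStyle` is a local copy of the enum above so the example runs without the connector installed, and `placeholder` and `insert_sql` are hypothetical helpers, not part of the library.

```python
from enum import Enum


class ParamStyle(Enum):
    """Local mirror of the DbApiParamstyle values listed above."""
    QMARK = "qmark"
    NUMERIC = "numeric"
    NAMED = "named"
    FORMAT = "format"
    PYFORMAT = "pyformat"


def placeholder(style, position, name):
    """Return the PEP 249 placeholder for one parameter (position is 1-based)."""
    if style is ParamStyle.QMARK:
        return "?"
    if style is ParamStyle.NUMERIC:
        return f":{position}"
    if style is ParamStyle.NAMED:
        return f":{name}"
    if style is ParamStyle.FORMAT:
        return "%s"
    return f"%({name})s"  # PYFORMAT


def insert_sql(style, table, columns):
    """Build an INSERT statement for `columns` using the given paramstyle."""
    marks = ", ".join(placeholder(style, i, col) for i, col in enumerate(columns, start=1))
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({marks})"


for style in ParamStyle:
    print(style.value, "->", insert_sql(style, "book", ["bookname", "author"]))
```

With the default `paramstyle` of `"format"`, this produces the `%s` form used in the Basic Usage example.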