0
# SQLAlchemy Integration
1
2
Multiple SQLAlchemy dialects for seamless integration with ORMs and SQL frameworks, supporting different execution modes and safety levels. Enables standard SQLAlchemy operations against diverse data sources through shillelagh adapters.
3
4
## Capabilities
5
6
### Available Dialects
7
8
Multiple SQLAlchemy dialects supporting different use cases and execution environments.
9
10
```python { .api }
11
# Available SQLAlchemy dialects:
12
# - shillelagh: Default APSW dialect
13
# - shillelagh.apsw: Explicit APSW dialect
14
# - shillelagh.safe: Safe mode dialect (restricted SQL)
15
# - gsheets: Google Sheets specific dialect
16
# - metricflow: dbt MetricFlow integration
17
# - shillelagh.multicorn2: PostgreSQL Multicorn integration
18
# - shillelagh.sqlglot: SQLGlot backend integration
19
```
20
21
### Engine Creation
22
23
Create SQLAlchemy engines with shillelagh dialects for ORM and Core operations.
24
25
```python { .api }
26
from sqlalchemy import create_engine
27
28
# Default shillelagh engine
29
engine = create_engine("shillelagh://")
30
31
# Safe mode engine (restricted SQL operations)
32
safe_engine = create_engine("shillelagh.safe://")
33
34
# Google Sheets specific engine
35
gsheets_engine = create_engine("gsheets://")
36
37
# dbt MetricFlow engine
38
metricflow_engine = create_engine("metricflow://")
39
```
40
41
### Dialect Configuration
42
43
Configure dialects with adapter-specific settings and connection parameters.
44
45
```python { .api }
46
from sqlalchemy import create_engine
47
48
# Engine with adapter configuration
49
engine = create_engine(
50
"shillelagh://",
51
connect_args={
52
'adapters': ['GSheetsAPI', 'CSVFile', 'GitHubAPI'],
53
'adapter_kwargs': {
54
'gsheetsapi': {
55
'service_account_file': '/path/to/credentials.json',
56
'subject': 'user@example.com'
57
},
58
'githubapi': {
59
'token': 'ghp_your_token_here'
60
}
61
}
62
}
63
)
64
```
65
66
### ORM Integration
67
68
Use shillelagh with SQLAlchemy ORM for object-relational mapping against data sources.
69
70
```python { .api }
71
from sqlalchemy import Column, Integer, String, create_engine
72
from sqlalchemy.ext.declarative import declarative_base
73
from sqlalchemy.orm import sessionmaker
74
75
Base = declarative_base()
76
77
class DataModel(Base):
78
"""SQLAlchemy model for external data source."""
79
80
__tablename__ = 'https://example.com/data.csv'
81
82
id = Column(Integer, primary_key=True)
83
name = Column(String)
84
value = Column(Integer)
85
86
# Create engine and session
87
engine = create_engine("shillelagh://")
88
Session = sessionmaker(bind=engine)
89
session = Session()
90
91
# Query using ORM
92
results = session.query(DataModel).filter(DataModel.value > 100).all()
93
```
94
95
### Core SQL Expression
96
97
Use SQLAlchemy Core for programmatic SQL construction with shillelagh backends.
98
99
```python { .api }
100
from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String, select
101
102
# Create engine
103
engine = create_engine("shillelagh://")
104
105
# Define table schema
106
metadata = MetaData()
107
data_table = Table(
108
'https://api.example.com/data.json',
109
metadata,
110
Column('id', Integer),
111
Column('name', String),
112
Column('category', String),
113
Column('score', Integer)
114
)
115
116
# Build and execute queries
117
with engine.connect() as conn:
118
# Select with filtering
119
stmt = select(data_table).where(data_table.c.score > 80)
120
results = conn.execute(stmt)
121
122
for row in results:
123
print(f"{row.name}: {row.score}")
124
```
125
126
## Dialect-Specific Features
127
128
### Default Shillelagh Dialect
129
130
Full-featured dialect supporting all shillelagh adapters and SQL operations.
131
132
```python { .api }
133
from sqlalchemy import create_engine
134
135
# Default dialect with all features
136
engine = create_engine("shillelagh://")
137
138
# Supports:
139
# - All SQL operations (SELECT, INSERT, UPDATE, DELETE)
140
# - All adapter types (API, file, memory)
141
# - Custom functions and expressions
142
# - Transaction support
143
```
144
145
### Safe Mode Dialect
146
147
Restricted dialect for secure environments with limited SQL operations.
148
149
```python { .api }
150
from sqlalchemy import create_engine
151
152
# Safe mode dialect
153
safe_engine = create_engine("shillelagh.safe://")
154
155
# Restrictions:
156
# - Read-only operations (SELECT only)
157
# - No DDL operations (CREATE, DROP, ALTER)
158
# - Limited function support
159
# - Enhanced security validations
160
```
161
162
Usage example:
163
164
```python
165
from sqlalchemy import create_engine, text
166
167
safe_engine = create_engine("shillelagh.safe://")
168
169
with safe_engine.connect() as conn:
170
# Allowed: SELECT operations
171
result = conn.execute(text("SELECT * FROM 'data.csv' WHERE id < 100"))
172
173
# Blocked: INSERT operations (would raise exception)
174
# conn.execute(text("INSERT INTO 'data.csv' VALUES (1, 'test')"))
175
```
176
177
### Google Sheets Dialect
178
179
Specialized dialect optimized for Google Sheets integration with enhanced features.
180
181
```python { .api }
182
from sqlalchemy import create_engine
183
184
# Google Sheets specific dialect
185
gsheets_engine = create_engine("gsheets://")
186
187
# Features:
188
# - Optimized for Google Sheets API
189
# - Built-in authentication handling
190
# - Sheet-specific SQL extensions
191
# - Range-based queries
192
```
193
194
### dbt MetricFlow Dialect
195
196
Integration with dbt MetricFlow semantic layer for metrics and dimensions.
197
198
```python { .api }
199
from sqlalchemy import create_engine
200
201
# dbt MetricFlow dialect
202
metricflow_engine = create_engine("metricflow://")
203
204
# Features:
205
# - Semantic layer integration
206
# - Metric definitions and calculations
207
# - Dimension queries
208
# - Time-based aggregations
209
```
210
211
## Usage Examples
212
213
### Basic SQLAlchemy Core Usage
214
215
```python
216
from sqlalchemy import create_engine, text
217
218
# Create shillelagh engine
219
engine = create_engine("shillelagh://")
220
221
# Execute raw SQL
222
with engine.connect() as conn:
223
result = conn.execute(text("""
224
SELECT country, AVG(population) as avg_pop
225
FROM 'https://example.com/countries.csv'
226
GROUP BY country
227
HAVING avg_pop > 1000000
228
ORDER BY avg_pop DESC
229
"""))
230
231
for row in result:
232
print(f"{row.country}: {row.avg_pop:,.0f}")
233
```
234
235
### ORM with External Data Sources
236
237
```python
238
from sqlalchemy import Column, Integer, String, Float, create_engine
239
from sqlalchemy.ext.declarative import declarative_base
240
from sqlalchemy.orm import sessionmaker
241
242
Base = declarative_base()
243
244
class WeatherData(Base):
245
"""Model for weather API data."""
246
247
__tablename__ = 'https://api.weather.com/current'
248
249
city = Column(String, primary_key=True)
250
temperature = Column(Float)
251
humidity = Column(Integer)
252
pressure = Column(Float)
253
254
# Configure engine with API credentials
255
engine = create_engine(
256
"shillelagh://",
257
connect_args={
258
'adapter_kwargs': {
259
'weatherapi': {'api_key': 'your_weather_api_key'}
260
}
261
}
262
)
263
264
# Create session and query
265
Session = sessionmaker(bind=engine)
266
session = Session()
267
268
# Find hot, humid cities
269
hot_humid = session.query(WeatherData)\
270
.filter(WeatherData.temperature > 30)\
271
.filter(WeatherData.humidity > 80)\
272
.order_by(WeatherData.temperature.desc())\
273
.all()
274
275
for city in hot_humid:
276
print(f"{city.city}: {city.temperature}°C, {city.humidity}% humidity")
277
```
278
279
### Multi-Source Joins
280
281
```python
282
from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String, select
283
284
engine = create_engine("shillelagh://")
285
metadata = MetaData()
286
287
# Define tables from different sources
288
users = Table(
289
'https://api.example.com/users',
290
metadata,
291
Column('id', Integer),
292
Column('name', String),
293
Column('department_id', Integer)
294
)
295
296
departments = Table(
297
'./data/departments.csv',
298
metadata,
299
Column('id', Integer),
300
Column('name', String),
301
Column('budget', Integer)
302
)
303
304
# Join data from API and CSV file
305
with engine.connect() as conn:
306
stmt = select(
307
users.c.name.label('user_name'),
308
departments.c.name.label('dept_name'),
309
departments.c.budget
310
).select_from(
311
users.join(departments, users.c.department_id == departments.c.id)
312
).where(departments.c.budget > 100000)
313
314
results = conn.execute(stmt)
315
316
for row in results:
317
print(f"{row.user_name} works in {row.dept_name} (Budget: ${row.budget:,})")
318
```
319
320
### Pandas Integration
321
322
```python
323
import pandas as pd
324
from sqlalchemy import create_engine
325
326
# Create engine
327
engine = create_engine("shillelagh://")
328
329
# Query directly to pandas DataFrame
330
df = pd.read_sql_query("""
331
SELECT
332
DATE(timestamp) as date,
333
COUNT(*) as events,
334
AVG(value) as avg_value
335
FROM 'https://api.metrics.com/events'
336
WHERE timestamp >= '2023-01-01'
337
GROUP BY DATE(timestamp)
338
ORDER BY date
339
""", engine)
340
341
print(df.head())
342
343
# Use DataFrame for further analysis
344
daily_stats = df.groupby('date').agg({
345
'events': 'sum',
346
'avg_value': 'mean'
347
}).reset_index()
348
```