0
# Shillelagh
1
2
Shillelagh is a comprehensive Python library that enables querying diverse data sources using standard SQL syntax. Built on SQLite with APSW, it implements the Python DB API 2.0 specification and provides SQLAlchemy dialects for seamless integration. The library features an extensible adapter system supporting multiple data sources including CSV files, Google Sheets, GitHub APIs, JSON/XML endpoints, HTML tables, Pandas DataFrames, S3 objects, and system metrics.
3
4
## Package Information
5
6
- **Package Name**: shillelagh
7
- **Language**: Python
8
- **Installation**: `pip install shillelagh`
9
10
## Core Imports
11
12
```python
13
import shillelagh
14
```
15
16
For database connections:
17
18
```python
19
from shillelagh.backends.apsw.db import connect
20
```
21
22
For SQLAlchemy integration:
23
24
```python
25
from sqlalchemy import create_engine
26
27
# Create engine with shillelagh dialect
28
engine = create_engine("shillelagh://")
29
```
30
31
## Basic Usage
32
33
```python
34
from shillelagh.backends.apsw.db import connect
35
36
# Connect to shillelagh (in-memory database)
37
connection = connect(":memory:")
38
cursor = connection.cursor()
39
40
# Query a CSV file
41
cursor.execute("""
42
SELECT * FROM "https://example.com/data.csv"
43
WHERE column1 > 100
44
ORDER BY column2
45
""")
46
47
# Fetch results
48
rows = cursor.fetchall()
49
for row in rows:
50
print(row)
51
52
connection.close()
53
```
54
55
## Architecture
56
57
Shillelagh's architecture follows a layered approach:
58
59
- **DB API Layer**: Standard Python database interface (Connection, Cursor)
60
- **Backend System**: Multiple SQL backends (APSW/SQLite, Multicorn, SQLGlot)
61
- **Adapter Framework**: Extensible plugin system for data sources
62
- **Type System**: Robust field types with automatic inference
63
- **Query Engine**: SQL parsing, optimization, and execution
64
65
This design allows SQL queries against any supported data source while maintaining compatibility with standard Python database tools and ORMs.
66
67
## Capabilities
68
69
### Database Connection and Querying
70
71
Standard Python DB API 2.0 interface for executing SQL queries against various data sources. Supports transactions, cursors, and all standard database operations.
72
73
```python { .api }
74
def connect(path, adapters=None, adapter_kwargs=None, safe=False, isolation_level=None, apsw_connection_kwargs=None, schema="main"):
75
"""Create a database connection."""
76
77
class Connection:
78
def cursor(self): ...
79
def execute(self, operation, parameters=None): ...
80
def close(self): ...
81
def commit(self): ...
82
def rollback(self): ...
83
84
class Cursor:
85
def execute(self, operation, parameters=None): ...
86
def fetchone(self): ...
87
def fetchmany(self, size=None): ...
88
def fetchall(self): ...
89
def close(self): ...
90
```
91
92
[Database API](./database-api.md)
93
94
### Data Source Adapters
95
96
Extensible system for connecting to various data sources including APIs, files, and in-memory data. Each adapter translates SQL operations to source-specific queries.
97
98
```python { .api }
99
class Adapter:
100
def supports(self, uri, fast=True, **kwargs): ...
101
def parse_uri(self, uri): ...
102
def get_columns(self): ...
103
def get_data(self, bounds=None, order=None, **kwargs): ...
104
```
105
106
[Adapters](./adapters.md)
107
108
### Type System and Field Definitions
109
110
Comprehensive type system for data conversion and validation with automatic type inference and custom field definitions.
111
112
```python { .api }
113
class Field:
114
def parse(self, value): ...
115
def format(self, value): ...
116
def quote(self, value): ...
117
118
class Boolean(Field): ...
119
class Integer(Field): ...
120
class Float(Field): ...
121
class String(Field): ...
122
class DateTime(Field): ...
123
```
124
125
[Type System](./type-system.md)
126
127
### SQLAlchemy Integration
128
129
Multiple SQLAlchemy dialects for integration with ORMs and SQL frameworks, supporting different execution modes and safety levels.
130
131
```python { .api }
132
# SQLAlchemy dialects available:
133
# - shillelagh (default APSW dialect)
134
# - shillelagh.safe (safe mode dialect)
135
# - gsheets (Google Sheets specific)
136
# - metricflow (dbt MetricFlow)
137
```
138
139
[SQLAlchemy Integration](./sqlalchemy-integration.md)
140
141
### Command Line Interface
142
143
Interactive SQL shell with syntax highlighting, completion, and configuration support for querying data sources from the command line.
144
145
```python { .api }
146
def main():
147
"""Main CLI entry point."""
148
```
149
150
[CLI Interface](./cli-interface.md)
151
152
### SQL Functions
153
154
Custom SQL functions available in queries for utility operations, metadata inspection, and data manipulation.
155
156
```python { .api }
157
def upgrade(target_version):
158
"""Upgrade shillelagh to specified version."""
159
160
def sleep(seconds):
161
"""Sleep for specified number of seconds."""
162
163
def get_metadata(uri):
164
"""Get metadata about a data source."""
165
166
def version():
167
"""Get current shillelagh version."""
168
169
def date_trunc(value, unit):
170
"""Truncate datetime to specified unit (year, quarter, month, week, day, hour, minute, second)."""
171
```
172
173
## Exception Hierarchy
174
175
```python { .api }
176
class Warning(Exception):
177
"""Important warnings like data truncations while inserting."""
178
179
class Error(Exception):
180
"""Base exception class."""
181
182
class InterfaceError(Error):
183
"""Errors related to the database interface."""
184
185
class DatabaseError(Error):
186
"""Errors related to the database."""
187
188
class DataError(DatabaseError):
189
"""Errors due to problems with processed data."""
190
191
class OperationalError(DatabaseError):
192
"""Errors related to database operation."""
193
194
class IntegrityError(DatabaseError):
195
"""Raised when relational integrity is affected."""
196
197
class InternalError(DatabaseError):
198
"""Raised when database encounters internal error."""
199
200
class ProgrammingError(DatabaseError):
201
"""Raised for programming errors."""
202
203
class NotSupportedError(DatabaseError):
204
"""Raised when method or API is not supported."""
205
206
class ImpossibleFilterError(Error):
207
"""Raised when condition impossible to meet is found."""
208
209
class UnauthenticatedError(InterfaceError):
210
"""Raised when user needs to authenticate."""
211
```