# Ibis Framework

Ibis is a portable Python dataframe library that provides a unified API for data analysis across 20+ backends, including DuckDB, PostgreSQL, BigQuery, Snowflake, and Spark. Developers write dataframe expressions once and execute them on any supported backend, which eases the move from local development to production deployment.

## Package Information

- **Package Name**: ibis-framework
- **Language**: Python
- **Installation**: `pip install ibis-framework`

## Core Imports

```python
import ibis
```

Common patterns for working with specific backends:

```python
import ibis

# DuckDB backend (default)
con = ibis.duckdb.connect()

# PostgreSQL backend
con = ibis.postgres.connect(
    user="postgres",
    password="password",
    host="localhost",
    database="mydb",
)

# BigQuery backend
con = ibis.bigquery.connect(project_id="my-project")
```

## Basic Usage

```python
import ibis
import pandas as pd

# Connect to DuckDB (default backend)
con = ibis.duckdb.connect()

# Create a table from a pandas DataFrame
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'salary': [50000, 60000, 70000]
})
employees = con.create_table('employees', df)

# Build expressions using the unified API
result = (
    employees
    .filter(employees.age > 25)
    .select(employees.name, employees.salary)
    .order_by(employees.salary.desc())
)

# Execute and get results
print(result.to_pandas())

# Switch to a different backend with the same expressions
pg_con = ibis.postgres.connect(...)
pg_employees = pg_con.table('employees')
same_result = (
    pg_employees
    .filter(pg_employees.age > 25)
    .select(pg_employees.name, pg_employees.salary)
    .order_by(pg_employees.salary.desc())
)
```

## Architecture

Ibis uses a lazy evaluation system with backend abstraction:

- **Expression Layer**: Unified API that builds computation graphs independent of backend
- **Backend Layer**: 20+ backend implementations that compile expressions to backend-specific queries
- **Type System**: Rich type system ensuring consistent behavior across backends
- **Configuration**: Global and backend-specific configuration options

The unified API allows writing portable data analysis code that can run on local engines (DuckDB, Polars), traditional databases (PostgreSQL, MySQL), cloud data warehouses (BigQuery, Snowflake), and distributed systems (Spark, Trino).
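
To make the split between the expression layer and the backend layer concrete, the following is a toy sketch using only the Python standard library. It is not Ibis code and does not reflect Ibis internals: the `Table` and `Filter` classes and the `compile_to_sqlite` function are hypothetical names invented for illustration. It shows the general idea of building an expression graph first and compiling it to a backend query only when results are needed.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    """Leaf node: a reference to a named table. Holds no data."""
    name: str

    def filter(self, predicate: str) -> "Filter":
        return Filter(self, predicate)


@dataclass(frozen=True)
class Filter:
    """Node that records a filter; nothing runs when it is created."""
    source: "Table | Filter"
    predicate: str

    def filter(self, predicate: str) -> "Filter":
        return Filter(self, predicate)


def compile_to_sqlite(expr) -> str:
    """Hypothetical backend layer: turn the expression graph into SQLite SQL."""
    if isinstance(expr, Table):
        return f"SELECT * FROM {expr.name}"
    if isinstance(expr, Filter):
        inner = compile_to_sqlite(expr.source)
        return f"SELECT * FROM ({inner}) WHERE {expr.predicate}"
    raise TypeError(f"unsupported node: {expr!r}")


# Building the expression touches no database.
expr = Table("employees").filter("age > 25").filter("salary < 70000")
sql = compile_to_sqlite(expr)

# Only at this point does a backend execute anything.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (name TEXT, age INTEGER, salary INTEGER)")
con.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Alice", 25, 50000), ("Bob", 30, 60000), ("Charlie", 35, 70000)],
)
rows = con.execute(sql).fetchall()
print(sql)
print(rows)
```

Swapping the backend in this sketch would mean writing a second compile function for another SQL dialect while leaving the expression-building code untouched, which is the portability property described above.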

## Capabilities

### Table Construction and Data I/O

Core functions for creating table expressions from various data sources including in-memory data, files (CSV, Parquet, JSON), and database connections.

```python { .api }
def table(schema=None, name=None, catalog=None, database=None): ...
def memtable(data, /, *, columns=None, schema=None, name=None): ...
def read_csv(path, **kwargs): ...
def read_parquet(path, **kwargs): ...
def read_json(path, **kwargs): ...
# resource is a connection URL or file path
def connect(resource, **kwargs): ...
```

[Table Construction and I/O](./table-construction.md)

### Expression Building and Computation

Fundamental expression building blocks for creating scalar values, arrays, structs, and complex computations with support for parameters and deferred expressions.

```python { .api }
def literal(value, type=None): ...
def null(type=None): ...
def array(values, type=None): ...
def struct(mapping): ...
def param(type): ...
def case(): ...
def ifelse(condition, true_expr, false_expr): ...
```

[Expression Building](./expressions.md)

### Table Operations and Transformations

Comprehensive table operations including filtering, selection, aggregation, joins, sorting, and window functions for data transformation and analysis.

```python { .api }
# Table methods
table.select(*exprs): ...
table.filter(predicates): ...
table.group_by(*exprs): ...
table.aggregate(**kwargs): ...
table.join(other, predicates): ...
table.order_by(*exprs): ...
table.limit(n): ...
```

[Table Operations](./table-operations.md)

### Temporal Functions and Date/Time Operations

Extensive date, time, and timestamp functionality including construction, arithmetic, formatting, and timezone handling for temporal data analysis.

```python { .api }
def date(year, month, day): ...
def time(hour, minute, second): ...
def timestamp(year, month, day, hour, minute, second): ...
def now(): ...
def today(): ...
def interval(**kwargs): ...
```

[Temporal Operations](./temporal.md)

### Aggregation and Window Functions

Statistical aggregation functions and window operations including ranking, cumulative calculations, and frame-based computations for advanced analytics.

```python { .api }
def sum(arg): ...
def mean(arg): ...
def count(arg): ...
def row_number(): ...
def rank(): ...
def dense_rank(): ...
def window(**kwargs): ...
```

[Aggregation and Windows](./aggregation-windows.md)

### Column Selectors

Flexible column selection system with pattern matching, type-based selection, and logical combinations for working with wide datasets.

```python { .api }
selectors.all(): ...
selectors.numeric(): ...
selectors.matches(pattern): ...
selectors.startswith(prefix): ...
selectors.of_type(*types): ...
```

[Column Selectors](./selectors.md)

### Backend Management

Backend connection, configuration, and management functions for working with different data processing engines and databases.

```python { .api }
def get_backend(table): ...
def set_backend(backend): ...
def list_backends(): ...
backend.connect(**kwargs): ...
backend.compile(expr): ...
```

[Backend Management](./backends.md)

### User-Defined Functions

Comprehensive UDF system supporting scalar, aggregate, and analytic functions with type safety and backend compatibility.

```python { .api }
@ibis.udf.scalar(signature)
def my_function(arg): ...

@ibis.udf.aggregate(signature)
def my_aggregate(arg): ...
```

[User-Defined Functions](./udfs.md)

### SQL Integration

Bidirectional SQL integration allowing parsing SQL into expressions and compiling expressions to SQL with backend-specific optimizations.

```python { .api }
def parse_sql(sql, dialect=None): ...
def to_sql(expr, dialect=None): ...
def decompile(expr): ...
```

[SQL Integration](./sql-integration.md)

### Configuration and Options

Global and backend-specific configuration system for controlling behavior, output formatting, and performance optimizations.

```python { .api }
ibis.options.sql.default_limit = 10000
# interactive is a boolean flag, not a nested options group
ibis.options.interactive = True
ibis.options.repr.interactive.max_rows = 20
```

[Configuration](./configuration.md)

## Types

Core data types and type system components.

```python { .api }
# Type constructors
dtype(type_spec): DataType
infer_dtype(value): DataType
infer_schema(data): Schema

# Common types
int64(): DataType
float64(): DataType
string(): DataType
boolean(): DataType
timestamp(): DataType
array(value_type): DataType
struct(fields): DataType
```

## Core Classes

Fundamental classes for working with Ibis expressions and data structures.

```python { .api }
class DataType:
    """Base class for all Ibis data types"""

class Expr:
    """Base class for all Ibis expressions"""

class Value(Expr):
    """Base class for value expressions (scalars, arrays, structs, etc.)"""

class Scalar(Value):
    """Scalar value expressions (single values)"""

class Column(Value):
    """Column expressions referencing table columns"""
```

## Common Exceptions

```python { .api }
class IbisError(Exception):
    """Base exception for all Ibis errors"""

class IbisInputError(IbisError):
    """Raised for invalid input arguments"""

class IbisTypeError(IbisError):
    """Raised for type-related errors"""
```