0
# Expression System
1
2
datatable's expression system provides a powerful and flexible way to reference columns and build complex data transformations using f and g objects.
3
4
## Capabilities
5
6
### Column References
7
8
Primary objects for referencing columns in Frame operations and building expressions.
9
10
```python { .api }
11
f: object
12
"""
13
Primary column reference object for selecting and transforming columns.
14
Used in most datatable operations for column selection and expressions.
15
"""
16
17
g: object
18
"""
19
Secondary column reference object primarily used in join operations
20
to distinguish between columns from different frames.
21
"""
22
```
23
24
### Expression Object
25
26
```python { .api }
27
class FExpr:
28
"""
29
Expression object representing column operations and transformations.
30
Created automatically when using f/g objects with operators.
31
"""
32
33
def __add__(self, other): ...
34
def __sub__(self, other): ...
35
def __mul__(self, other): ...
36
def __truediv__(self, other): ...
37
def __mod__(self, other): ...
38
def __pow__(self, other): ...
39
40
def __eq__(self, other): ...
41
def __ne__(self, other): ...
42
def __lt__(self, other): ...
43
def __le__(self, other): ...
44
def __gt__(self, other): ...
45
def __ge__(self, other): ...
46
47
def __and__(self, other): ...
48
def __or__(self, other): ...
49
def __invert__(self): ...
50
```
51
52
### Grouping and Joining Helpers
53
54
```python { .api }
55
def by(*args):
56
"""
57
Create grouping specification for group-by operations.
58
59
Parameters:
60
- *args: Column expressions or names to group by
61
62
Returns:
63
Grouping object for use in Frame operations
64
"""
65
66
def join(frame, on=None, how='inner'):
67
"""
68
Create join specification for joining frames.
69
70
Parameters:
71
- frame: Frame to join with
72
- on: Columns to join on (auto-detected if None)
73
- how: Join type ('inner', 'left', 'right', 'outer')
74
75
Returns:
76
Join object for use in Frame operations
77
"""
78
```
79
80
### Update Operations
81
82
```python { .api }
83
def update(**kwargs):
84
"""
85
Create update specification for adding or modifying columns.
86
87
Parameters:
88
- **kwargs: column_name=expression pairs
89
90
Returns:
91
Update object for use in Frame operations
92
"""
93
```
94
95
## Expression Examples
96
97
### Basic Column Selection
98
99
```python
100
import datatable as dt
101
from datatable import f, g
102
103
DT = dt.Frame({
104
'A': [1, 2, 3, 4, 5],
105
'B': [10, 20, 30, 40, 50],
106
'C': ['x', 'y', 'z', 'x', 'y']
107
})
108
109
# Column selection
110
DT[:, f.A] # Select column A
111
DT[:, f[0]] # Select first column by index
112
DT[:, f['A']] # Select column by name string
113
DT[:, [f.A, f.B]] # Select multiple columns
114
DT[:, f.A:f.C] # Select column range
115
DT[:, f[:]] # Select all columns
116
```
117
118
### Mathematical Expressions
119
120
```python
121
# Arithmetic operations
122
DT[:, f.A + f.B] # Add columns
123
DT[:, f.A * 2] # Multiply by scalar
124
DT[:, f.A / f.B] # Division
125
DT[:, f.A ** 2] # Power
126
DT[:, f.A % 3] # Modulo
127
128
# Create new columns with expressions
129
DT[:, dt.update(D=f.A + f.B, E=f.A * f.B)]
130
```
131
132
### Conditional Expressions
133
134
```python
135
# Boolean operations
136
DT[f.A > 2, :] # Filter rows
137
DT[f.A > f.B, :] # Compare columns
138
DT[(f.A > 2) & (f.B < 40), :] # Logical AND
139
DT[(f.A < 2) | (f.A > 4), :] # Logical OR
140
DT[~(f.A == 3), :] # Logical NOT
141
142
# Conditional assignment
143
DT[:, dt.update(flag=dt.ifelse(f.A > 3, "high", "low"))]
144
```
145
146
### String Operations
147
148
```python
149
# String operations using dt.str module
150
DT[:, dt.str.len(f.C)] # String length
151
DT[:, dt.str.slice(f.C, 0, 1)] # String slicing
152
153
# String comparisons
154
DT[f.C == 'x', :] # String equality
155
DT[dt.re.match(f.C, '[xy]'), :] # Regex matching
156
```
157
158
### Aggregation with Expressions
159
160
```python
161
# Basic aggregations
162
DT[:, dt.sum(f.A)] # Sum of column A
163
DT[:, dt.mean(f.B)] # Mean of column B
164
DT[:, dt.count()] # Row count
165
166
# Group-by operations
167
DT[:, dt.sum(f.A), dt.by(f.C)] # Sum A grouped by C
168
DT[:, [dt.sum(f.A), dt.mean(f.B)], dt.by(f.C)] # Multiple aggregations
169
170
# Multiple grouping columns
171
DT[:, dt.sum(f.A), dt.by(f.C, f.A > 2)]
172
```
173
174
### Join Operations with g
175
176
```python
177
DT2 = dt.Frame({
178
'C': ['x', 'y', 'z'],
179
'X': [100, 200, 300]
180
})
181
182
# Join using g to reference right frame columns
183
result = DT[:, [f.A, f.B, g.X], dt.join(DT2)]
184
185
# Join with conditions
186
result = DT[:, :, dt.join(DT2, on=f.C==g.C)]
187
```
188
189
### Time Series Expressions
190
191
```python
192
# Time component extraction
193
DT_time = dt.Frame({
194
'date': [dt.datetime(2023, 1, 1), dt.datetime(2023, 6, 15)]
195
})
196
197
DT_time[:, dt.time.year(f.date)] # Extract year
198
DT_time[:, dt.time.month(f.date)] # Extract month
199
DT_time[:, dt.time.day(f.date)] # Extract day
200
```
201
202
### Mathematical Functions
203
204
```python
205
# Mathematical functions
206
DT[:, dt.abs(f.A - 3)] # Absolute value
207
DT[:, dt.exp(f.A)] # Exponential
208
DT[:, dt.log(f.A)] # Natural logarithm
209
DT[:, dt.sqrt(f.A)] # Square root
210
DT[:, dt.isna(f.A)] # Check for missing values
211
```
212
213
### Row-wise Operations
214
215
```python
216
# Row-wise operations across columns
217
DT[:, dt.rowsum(f.A, f.B)] # Row-wise sum
218
DT[:, dt.rowmean(f.A, f.B)] # Row-wise mean
219
DT[:, dt.rowmax(f.A, f.B)] # Row-wise maximum
220
DT[:, dt.rowcount(f.A, f.B)] # Row-wise count of non-missing
221
```
222
223
### Complex Expressions
224
225
```python
226
# Chained operations
227
result = DT[f.A > dt.mean(f.A),
228
[f.A, f.B, dt.update(ratio=f.A/f.B)]]
229
230
# Conditional aggregations
231
DT[:, dt.sum(dt.ifelse(f.A > 3, f.B, 0)), dt.by(f.C)]
232
233
# Window functions with shift
234
DT[:, dt.update(prev_A=dt.shift(f.A, 1)), dt.by(f.C)]
235
```
236
237
## Types
238
239
### Expression Types
240
241
```python { .api }
242
class Expr:
243
"""Base expression class for column operations"""
244
pass
245
```