# Column Selectors

Selectors pick columns by name, pattern, data type, or predicate, and combine with logical operators, which makes them well suited to wide datasets.

## Capabilities

### Basic Selection

Select all columns, or specific columns by name.

```python { .api }
def all():
    """
    Select all columns.

    Returns:
        Selector for all columns
    """

def c(*names):
    """
    Select columns by name.

    Parameters:
    - *names: column names to select

    Returns:
        Selector for named columns
    """

def cols(*names):
    """
    Alias for c() - select columns by name.

    Parameters:
    - *names: column names to select

    Returns:
        Selector for named columns
    """
```

**Usage Examples:**

```python
import ibis
from ibis import selectors as s

# Hypothetical unbound table, defined only so the examples have something to select from
table = ibis.table(dict(name='string', age='int64', salary='float64'), name='employees')

# Select all columns
table.select(s.all())

# Select specific columns
table.select(s.c('name', 'age', 'salary'))

# Same as above
table.select(s.cols('name', 'age', 'salary'))
```

### Pattern Matching

Select columns based on name patterns.

```python { .api }
def matches(pattern):
    """
    Select columns matching regex pattern.

    Parameters:
    - pattern: str, regular expression pattern

    Returns:
        Selector for columns matching pattern
    """

def startswith(prefix):
    """
    Select columns starting with prefix.

    Parameters:
    - prefix: str, prefix to match

    Returns:
        Selector for columns with matching prefix
    """

def endswith(suffix):
    """
    Select columns ending with suffix.

    Parameters:
    - suffix: str, suffix to match

    Returns:
        Selector for columns with matching suffix
    """

def contains(substring):
    """
    Select columns containing substring.

    Parameters:
    - substring: str, substring to match

    Returns:
        Selector for columns containing substring
    """
```

**Usage Examples:**

```python
from ibis import selectors as s

# Regex pattern matching
table.select(s.matches(r'.*_id$'))  # Columns ending with '_id'

# Prefix matching
table.select(s.startswith('sales_'))  # sales_2021, sales_2022, etc.

# Suffix matching
table.select(s.endswith('_total'))  # revenue_total, cost_total, etc.

# Substring matching
table.select(s.contains('temp'))  # temperature, temporary, template, etc.
```

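To see what each name-based selector would pick, the matching rules can be mimicked with plain Python string methods and `re`. This is only a sketch of the semantics: the column list is hypothetical, and it assumes `matches` behaves like `re.search` (a match anywhere in the name), in which case a leading `.*` in the pattern is optional.

```python
import re

# Hypothetical column names, used only for illustration.
columns = ["customer_id", "sales_2021", "sales_2022", "revenue_total", "temperature"]

def pick(predicate):
    """Return the column names for which predicate(name) is true, in order."""
    return [name for name in columns if predicate(name)]

by_regex = pick(lambda n: re.search(r"_id$", n) is not None)  # like s.matches
by_prefix = pick(lambda n: n.startswith("sales_"))            # like s.startswith
by_suffix = pick(lambda n: n.endswith("_total"))              # like s.endswith
by_substring = pick(lambda n: "temp" in n)                    # like s.contains

print(by_regex)      # -> ['customer_id']
print(by_prefix)     # -> ['sales_2021', 'sales_2022']
print(by_suffix)     # -> ['revenue_total']
print(by_substring)  # -> ['temperature']
```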
### Type-Based Selection

Select columns based on their data types.

```python { .api }
def numeric():
    """
    Select numeric columns (integers and floats).

    Returns:
        Selector for numeric columns
    """

def of_type(*types):
    """
    Select columns of specific types.

    Parameters:
    - *types: DataType objects or type strings

    Returns:
        Selector for columns of specified types
    """

def categorical():
    """
    Select categorical/string columns.

    Returns:
        Selector for categorical columns
    """

def temporal():
    """
    Select temporal columns (date, time, timestamp).

    Returns:
        Selector for temporal columns
    """
```

**Usage Examples:**

```python
from ibis import selectors as s

# Select all numeric columns
table.select(s.numeric())

# Select specific types
table.select(s.of_type('string', 'int64'))

# Select string columns
table.select(s.categorical())

# Select date/time columns
table.select(s.temporal())
```

### Conditional Selection

Select columns based on a predicate over each column's metadata.

```python { .api }
def where(predicate):
    """
    Select columns matching a predicate function.

    Parameters:
    - predicate: callable that takes a column expression and returns a plain bool

    Returns:
        Selector for columns matching predicate
    """
```

**Usage Examples:**

```python
from ibis import selectors as s

# The predicate receives a column expression and must return a plain bool,
# so it can inspect metadata such as the name and type, but not the data.
def is_nullable(col):
    return col.type().nullable

table.select(s.where(is_nullable))

# Complex predicate: numeric columns whose name marks them as sales figures
def numeric_sales(col):
    return col.type().is_numeric() and col.get_name().startswith('sales_')

table.select(s.where(numeric_sales))
```

### Selector Combinations

Combine selectors using logical operations.

```python { .api }
# Logical operations on selectors
selector1 & selector2  # AND - columns matching both selectors
selector1 | selector2  # OR - columns matching either selector
~selector              # NOT - columns not matching selector
```

**Usage Examples:**

```python
from ibis import selectors as s

# Combine selectors with AND
numeric_sales = s.numeric() & s.startswith('sales_')
table.select(numeric_sales)

# Combine with OR
id_or_name = s.endswith('_id') | s.contains('name')
table.select(id_or_name)

# Negate selector
non_numeric = ~s.numeric()
table.select(non_numeric)

# Complex combinations
important_cols = (
    s.c('id', 'name') |  # Always include id and name
    (s.numeric() & ~s.contains('temp'))  # Numeric but not temporary
)
table.select(important_cols)
```

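The three operators can be pictured as set operations over column names: `&` is intersection, `|` is union, and `~` is the complement within the table's columns. A minimal sketch in plain Python, using a hypothetical schema; it models the semantics only and is not how ibis implements selectors.

```python
# Hypothetical schema: column name -> type string.
schema = {
    "id": "int64",
    "name": "string",
    "sales_2022": "float64",
    "temp_reading": "float64",
    "notes": "string",
}

all_cols = set(schema)
numeric = {c for c, t in schema.items() if t in ("int64", "float64")}
has_temp = {c for c in schema if "temp" in c}
named = {"id", "name"}

# Mirrors: s.c('id', 'name') | (s.numeric() & ~s.contains('temp'))
selected = named | (numeric & (all_cols - has_temp))

# Report in schema order, as a table would.
result = [c for c in schema if c in selected]
print(result)  # -> ['id', 'name', 'sales_2022']
```

Here `temp_reading` is numeric but excluded by the negated `contains`, while `id` and `name` are kept by the union regardless of type.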
### Functional Application

Apply functions across selected columns.

```python { .api }
def across(selector, func, names=None):
    """
    Apply function across selected columns.

    Parameters:
    - selector: column selector
    - func: function to apply to each selected column
    - names: naming pattern for result columns, e.g. '{col}_avg'

    Returns:
        Expression that expands to one result column per selected column
    """
```

**Usage Examples:**

```python
from ibis import selectors as s

# Aggregate every numeric column; '{col}' is replaced by the input column name
table.agg(
    s.across(s.numeric(), lambda x: x.mean(), names='{col}_avg')
)

# Multiple transformations; mutate leaves the other columns (id, name, ...) unchanged
table.mutate(
    s.across(s.numeric(), lambda x: x.fillna(0)),  # Fill nulls in numeric
    s.across(s.categorical(), lambda x: x.upper()),  # Uppercase strings
)
```

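The `names` argument is a format string. As a rough sketch of how result column names would be derived, assuming the placeholders are `{col}` (the input column name) and `{fn}` (the label of the applied function when functions are passed as a dict), the expansion amounts to ordinary `str.format`. The column names and function label below are hypothetical.

```python
# Hypothetical selected columns and function label.
selected = ["price", "quantity"]
fn_name = "mean"

# Pattern using only the column name.
suffix_names = ["{col}_avg".format(col=col, fn=fn_name) for col in selected]

# Pattern using both the function label and the column name.
prefix_names = ["{fn}_{col}".format(col=col, fn=fn_name) for col in selected]

print(suffix_names)  # -> ['price_avg', 'quantity_avg']
print(prefix_names)  # -> ['mean_price', 'mean_quantity']
```

A positional placeholder such as `'{}_avg'` would not work under this assumption, because the values are supplied by keyword.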
### Conditional Column Operations

Build row filters by applying a condition to the values of every selected column.

```python { .api }
def if_any(selector, predicate):
    """
    True for rows where the predicate holds for any selected column.

    Parameters:
    - selector: selector choosing the columns to test
    - predicate: callable applied to each selected column, returning a boolean expression

    Returns:
        Boolean expression (OR of the per-column results) for use in filter()
    """

def if_all(selector, predicate):
    """
    True for rows where the predicate holds for all selected columns.

    Parameters:
    - selector: selector choosing the columns to test
    - predicate: callable applied to each selected column, returning a boolean expression

    Returns:
        Boolean expression (AND of the per-column results) for use in filter()
    """
```

**Usage Examples:**

```python
from ibis import selectors as s

# Keep rows where any revenue, profit, or income column is positive
flexible = s.if_any(
    s.contains('revenue') | s.contains('profit') | s.contains('income'),
    lambda col: col > 0
)

# Keep rows where every numeric 2023 sales column is non-null
strict = s.if_all(
    s.numeric() & s.startswith('sales_') & s.endswith('_2023'),
    lambda col: col.notnull()
)

table.filter(flexible)
table.filter(strict)
```

### Common Selection Patterns

Frequently used selector combinations.

**Usage Examples:**

```python
from ibis import selectors as s

# Financial columns
financial = s.matches(r'.*(revenue|profit|cost|price).*')

# ID columns
ids = s.endswith('_id') | s.endswith('_key') | s.matches(r'^id$')

# Metrics (numeric, not IDs)
metrics = s.numeric() & ~ids

# Clean dataset selection
clean_data = (
    ids |  # Always keep identifiers
    # Selectors see only the schema, not the data, so this drops
    # columns of null type rather than testing values for nulls
    s.where(lambda col: not col.type().is_null())
)

# Time series columns
time_series = s.temporal() | s.matches(r'.*_(date|time|timestamp).*')

table.select(clean_data)
```
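
Before relying on name-based patterns like these, it can help to check them against a sample of real column names. The sketch below does that with the standard `re` module on a hypothetical column list; it assumes `matches` uses search semantics, so the surrounding `.*` in the patterns above is left out.

```python
import re

# Hypothetical column names, used only for illustration.
columns = ["id", "order_id", "customer_key", "unit_price",
           "total_revenue", "created_date", "notes"]

def is_id(name):
    # Mirrors: s.endswith('_id') | s.endswith('_key') | s.matches(r'^id$')
    return (name.endswith("_id") or name.endswith("_key")
            or re.search(r"^id$", name) is not None)

def is_financial(name):
    return re.search(r"revenue|profit|cost|price", name) is not None

def looks_temporal(name):
    return re.search(r"_(date|time|timestamp)", name) is not None

id_cols = [c for c in columns if is_id(c)]
financial_cols = [c for c in columns if is_financial(c)]
temporal_cols = [c for c in columns if looks_temporal(c)]

print(id_cols)         # -> ['id', 'order_id', 'customer_key']
print(financial_cols)  # -> ['unit_price', 'total_revenue']
print(temporal_cols)   # -> ['created_date']
```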