or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

Files

docs

aggregation-windows.mdbackends.mdconfiguration.mdexpressions.mdindex.mdselectors.mdsql-integration.mdtable-construction.mdtable-operations.mdtemporal.mdudfs.md

table-operations.mddocs/

0

# Table Operations and Transformations

1

2

Comprehensive table operations including filtering, selection, aggregation, joins, sorting, and transformations for data analysis.

3

4

## Capabilities

5

6

### Column Selection

7

8

Select specific columns or computed expressions from tables.

9

10

```python { .api }

11

def select(*exprs):

12

"""

13

Select columns or computed expressions.

14

15

Parameters:

16

- *exprs: column names, expressions, or keyword expressions

17

18

Returns:

19

Table with selected columns

20

"""

21

```

22

23

**Usage Examples:**

24

```python

25

# Select specific columns

26

result = table.select('name', 'age')

27

28

# Select with expressions

29

result = table.select(

30

'name',

31

age_next_year=table.age + 1,

32

is_adult=table.age >= 18

33

)

34

35

# Select all columns

36

result = table.select('*')

37

```

38

39

### Row Filtering

40

41

Filter rows based on boolean predicates.

42

43

```python { .api }

44

def filter(*predicates):

45

"""

46

Filter rows based on predicates.

47

48

Parameters:

49

- *predicates: boolean expressions

50

51

Returns:

52

Table with filtered rows

53

"""

54

```

55

56

**Usage Examples:**

57

```python

58

# Single condition

59

result = table.filter(table.age > 25)

60

61

# Multiple conditions (AND)

62

result = table.filter(table.age > 25, table.salary > 50000)

63

64

# Complex conditions

65

result = table.filter(

66

(table.age > 25) & (table.department == 'Engineering')

67

)

68

```

69

70

### Grouping and Aggregation

71

72

Group rows and compute aggregate statistics.

73

74

```python { .api }

75

def group_by(*exprs):

76

"""

77

Group table by expressions.

78

79

Parameters:

80

- *exprs: column names or expressions to group by

81

82

Returns:

83

GroupedTable for aggregation

84

"""

85

86

def aggregate(**kwargs):

87

"""

88

Aggregate entire table or grouped table.

89

90

Parameters:

91

- **kwargs: name=expression pairs for aggregations

92

93

Returns:

94

Table with aggregated results

95

"""

96

```

97

98

**Usage Examples:**

99

```python

100

# Group by single column

101

result = table.group_by('department').aggregate(

102

avg_salary=table.salary.mean(),

103

count=table.count()

104

)

105

106

# Group by multiple columns

107

result = table.group_by('department', 'level').aggregate(

108

total_salary=table.salary.sum(),

109

employee_count=table.count()

110

)

111

112

# Aggregate without grouping

113

result = table.aggregate(

114

total_employees=table.count(),

115

avg_age=table.age.mean()

116

)

117

```

118

119

### Sorting

120

121

Sort table rows by one or more expressions.

122

123

```python { .api }

124

def order_by(*exprs):

125

"""

126

Sort table by expressions.

127

128

Parameters:

129

- *exprs: column names, expressions, or sort keys

130

131

Returns:

132

Sorted table

133

"""

134

135

def asc(expr):

136

"""Create ascending sort key."""

137

138

def desc(expr):

139

"""Create descending sort key."""

140

```

141

142

**Usage Examples:**

143

```python

144

# Sort by single column

145

result = table.order_by('name')

146

147

# Sort by multiple columns

148

result = table.order_by('department', 'salary')

149

150

# Explicit sort direction

151

result = table.order_by(

152

ibis.asc('department'),

153

ibis.desc('salary')

154

)

155

```

156

157

### Row Limiting

158

159

Limit the number of rows returned.

160

161

```python { .api }

162

def limit(n, offset=0):

163

"""

164

Limit number of rows.

165

166

Parameters:

167

- n: int, maximum number of rows

168

- offset: int, number of rows to skip

169

170

Returns:

171

Table with limited rows

172

"""

173

174

def head(n=5):

175

"""Return first n rows."""

176

177

def tail(n=5):

178

"""Return last n rows (requires ordering)."""

179

```

180

181

**Usage Examples:**

182

```python

183

# Top 10 rows

184

result = table.limit(10)

185

186

# Skip first 20, take next 10

187

result = table.limit(10, offset=20)

188

189

# Convenient methods

190

top_5 = table.head(5)

191

bottom_5 = table.order_by('salary').tail(5)

192

```

193

194

### Joins

195

196

Join tables using various join types and conditions.

197

198

```python { .api }

199

def join(other, predicates=None, how='inner'):

200

"""

201

Join with another table.

202

203

Parameters:

204

- other: Table to join with

205

- predicates: join conditions or None for cross join

206

- how: join type ('inner', 'left', 'right', 'outer', 'semi', 'anti')

207

208

Returns:

209

Joined table

210

"""

211

212

def cross_join(other):

213

"""Cross join (Cartesian product)."""

214

215

def asof_join(other, predicates, tolerance=None):

216

"""As-of join for time-series data."""

217

```

218

219

**Usage Examples:**

220

```python

221

# Inner join

222

result = employees.join(

223

departments,

224

employees.dept_id == departments.id

225

)

226

227

# Left join with multiple conditions

228

result = employees.join(

229

departments,

230

[employees.dept_id == departments.id,

231

employees.location == departments.location],

232

how='left'

233

)

234

235

# Cross join

236

result = table1.cross_join(table2)

237

```

238

239

### Set Operations

240

241

Combine tables using set operations.

242

243

```python { .api }

244

def union(table, /, *rest, distinct=False):

245

"""

246

Compute multiset (or set) union of multiple tables.

247

248

Parameters:

249

- table: Table to union with (positional-only)

250

- *rest: Additional tables to union

251

- distinct: bool, use set union (True) or multiset union (False, default)

252

253

Returns:

254

Union of all tables

255

"""

256

257

def intersect(other):

258

"""Intersection with another table."""

259

260

def difference(other):

261

"""Difference from another table."""

262

```

263

264

**Usage Examples:**

265

```python

266

# Union tables (multiset - keeps duplicates by default)

267

combined = table1.union(table2)

268

269

# Union with distinct values only

270

combined = table1.union(table2, distinct=True)

271

272

# Set operations

273

common = table1.intersect(table2)

274

unique_to_first = table1.difference(table2)

275

```

276

277

### Column Operations

278

279

Add, drop, and rename columns.

280

281

```python { .api }

282

def mutate(**kwargs):

283

"""

284

Add or modify columns.

285

286

Parameters:

287

- **kwargs: name=expression pairs

288

289

Returns:

290

Table with new/modified columns

291

"""

292

293

def drop(*columns):

294

"""

295

Drop columns.

296

297

Parameters:

298

- *columns: column names to drop

299

300

Returns:

301

Table without specified columns

302

"""

303

304

def rename(**kwargs):

305

"""

306

Rename columns.

307

308

Parameters:

309

- **kwargs: old_name=new_name pairs

310

311

Returns:

312

Table with renamed columns

313

"""

314

```

315

316

**Usage Examples:**

317

```python

318

# Add computed columns

319

result = table.mutate(

320

age_next_year=table.age + 1,

321

full_name=table.first_name + ' ' + table.last_name

322

)

323

324

# Drop columns

325

result = table.drop('temp_column', 'unused_field')

326

327

# Rename columns

328

result = table.rename(old_name='new_name', id='employee_id')

329

```

330

331

### Distinct Values

332

333

Remove duplicate rows or get unique values.

334

335

```python { .api }

336

def distinct(*exprs):

337

"""

338

Get distinct rows or distinct values for expressions.

339

340

Parameters:

341

- *exprs: expressions to consider for distinctness (all columns if empty)

342

343

Returns:

344

Table with distinct rows

345

"""

346

```

347

348

**Usage Examples:**

349

```python

350

# Distinct rows

351

unique_rows = table.distinct()

352

353

# Distinct values for specific columns

354

unique_combinations = table.distinct('department', 'level')

355

```

356

357

### Sampling

358

359

Sample random rows from the table.

360

361

```python { .api }

362

def sample(fraction=None, method='row'):

363

"""

364

Sample rows from table.

365

366

Parameters:

367

- fraction: float, fraction of rows to sample (0.0 to 1.0)

368

- method: str, sampling method

369

370

Returns:

371

Sampled table

372

"""

373

```

374

375

**Usage Example:**

376

```python

377

# Sample 10% of rows

378

sample = table.sample(0.1)

379

```