or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

Files

docs

core-parsing.mddialects.mdexecution.mdexpression-building.mdindex.mdoptimization.mdschema.mdutilities.md

optimization.mddocs/

0

# SQL Optimization

1

2

Comprehensive query optimization engine with 15+ optimization rules for improving query performance, simplifying expressions, and analyzing query structure. The optimizer transforms SQL expressions to equivalent but more efficient forms.

3

4

## Capabilities

5

6

### Main Optimization Engine

7

8

Core optimization functionality that applies multiple optimization rules to SQL expressions.

9

10

```python { .api }

11

def optimize(

12

sql: str | Expression,

13

schema: Optional[Schema] = None,

14

dialect: str = None,

15

**opts

16

) -> Expression:

17

"""

18

Applies optimization rules to SQL expressions for improved performance.

19

20

Args:

21

sql: SQL string or Expression to optimize

22

schema (Schema): Database schema for column/table resolution

23

dialect (str): SQL dialect for parsing and optimization

24

**opts: Additional optimization options

25

26

Returns:

27

Expression: Optimized expression tree

28

"""

29

30

RULES: List[Callable] = [

31

# List of available optimization rules

32

# Applied in sequence for maximum effectiveness

33

]

34

```

35

36

### Type Annotation and Analysis

37

38

Add type information to expressions for better optimization and validation.

39

40

```python { .api }

41

def annotate_types(

42

expression: Expression,

43

schema: Optional[Schema] = None,

44

**opts

45

) -> Expression:

46

"""

47

Annotates expressions with type information from schema.

48

49

Args:

50

expression (Expression): Expression to annotate

51

schema (Schema): Schema providing type information

52

**opts: Type annotation options

53

54

Returns:

55

Expression: Expression with type annotations

56

"""

57

```

58

59

### Column and Table Qualification

60

61

Resolve column and table references using schema information.

62

63

```python { .api }

64

def qualify_columns(

65

expression: Expression,

66

schema: Schema,

67

**opts

68

) -> Expression:

69

"""

70

Qualifies column references with table names using schema.

71

72

Args:

73

expression (Expression): Expression to qualify

74

schema (Schema): Schema for column resolution

75

**opts: Qualification options

76

77

Returns:

78

Expression: Expression with qualified column references

79

"""

80

81

def qualify_tables(

82

expression: Expression,

83

schema: Optional[Schema] = None,

84

**opts

85

) -> Expression:

86

"""

87

Qualifies table references with database/schema names.

88

89

Args:

90

expression (Expression): Expression to qualify

91

schema (Schema): Schema for table resolution

92

**opts: Qualification options

93

94

Returns:

95

Expression: Expression with qualified table references

96

"""

97

```

98

99

### Predicate Optimization

100

101

Optimize WHERE clause conditions for better performance.

102

103

```python { .api }

104

def pushdown_predicates(expression: Expression, **opts) -> Expression:

105

"""

106

Pushes WHERE predicates down to lowest possible level in query tree.

107

108

Args:

109

expression (Expression): Expression to optimize

110

**opts: Pushdown options

111

112

Returns:

113

Expression: Expression with optimized predicate placement

114

"""

115

116

def pushdown_projections(expression: Expression, **opts) -> Expression:

117

"""

118

Pushes SELECT projections down to reduce data movement.

119

120

Args:

121

expression (Expression): Expression to optimize

122

**opts: Projection pushdown options

123

124

Returns:

125

Expression: Expression with optimized projections

126

"""

127

```

128

129

### Subquery Optimization

130

131

Eliminate and optimize subqueries for better performance.

132

133

```python { .api }

134

def eliminate_subqueries(expression: Expression, **opts) -> Expression:

135

"""

136

Eliminates unnecessary subqueries by flattening or merging.

137

138

Args:

139

expression (Expression): Expression to optimize

140

**opts: Subquery elimination options

141

142

Returns:

143

Expression: Expression with eliminated subqueries

144

"""

145

146

def unnest_subqueries(expression: Expression, **opts) -> Expression:

147

"""

148

Unnests subqueries where possible for performance.

149

150

Args:

151

expression (Expression): Expression to optimize

152

**opts: Unnesting options

153

154

Returns:

155

Expression: Expression with unnested subqueries

156

"""

157

158

def merge_subqueries(expression: Expression, **opts) -> Expression:

159

"""

160

Merges compatible subqueries to reduce query complexity.

161

162

Args:

163

expression (Expression): Expression to optimize

164

**opts: Merge options

165

166

Returns:

167

Expression: Expression with merged subqueries

168

"""

169

```

170

171

### Join Optimization

172

173

Optimize JOIN operations and eliminate unnecessary joins.

174

175

```python { .api }

176

def optimize_joins(expression: Expression, **opts) -> Expression:

177

"""

178

Optimizes JOIN conditions and order for better performance.

179

180

Args:

181

expression (Expression): Expression to optimize

182

**opts: Join optimization options

183

184

Returns:

185

Expression: Expression with optimized joins

186

"""

187

188

def eliminate_joins(expression: Expression, **opts) -> Expression:

189

"""

190

Eliminates unnecessary JOINs that don't affect results.

191

192

Args:

193

expression (Expression): Expression to optimize

194

**opts: Join elimination options

195

196

Returns:

197

Expression: Expression with eliminated joins

198

"""

199

```

200

201

### CTE and Expression Simplification

202

203

Optimize Common Table Expressions and simplify complex expressions.

204

205

```python { .api }

206

def eliminate_ctes(expression: Expression, **opts) -> Expression:

207

"""

208

Eliminates unnecessary Common Table Expressions.

209

210

Args:

211

expression (Expression): Expression to optimize

212

**opts: CTE elimination options

213

214

Returns:

215

Expression: Expression with eliminated CTEs

216

"""

217

218

def simplify(expression: Expression, **opts) -> Expression:

219

"""

220

Simplifies expressions by applying algebraic rules.

221

222

Args:

223

expression (Expression): Expression to simplify

224

**opts: Simplification options

225

226

Returns:

227

Expression: Simplified expression

228

"""

229

230

def canonicalize(expression: Expression, **opts) -> Expression:

231

"""

232

Canonicalizes expressions to standard form.

233

234

Args:

235

expression (Expression): Expression to canonicalize

236

**opts: Canonicalization options

237

238

Returns:

239

Expression: Canonicalized expression

240

"""

241

```

242

243

### Schema and Identifier Normalization

244

245

Normalize identifiers and schema references for consistency.

246

247

```python { .api }

248

def normalize(expression: Expression, **opts) -> Expression:

249

"""

250

Normalizes expressions to consistent format.

251

252

Args:

253

expression (Expression): Expression to normalize

254

**opts: Normalization options

255

256

Returns:

257

Expression: Normalized expression

258

"""

259

260

def normalize_identifiers(expression: Expression, **opts) -> Expression:

261

"""

262

Normalizes identifier casing and quoting.

263

264

Args:

265

expression (Expression): Expression to normalize

266

**opts: Identifier normalization options

267

268

Returns:

269

Expression: Expression with normalized identifiers

270

"""

271

272

def isolate_table_selects(expression: Expression, **opts) -> Expression:

273

"""

274

Isolates table SELECT expressions for optimization.

275

276

Args:

277

expression (Expression): Expression to isolate

278

**opts: Isolation options

279

280

Returns:

281

Expression: Expression with isolated table selects

282

"""

283

```

284

285

### Scope Analysis

286

287

Analyze and traverse expression scopes for column and table resolution.

288

289

```python { .api }

290

class Scope:

291

"""Represents a scope in SQL expression for variable resolution."""

292

293

def __init__(self, expression: Expression, sources: Dict = None): ...

294

295

@property

296

def columns(self) -> List[Column]:

297

"""Available columns in this scope."""

298

299

@property

300

def tables(self) -> List[Table]:

301

"""Available tables in this scope."""

302

303

def build_scope(expression: Expression, **opts) -> Scope:

304

"""

305

Builds scope tree for expression with column/table resolution.

306

307

Args:

308

expression (Expression): Expression to analyze

309

**opts: Scope building options

310

311

Returns:

312

Scope: Root scope of expression tree

313

"""

314

315

def traverse_scope(scope: Scope, **opts) -> Iterator[Scope]:

316

"""

317

Traverses scope hierarchy depth-first.

318

319

Args:

320

scope (Scope): Root scope to traverse

321

**opts: Traversal options

322

323

Yields:

324

Scope: Each scope in traversal order

325

"""

326

327

def find_in_scope(scope: Scope, expression_type: Type, **opts) -> Optional[Expression]:

328

"""

329

Finds first expression of type in scope.

330

331

Args:

332

scope (Scope): Scope to search

333

expression_type (Type): Expression type to find

334

**opts: Search options

335

336

Returns:

337

Optional[Expression]: Found expression or None

338

"""

339

340

def find_all_in_scope(scope: Scope, expression_type: Type, **opts) -> List[Expression]:

341

"""

342

Finds all expressions of type in scope.

343

344

Args:

345

scope (Scope): Scope to search

346

expression_type (Type): Expression type to find

347

**opts: Search options

348

349

Returns:

350

List[Expression]: List of found expressions

351

"""

352

353

def walk_in_scope(scope: Scope, **opts) -> Iterator[Expression]:

354

"""

355

Walks all expressions within scope.

356

357

Args:

358

scope (Scope): Scope to walk

359

**opts: Walk options

360

361

Yields:

362

Expression: Each expression in scope

363

"""

364

```

365

366

## Usage Examples

367

368

### Basic Optimization

369

370

```python

371

import sqlglot

372

from sqlglot.optimizer import optimize

373

from sqlglot.schema import MappingSchema

374

375

# Define schema for optimization

376

schema = MappingSchema({

377

"users": {

378

"id": "INT",

379

"name": "VARCHAR",

380

"email": "VARCHAR",

381

"created_date": "DATE"

382

},

383

"orders": {

384

"id": "INT",

385

"user_id": "INT",

386

"amount": "DECIMAL",

387

"order_date": "DATE"

388

}

389

})

390

391

# Optimize complex query

392

sql = """

393

SELECT u.name, COUNT(o.id) as order_count

394

FROM users u

395

LEFT JOIN orders o ON u.id = o.user_id

396

WHERE u.created_date > '2023-01-01'

397

AND (SELECT COUNT(*) FROM orders WHERE user_id = u.id) > 5

398

GROUP BY u.name

399

"""

400

401

optimized = optimize(sql, schema=schema, dialect="postgres")

402

print(optimized.sql(pretty=True))

403

```

404

405

### Individual Optimization Rules

406

407

```python

408

import sqlglot

409

from sqlglot.optimizer import (

410

annotate_types, qualify_columns, pushdown_predicates,

411

eliminate_subqueries, simplify

412

)

413

from sqlglot.schema import MappingSchema

414

415

# Apply specific optimization rules

416

expression = sqlglot.parse_one("""

417

SELECT name FROM users

418

WHERE age > (SELECT AVG(age) FROM users) + 10 - 5

419

""")

420

421

schema = MappingSchema({"users": {"name": "VARCHAR", "age": "INT"}})

422

423

# Step-by-step optimization

424

expression = annotate_types(expression, schema=schema)

425

expression = qualify_columns(expression, schema=schema)

426

expression = pushdown_predicates(expression)

427

expression = eliminate_subqueries(expression)

428

expression = simplify(expression)

429

430

print(expression.sql(pretty=True))

431

```

432

433

### Scope Analysis

434

435

```python

436

import sqlglot

437

from sqlglot.optimizer import build_scope, traverse_scope

438

439

# Analyze query scope

440

sql = """

441

SELECT u.name, o.amount

442

FROM users u

443

JOIN (

444

SELECT user_id, amount

445

FROM orders

446

WHERE amount > 100

447

) o ON u.id = o.user_id

448

"""

449

450

expression = sqlglot.parse_one(sql)

451

root_scope = build_scope(expression)

452

453

# Traverse scopes

454

for scope in traverse_scope(root_scope):

455

print(f"Scope columns: {[c.name for c in scope.columns]}")

456

print(f"Scope tables: {[t.name for t in scope.tables]}")

457

```

458

459

### Custom Optimization Pipeline

460

461

```python

462

import sqlglot

463

from sqlglot.optimizer import (

464

annotate_types, qualify_columns, qualify_tables,

465

pushdown_predicates, eliminate_subqueries, optimize_joins,

466

simplify, canonicalize

467

)

468

469

def custom_optimize(expression, schema=None):

470

"""Custom optimization pipeline with specific rules."""

471

472

# Core qualification and type annotation

473

if schema:

474

expression = annotate_types(expression, schema=schema)

475

expression = qualify_tables(expression, schema=schema)

476

expression = qualify_columns(expression, schema=schema)

477

478

# Predicate and projection optimization

479

expression = pushdown_predicates(expression)

480

481

# Subquery optimization

482

expression = eliminate_subqueries(expression)

483

484

# Join optimization

485

expression = optimize_joins(expression)

486

487

# Expression simplification

488

expression = simplify(expression)

489

expression = canonicalize(expression)

490

491

return expression

492

493

# Use custom optimizer

494

sql = "SELECT * FROM users WHERE age > 25 AND status = 'active'"

495

expression = sqlglot.parse_one(sql)

496

optimized = custom_optimize(expression)

497

print(optimized.sql())

498

```

499

500

## Types

501

502

```python { .api }

503

class Scope:

504

"""Query scope for column and table resolution."""

505

506

expression: Expression # Root expression of scope

507

sources: Dict # Available data sources

508

509

@property

510

def columns(self) -> List[Column]:

511

"""Columns available in this scope."""

512

513

@property

514

def tables(self) -> List[Table]:

515

"""Tables available in this scope."""

516

517

def find(self, expression_type: Type) -> Optional[Expression]:

518

"""Find expression of specific type in scope."""

519

520

def find_all(self, expression_type: Type) -> List[Expression]:

521

"""Find all expressions of specific type in scope."""

522

523

# Optimization rule type

524

OptimizationRule = Callable[[Expression], Expression]

525

526

# Available optimization rules

527

RULES: List[OptimizationRule] = [

528

annotate_types,

529

qualify_columns,

530

qualify_tables,

531

pushdown_predicates,

532

pushdown_projections,

533

eliminate_subqueries,

534

unnest_subqueries,

535

merge_subqueries,

536

optimize_joins,

537

eliminate_joins,

538

eliminate_ctes,

539

simplify,

540

canonicalize,

541

normalize,

542

normalize_identifiers,

543

isolate_table_selects

544

]

545

```