or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

Files

docs

core-parsing.mddialects.mdexecution.mdexpression-building.mdindex.mdoptimization.mdschema.mdutilities.md

index.mddocs/

0

# SQLGlot

1

2

A comprehensive no-dependency SQL parser, transpiler, optimizer, and engine written in Python. SQLGlot supports translation between 30+ different SQL dialects including DuckDB, Presto/Trino, Spark/Databricks, Snowflake, and BigQuery, providing robust SQL parsing capabilities with syntax error detection, comprehensive formatting and optimization features, AST manipulation tools for programmatic SQL building, and a built-in SQL execution engine.

3

4

## Package Information

5

6

- **Package Name**: sqlglot

7

- **Language**: Python

8

- **Installation**: `pip install sqlglot` or `pip install "sqlglot[rs]"` (with Rust tokenizer for better performance)

9

10

## Core Imports

11

12

```python

13

import sqlglot

14

```

15

16

Common for parsing and transpilation:

17

18

```python

19

from sqlglot import parse, parse_one, transpile, tokenize

20

```

21

22

Expression building:

23

24

```python

25

from sqlglot import select, column, table_, func, and_, or_, case

26

```

27

28

Advanced features:

29

30

```python

31

from sqlglot import Expression, Dialect, Parser, Generator

32

from sqlglot.optimizer import optimize

33

from sqlglot.executor import execute

34

from sqlglot.schema import MappingSchema

35

```

36

37

## Basic Usage

38

39

```python

40

import sqlglot

41

42

# Parse SQL into abstract syntax tree

43

sql = "SELECT user_id, COUNT(*) FROM users WHERE age > 21 GROUP BY user_id"

44

parsed = sqlglot.parse_one(sql)

45

46

# Transpile between SQL dialects

47

spark_sql = "SELECT DATE_ADD(CURRENT_DATE(), 7)"

48

postgres_sql = sqlglot.transpile(spark_sql, read="spark", write="postgres")[0]

49

# Result: "SELECT (CURRENT_DATE + INTERVAL '7' DAY)"

50

51

# Build SQL programmatically

52

query = (

53

sqlglot.select("user_id", sqlglot.func("COUNT", "*").as_("total"))

54

.from_("users")

55

.where(sqlglot.column("age") > 21)

56

.group_by("user_id")

57

)

58

print(query.sql()) # SELECT user_id, COUNT(*) AS total FROM users WHERE age > 21 GROUP BY user_id

59

60

# Format SQL with pretty printing

61

formatted = sqlglot.transpile(sql, pretty=True)[0]

62

```

63

64

## Architecture

65

66

SQLGlot's architecture is built around four core components:

67

68

- **Tokenizer**: Lexical analysis that converts SQL text into tokens, with support for dialect-specific keywords and syntax

69

- **Parser**: Recursive descent parser that builds Abstract Syntax Trees (ASTs) from tokens, handling dialect-specific grammar rules

70

- **Generator**: Converts ASTs back into SQL text with dialect-specific formatting and syntax rules

71

- **Expression System**: Rich object model representing all SQL constructs as Python objects with manipulation methods

72

73

This design enables SQLGlot to serve as both a high-level transpilation tool and a low-level SQL manipulation library, supporting everything from simple format conversion to complex query optimization and analysis.

74

75

## Capabilities

76

77

### Core Parsing and Transpilation

78

79

Essential SQL parsing and dialect translation functionality. Parse SQL strings into abstract syntax trees, transpile between 30+ SQL dialects, and tokenize SQL for lexical analysis.

80

81

```python { .api }

82

def parse(sql: str, read: str = None, dialect: str = None, **opts) -> List[Optional[Expression]]: ...

83

def parse_one(sql: str, read: str = None, dialect: str = None, into: Optional[Type] = None, **opts) -> Expression: ...

84

def transpile(sql: str, read: str = None, write: str = None, identity: bool = True, error_level: Optional[ErrorLevel] = None, **opts) -> List[str]: ...

85

def tokenize(sql: str, read: str = None, dialect: str = None) -> List[Token]: ...

86

```

87

88

[Core Parsing and Transpilation](./core-parsing.md)

89

90

### Expression Building and AST Manipulation

91

92

Programmatic SQL construction using builder functions and direct AST manipulation. Create complex SQL queries through code, modify existing parsed queries, and traverse expression trees.

93

94

```python { .api }

95

def select(*expressions) -> Select: ...

96

def column(col: str) -> Column: ...

97

def table_(name: str) -> Table: ...

98

def func(name: str, *args) -> Function: ...

99

def and_(*conditions) -> And: ...

100

def or_(*conditions) -> Or: ...

101

def case() -> Case: ...

102

def cast(expression, to: str) -> Cast: ...

103

```

104

105

[Expression Building](./expression-building.md)

106

107

### SQL Dialects

108

109

Support for 30+ SQL dialects with dialect-specific parsing, generation, and transformation rules. Each dialect handles unique syntax, functions, and data types.

110

111

```python { .api }

112

class Dialect:

113

def parse(self, sql: str, **opts) -> List[Optional[Expression]]: ...

114

def generate(self, expression: Expression, **opts) -> str: ...

115

def tokenize(self, sql: str) -> List[Token]: ...

116

117

# Available dialects

118

class BigQuery(Dialect): ...

119

class Snowflake(Dialect): ...

120

class Spark(Dialect): ...

121

class DuckDB(Dialect): ...

122

# ... and 26 more

123

```

124

125

[SQL Dialects](./dialects.md)

126

127

### SQL Optimization

128

129

Comprehensive query optimization engine with 15+ optimization rules. Optimize SQL for performance, simplify expressions, and analyze query structure.

130

131

```python { .api }

132

def optimize(sql: str | Expression, schema: Optional[Schema] = None, **opts) -> Expression: ...

133

134

# Optimization rules and components

135

def annotate_types(expression: Expression, **opts) -> Expression: ...

136

def qualify_columns(expression: Expression, schema: Schema, **opts) -> Expression: ...

137

def pushdown_predicates(expression: Expression, **opts) -> Expression: ...

138

def eliminate_subqueries(expression: Expression, **opts) -> Expression: ...

139

```

140

141

[SQL Optimization](./optimization.md)

142

143

### SQL Execution Engine

144

145

Built-in SQL execution engine for running queries against in-memory data structures. Execute SQL against Python data, perform joins and aggregations, and integrate with pandas DataFrames.

146

147

```python { .api }

148

def execute(sql: str | Expression, schema: Optional[Schema] = None, read: str = None, dialect: str = None, tables: Optional[Dict] = None) -> Table: ...

149

150

class Table:

151

def __init__(self, columns: List[str], rows: List[List]): ...

152

def to_dict(self) -> Dict: ...

153

```

154

155

[SQL Execution](./execution.md)

156

157

### Schema Management

158

159

Database schema representation and validation for accurate parsing and optimization. Define table structures, column types, and relationships.

160

161

```python { .api }

162

class Schema:

163

def add_table(self, table: str, column_mapping: Dict = None, **opts) -> None: ...

164

def column_names(self, table: str, **opts) -> List[str]: ...

165

def get_column_type(self, table: str, column: str, **opts) -> Optional[str]: ...

166

167

class MappingSchema(Schema):

168

def __init__(self, schema: Dict = None, **opts): ...

169

```

170

171

[Schema Management](./schema.md)

172

173

### Utility Functions

174

175

Additional tools for SQL analysis, comparison, and manipulation including AST diffing, column lineage analysis, and time parsing utilities.

176

177

```python { .api }

178

def diff(source: Expression, target: Expression, **opts) -> str: ...

179

180

# Lineage analysis

181

def lineage(sql: str, schema: Optional[Schema] = None, **opts) -> Dict: ...

182

183

# Error classes

184

class ParseError(Exception): ...

185

class UnsupportedError(Exception): ...

186

class ErrorLevel: ...

187

```

188

189

[Utilities](./utilities.md)

190

191

## Types

192

193

```python { .api }

194

class Expression:

195

"""Base class for all SQL expressions with AST manipulation methods."""

196

def sql(self, dialect: str = None, **opts) -> str: ...

197

def transform(self, fun: Callable, **opts) -> Expression: ...

198

def find(self, expression_type: Type, **opts) -> Optional[Expression]: ...

199

def find_all(self, expression_type: Type, **opts) -> List[Expression]: ...

200

def replace(self, expression: Expression, **opts) -> Expression: ...

201

202

class Token:

203

"""Represents a lexical token from SQL tokenization."""

204

def __init__(self, token_type: TokenType, text: str, line: int = 1, col: int = 1): ...

205

206

class ErrorLevel:

207

"""Error handling levels for parsing and validation."""

208

IGNORE: str

209

WARN: str

210

RAISE: str

211

IMMEDIATE: str

212

```