# Core Parsing and Transpilation

Essential SQL parsing and dialect translation functionality that forms the foundation of SQLGlot. These functions handle tokenization, parsing SQL into abstract syntax trees, and transpiling between different SQL dialects.

## Capabilities

### SQL Parsing

Parse SQL strings into abstract syntax trees (ASTs) for analysis and manipulation. Supports parsing multiple statements and handling various SQL dialects.

```python { .api }
def parse(sql: str, read: str = None, dialect: str = None, **opts) -> List[Optional[Expression]]:
    """
    Parses SQL string into collection of syntax trees, one per statement.

    Args:
        sql (str): SQL code string to parse
        read (str): SQL dialect for parsing (e.g., "spark", "hive", "presto", "mysql")
        dialect (str): SQL dialect (alias for read)
        **opts: Additional parser options

    Returns:
        List[Optional[Expression]]: Collection of parsed expression trees
    """
```

### Single Statement Parsing

Parse a single SQL statement into an expression tree. This is the most commonly used parsing function when working with a single query.

```python { .api }
def parse_one(sql: str, read: str = None, dialect: str = None, into: Optional[Type] = None, **opts) -> Expression:
    """
    Parses SQL string and returns syntax tree for the first statement.

    Args:
        sql (str): SQL code string to parse
        read (str): SQL dialect for parsing
        dialect (str): SQL dialect (alias for read)
        into (Type): Specific SQLGlot Expression type to parse into
        **opts: Additional parser options

    Returns:
        Expression: Syntax tree for the first parsed statement

    Raises:
        ParseError: If no valid expression could be parsed
    """
```

### SQL Transpilation

Convert SQL between different dialects while preserving semantic meaning. Handles dialect-specific syntax, functions, and data types.

```python { .api }
def transpile(
    sql: str,
    read: str = None,
    write: str = None,
    identity: bool = True,
    error_level: Optional[ErrorLevel] = None,
    **opts
) -> List[str]:
    """
    Transpiles SQL from source dialect to target dialect.

    Args:
        sql (str): SQL code string to transpile
        read (str): Source dialect (e.g., "spark", "hive", "presto", "mysql")
        write (str): Target dialect (e.g., "postgres", "bigquery", "snowflake")
        identity (bool): Use source dialect as target if write not specified
        error_level (ErrorLevel): Desired error handling level
        **opts: Additional generator options for output formatting

    Returns:
        List[str]: List of transpiled SQL statements
    """
```

### SQL Tokenization

Break SQL strings into lexical tokens for low-level analysis and custom processing.

```python { .api }
def tokenize(sql: str, read: str = None, dialect: str = None) -> List[Token]:
    """
    Tokenizes SQL string into list of lexical tokens.

    Args:
        sql (str): SQL code string to tokenize
        read (str): SQL dialect for tokenization
        dialect (str): SQL dialect (alias for read)

    Returns:
        List[Token]: List of tokens representing the SQL input
    """
```

### Utility Functions

Additional parsing utilities for expression handling and analysis.

```python { .api }
def maybe_parse(sql: str | Expression, **opts) -> Expression:
    """
    Parses SQL string or returns Expression if already parsed.

    Args:
        sql: SQL string or Expression object
        **opts: Parse options if parsing needed

    Returns:
        Expression: Parsed or existing expression
    """

def diff(source: Expression, target: Expression, **opts) -> List[Edit]:
    """
    Compares two SQL expressions and returns the edits that turn source into target.

    Args:
        source (Expression): Source expression to compare
        target (Expression): Target expression to compare against
        **opts: Additional diff options

    Returns:
        List[Edit]: Edit operations (such as Insert, Remove, Move, Update, Keep)
            describing the differences between the expressions
    """
```

## Usage Examples

### Basic Parsing

```python
import sqlglot

# Parse a simple SELECT statement
sql = "SELECT name, age FROM users WHERE age > 25"
expression = sqlglot.parse_one(sql)

# Parse with specific dialect
spark_sql = "SELECT explode(array_col) FROM table"
expression = sqlglot.parse_one(spark_sql, dialect="spark")

# Parse multiple statements
multi_sql = "SELECT 1; SELECT 2; SELECT 3;"
expressions = sqlglot.parse(multi_sql)
```

### Dialect Transpilation

```python
import sqlglot

# Convert Spark SQL to PostgreSQL
spark_query = "SELECT DATE_ADD(current_date(), 7) as future_date"
postgres_query = sqlglot.transpile(spark_query, read="spark", write="postgres")[0]
# Result (approximate; the exact form depends on the SQLGlot version):
# "SELECT (CURRENT_DATE + INTERVAL '7' DAY) AS future_date"

# Convert BigQuery to Snowflake
bq_query = "SELECT EXTRACT(YEAR FROM date_col) FROM table"
sf_query = sqlglot.transpile(bq_query, read="bigquery", write="snowflake")[0]

# Format SQL with pretty printing
formatted = sqlglot.transpile(
    "SELECT a,b,c FROM table WHERE x=1 AND y=2",
    pretty=True
)[0]
```

### Working with Tokens

```python
import sqlglot

sql = "SELECT * FROM users"
tokens = sqlglot.tokenize(sql)

for token in tokens:
    print(f"{token.token_type}: {token.text}")
# Output (unquoted names are tokenized as VAR, not IDENTIFIER):
# TokenType.SELECT: SELECT
# TokenType.STAR: *
# TokenType.FROM: FROM
# TokenType.VAR: users
```

### Error Handling

```python
import sqlglot
from sqlglot import ParseError, ErrorLevel

# Handle parsing errors
try:
    expression = sqlglot.parse_one("SELECT FROM")  # Invalid SQL
except ParseError as e:
    print(f"Parse error: {e}")

# Control error level
expressions = sqlglot.parse(
    "SELECT 1; INVALID SQL; SELECT 2",
    error_level=ErrorLevel.WARN  # Log errors but continue
)
```

## Types

```python { .api }
class Token:
    """Represents a lexical token from SQL tokenization."""
    token_type: TokenType
    text: str
    line: int
    col: int

    def __init__(self, token_type: TokenType, text: str, line: int = 1, col: int = 1): ...

class TokenType:
    """Enumeration of all possible token types in SQL."""
    # Keywords
    SELECT: str
    FROM: str
    WHERE: str
    # Operators
    PLUS: str
    MINUS: str
    STAR: str
    # Literals
    STRING: str
    NUMBER: str
    # ... and many more

class ErrorLevel:
    """Error handling levels for parsing operations."""
    IGNORE: str     # Ignore all errors
    WARN: str       # Log errors but continue
    RAISE: str      # Collect errors and raise single exception
    IMMEDIATE: str  # Raise exception on first error
```