
# Input Dialects

Specialized parsing logic for different SQL database dialects, handling syntax variations and database-specific features like data types, functions, and DDL extensions. Each dialect class extends the base SQL parser with platform-specific rules and token recognition.

## Capabilities

### Base SQL Dialect

Foundation dialect class that provides common SQL DDL parsing functionality shared across all database platforms.

```python { .api }
class BaseSQL:
    """
    Base SQL dialect providing common DDL parsing functionality.
    Handles standard SQL DDL statements, data types, and constraints.
    """
```

### MySQL Dialect

MySQL-specific parsing rules including AUTO_INCREMENT, MySQL data types, storage engines, and MySQL-specific syntax extensions.

```python { .api }
class MySQL:
    """
    MySQL dialect parser handling MySQL-specific syntax including:
    - AUTO_INCREMENT columns
    - MySQL data types (TINYINT, MEDIUMINT, etc.)
    - Storage engine specifications (ENGINE=InnoDB)
    - MySQL-specific constraints and options
    """
```

### PostgreSQL Dialect

PostgreSQL-specific parsing for SERIAL types, arrays, custom data types, and PostgreSQL DDL extensions.

```python { .api }
class PSQL:
    """
    PostgreSQL dialect parser handling PostgreSQL-specific syntax including:
    - SERIAL and BIGSERIAL auto-increment types
    - Array data types (INTEGER[], VARCHAR[], etc.)
    - PostgreSQL-specific functions and operators
    - Custom domain types and enums
    """
```

### Microsoft SQL Server Dialect

MSSQL/T-SQL parsing for SQL Server data types, IDENTITY columns, and SQL Server-specific DDL syntax.

```python { .api }
class MSSQL:
    """
    Microsoft SQL Server dialect parser handling T-SQL syntax including:
    - IDENTITY columns for auto-increment
    - SQL Server data types (NVARCHAR, DATETIME2, etc.)
    - SQL Server-specific constraints and options
    - T-SQL-specific DDL extensions
    """
```

### Oracle Dialect

Oracle-specific parsing for Oracle data types, sequences, and Oracle DDL syntax variations.

```python { .api }
class Oracle:
    """
    Oracle dialect parser handling Oracle-specific syntax including:
    - Oracle data types (NUMBER, VARCHAR2, CLOB, etc.)
    - Oracle sequences and triggers
    - Oracle-specific constraints and storage options
    - Oracle DDL syntax variations
    """
```

### Hive Query Language Dialect

HQL parsing for Hadoop/Hive table definitions, partitioning, and big-data-specific DDL constructs.

```python { .api }
class HQL:
    """
    Hive Query Language dialect parser handling HQL syntax including:
    - Hive table partitioning (PARTITIONED BY)
    - Hive storage formats (STORED AS, ROW FORMAT)
    - Hive-specific data types and functions
    - External table definitions
    """
```

### Google BigQuery Dialect

BigQuery-specific parsing for BigQuery data types, clustering, partitioning, and Google Cloud SQL syntax.

```python { .api }
class BigQuery:
    """
    Google BigQuery dialect parser handling BigQuery syntax including:
    - BigQuery data types (ARRAY, STRUCT, GEOGRAPHY, etc.)
    - Table clustering and partitioning
    - BigQuery-specific functions and operators
    - Dataset and project references
    """
```

### AWS Redshift Dialect

Redshift-specific parsing for Redshift data types, distribution keys, sort keys, and AWS-specific DDL features.

```python { .api }
class Redshift:
    """
    AWS Redshift dialect parser handling Redshift syntax including:
    - Redshift data types and encodings
    - Distribution keys (DISTKEY) and sort keys (SORTKEY)
    - Redshift-specific table properties
    - Redshift compression and storage options
    """
```

### Snowflake Dialect

Snowflake-specific parsing for Snowflake data types, clustering keys, and Snowflake DDL syntax.

```python { .api }
class Snowflake:
    """
    Snowflake dialect parser handling Snowflake syntax including:
    - Snowflake data types (VARIANT, OBJECT, ARRAY)
    - Clustering keys and micro-partitions
    - Snowflake-specific table properties
    - Time travel and data sharing syntax
    """
```

### Apache Spark SQL Dialect

Spark SQL parsing for Spark-specific data types, partitioning, and distributed table definitions.

```python { .api }
class SparkSQL:
    """
    Apache Spark SQL dialect parser handling Spark syntax including:
    - Spark SQL data types and functions
    - Table partitioning and bucketing
    - Spark-specific storage formats
    - Delta Lake and Iceberg table syntax
    """
```

### IBM DB2 Dialect

DB2-specific parsing for DB2 data types, tablespaces, and IBM-specific DDL syntax features.

```python { .api }
class IBMDb2:
    """
    IBM DB2 dialect parser handling DB2 syntax including:
    - DB2 data types and functions
    - Tablespace definitions
    - DB2-specific constraints and options
    - IBM-specific DDL extensions
    """
```

### AWS Athena Dialect

Athena-specific parsing for Athena/Presto SQL syntax, external tables, and AWS Glue catalog integration.

```python { .api }
class Athena:
    """
    AWS Athena dialect parser handling Athena syntax including:
    - Athena/Presto data types and functions
    - External table definitions with S3 locations
    - Partition projection and storage formats
    - AWS Glue catalog integration syntax
    """
```

## Usage Examples

### Automatic Dialect Detection

The parser automatically applies appropriate dialect rules based on DDL syntax:

```python
from simple_ddl_parser import DDLParser

# MySQL-specific syntax is automatically recognized
mysql_ddl = """
CREATE TABLE users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL
) ENGINE=InnoDB;
"""

parser = DDLParser(mysql_ddl)
result = parser.run()
# Parser automatically uses MySQL dialect rules
```

### PostgreSQL Array Types

```python
# PostgreSQL arrays are parsed correctly
postgres_ddl = """
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    tags VARCHAR(50)[],
    prices DECIMAL(10,2)[]
);
"""

parser = DDLParser(postgres_ddl)
result = parser.run()
# Array types are properly recognized and parsed
```

### BigQuery Nested Types

```python
# BigQuery STRUCT and ARRAY types
bigquery_ddl = """
CREATE TABLE analytics.events (
    event_id STRING,
    user_data STRUCT<
        name STRING,
        age INT64,
        preferences ARRAY<STRING>
    >
);
"""

parser = DDLParser(bigquery_ddl)
result = parser.run()
# Complex nested types are parsed with full structure
```

## Dialect-Specific Features

Each dialect parser recognizes and handles:

- **Data Types**: Platform-specific data types and their variations
- **Auto-increment**: Different auto-increment syntax (AUTO_INCREMENT, SERIAL, IDENTITY)
- **Constraints**: Platform-specific constraint syntax and options
- **Storage Options**: Engine specifications, tablespaces, compression
- **Functions**: Database-specific functions in defaults and computed columns
- **Extensions**: Platform-specific DDL extensions and syntax variations