# Input Dialects

Specialized parsing logic for different SQL database dialects, handling syntax variations and database-specific features like data types, functions, and DDL extensions. Each dialect class extends the base SQL parser with platform-specific rules and token recognition.

## Capabilities

### Base SQL Dialect

Foundation dialect class that provides common SQL DDL parsing functionality shared across all database platforms.

```python { .api }
class BaseSQL:
    """
    Base SQL dialect providing common DDL parsing functionality.
    Handles standard SQL DDL statements, data types, and constraints.
    """
```

### MySQL Dialect

MySQL-specific parsing rules, including AUTO_INCREMENT, MySQL data types, storage engines, and MySQL-specific syntax extensions.

```python { .api }
class MySQL:
    """
    MySQL dialect parser handling MySQL-specific syntax including:
    - AUTO_INCREMENT columns
    - MySQL data types (TINYINT, MEDIUMINT, etc.)
    - Storage engine specifications (ENGINE=InnoDB)
    - MySQL-specific constraints and options
    """
```

### PostgreSQL Dialect

PostgreSQL-specific parsing for SERIAL types, arrays, custom data types, and PostgreSQL DDL extensions.

```python { .api }
class PSQL:
    """
    PostgreSQL dialect parser handling PostgreSQL-specific syntax including:
    - SERIAL and BIGSERIAL auto-increment types
    - Array data types (INTEGER[], VARCHAR[], etc.)
    - PostgreSQL-specific functions and operators
    - Custom domain types and enums
    """
```

### Microsoft SQL Server Dialect

MSSQL/T-SQL parsing for SQL Server data types, IDENTITY columns, and SQL Server-specific DDL syntax.

```python { .api }
class MSSQL:
    """
    Microsoft SQL Server dialect parser handling T-SQL syntax including:
    - IDENTITY columns for auto-increment
    - SQL Server data types (NVARCHAR, DATETIME2, etc.)
    - SQL Server-specific constraints and options
    - T-SQL-specific DDL extensions
    """
```

### Oracle Dialect

Oracle-specific parsing for Oracle data types, sequences, and Oracle DDL syntax variations.

```python { .api }
class Oracle:
    """
    Oracle dialect parser handling Oracle-specific syntax including:
    - Oracle data types (NUMBER, VARCHAR2, CLOB, etc.)
    - Oracle sequences and triggers
    - Oracle-specific constraints and storage options
    - Oracle DDL syntax variations
    """
```

### Hive Query Language Dialect

HQL parsing for Hadoop/Hive table definitions, partitioning, and big-data-specific DDL constructs.

```python { .api }
class HQL:
    """
    Hive Query Language dialect parser handling HQL syntax including:
    - Hive table partitioning (PARTITIONED BY)
    - Hive storage formats (STORED AS, ROW FORMAT)
    - Hive-specific data types and functions
    - External table definitions
    """
```

### Google BigQuery Dialect

BigQuery-specific parsing for BigQuery data types, clustering, partitioning, and Google Cloud-specific SQL syntax.

```python { .api }
class BigQuery:
    """
    Google BigQuery dialect parser handling BigQuery syntax including:
    - BigQuery data types (ARRAY, STRUCT, GEOGRAPHY, etc.)
    - Table clustering and partitioning
    - BigQuery-specific functions and operators
    - Dataset and project references
    """
```

### AWS Redshift Dialect

Redshift-specific parsing for Redshift data types, distribution keys, sort keys, and AWS-specific DDL features.

```python { .api }
class Redshift:
    """
    AWS Redshift dialect parser handling Redshift syntax including:
    - Redshift data types and encodings
    - Distribution keys (DISTKEY) and sort keys (SORTKEY)
    - Redshift-specific table properties
    - Redshift compression and storage options
    """
```

### Snowflake Dialect

Snowflake-specific parsing for Snowflake data types, clustering keys, and Snowflake DDL syntax.

```python { .api }
class Snowflake:
    """
    Snowflake dialect parser handling Snowflake syntax including:
    - Snowflake data types (VARIANT, OBJECT, ARRAY)
    - Clustering keys and micro-partitions
    - Snowflake-specific table properties
    - Time travel and data sharing syntax
    """
```

### Apache Spark SQL Dialect

Spark SQL parsing for Spark-specific data types, partitioning, and distributed table definitions.

```python { .api }
class SparkSQL:
    """
    Apache Spark SQL dialect parser handling Spark syntax including:
    - Spark SQL data types and functions
    - Table partitioning and bucketing
    - Spark-specific storage formats
    - Delta Lake and Iceberg table syntax
    """
```

### IBM DB2 Dialect

DB2-specific parsing for DB2 data types, tablespaces, and IBM-specific DDL syntax features.

```python { .api }
class IBMDb2:
    """
    IBM DB2 dialect parser handling DB2 syntax including:
    - DB2 data types and functions
    - Tablespace definitions
    - DB2-specific constraints and options
    - IBM-specific DDL extensions
    """
```

### AWS Athena Dialect

Athena-specific parsing for Athena/Presto SQL syntax, external tables, and AWS Glue catalog integration.

```python { .api }
class Athena:
    """
    AWS Athena dialect parser handling Athena syntax including:
    - Athena/Presto data types and functions
    - External table definitions with S3 locations
    - Partition projection and storage formats
    - AWS Glue catalog integration syntax
    """
```

## Usage Examples

### Automatic Dialect Detection

The parser automatically applies the appropriate dialect rules based on the DDL syntax it encounters:

```python
from simple_ddl_parser import DDLParser

# MySQL-specific syntax is automatically recognized
mysql_ddl = """
CREATE TABLE users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL
) ENGINE=InnoDB;
"""

parser = DDLParser(mysql_ddl)
result = parser.run()
# Parser automatically uses MySQL dialect rules
```

### PostgreSQL Array Types

```python
# PostgreSQL arrays are parsed correctly
postgres_ddl = """
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    tags VARCHAR(50)[],
    prices DECIMAL(10,2)[]
);
"""

parser = DDLParser(postgres_ddl)
result = parser.run()
# Array types are properly recognized and parsed
```

### BigQuery Nested Types

```python
# BigQuery STRUCT and ARRAY types
bigquery_ddl = """
CREATE TABLE analytics.events (
    event_id STRING,
    user_data STRUCT<
        name STRING,
        age INT64,
        preferences ARRAY<STRING>
    >
);
"""

parser = DDLParser(bigquery_ddl)
result = parser.run()
# Complex nested types are parsed with full structure
```

## Dialect-Specific Features

Each dialect parser recognizes and handles:

- **Data Types**: Platform-specific data types and their variations
- **Auto-increment**: Different auto-increment syntax (AUTO_INCREMENT, SERIAL, IDENTITY)
- **Constraints**: Platform-specific constraint syntax and options
- **Storage Options**: Engine specifications, tablespaces, compression
- **Functions**: Database-specific functions in defaults and computed columns
- **Extensions**: Platform-specific DDL extensions and syntax variations