
tessl/pypi-simple-ddl-parser

Simple DDL Parser parses SQL DDL files, including dialects such as HQL, TSQL (MSSQL), Oracle, AWS Redshift, Snowflake, MySQL, and PostgreSQL, into JSON or Python dicts with full information about columns (types, defaults, primary keys, etc.), sequences, alters, custom types, and other entities in the DDL.


docs/core-parsing.md

Core Parsing

Primary DDL parsing functionality that converts SQL DDL statements into structured dictionaries. The parser handles tables, columns, constraints, indexes, sequences, and other database objects with comprehensive metadata extraction.

from simple_ddl_parser import DDLParser, parse_from_file
from typing import Optional, Union, List, Dict
import logging

Capabilities

DDLParser Class

Main parser class that provides comprehensive DDL parsing with configurable options for error handling, debugging, name normalization, and logging.

class DDLParser:
    def __init__(
        self,
        content: str,
        silent: bool = True,
        debug: bool = False,
        normalize_names: bool = False,
        log_file: Optional[str] = None,
        log_level: Union[str, int] = logging.INFO
    ):
        """
        Initialize DDL parser with content and configuration options.
        
        Parameters:
        - content (str): DDL content to parse
        - silent (bool): If True, suppress exception raising on parse errors
        - debug (bool): Enable debug output during parsing
        - normalize_names (bool): Remove identifier delimiters (quotes, brackets)
        - log_file (Optional[str]): Path to log file for debug output
        - log_level (Union[str, int]): Logging level (DEBUG, INFO, WARNING, ERROR)
        """

Parser Execution

Execute DDL parsing with various output options and formatting controls.

def run(
    self,
    *,
    dump: bool = False,
    dump_path: str = "schemas",
    file_path: Optional[str] = None,
    output_mode: str = "sql",
    group_by_type: bool = False,
    json_dump: bool = False
) -> List[Dict]:
    """
    Parse DDL content and return structured metadata.
    
    Parameters:
    - dump (bool): Save output to JSON files in dump_path directory
    - dump_path (str): Directory path for saved output files
    - file_path (Optional[str]): Original file path for reference
    - output_mode (str): Output dialect ("sql", "mysql", "postgres", etc.)
    - group_by_type (bool): Group results by entity types (tables, sequences, etc.)
    - json_dump (bool): Return JSON string instead of Python dict
    
    Returns:
    List[Dict]: List of parsed database objects with metadata
    """

File Parsing Function

Convenience function for parsing DDL directly from files with automatic encoding handling.

def parse_from_file(
    file_path: str,
    encoding: Optional[str] = "utf-8",
    parser_settings: Optional[dict] = None,
    **kwargs
) -> List[Dict]:
    """
    Parse DDL content from a file.
    
    Parameters:
    - file_path (str): Path to DDL file to parse
    - encoding (Optional[str]): File encoding (default: utf-8)
    - parser_settings (Optional[dict]): DDLParser constructor arguments
    - **kwargs: Additional arguments passed to parser.run()
    
    Returns:
    List[Dict]: List of parsed database objects with metadata
    """

Usage Examples

Basic Table Parsing

from simple_ddl_parser import DDLParser

ddl = """
CREATE TABLE employees (
    id SERIAL PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(100) UNIQUE,
    department_id INTEGER REFERENCES departments(id),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""

parser = DDLParser(ddl)
result = parser.run()
print(result[0]['columns'])  # Column definitions with types and constraints

Multi-table Schema Parsing

from simple_ddl_parser import parse_from_file

# Parse complex schema file
tables = parse_from_file('complete_schema.sql', encoding='utf-8')

# Access parsed metadata
for table in tables:
    print(f"Table: {table['table_name']}")
    for column in table['columns']:
        print(f"  Column: {column['name']} ({column['type']})")
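The same loop pattern condenses naturally into summary mappings. Using literal dicts of the shape the parser returns (so the example runs without a schema file on disk):

```python
# Literal stand-ins for parse_from_file(...) output.
tables = [
    {"table_name": "employees",
     "columns": [{"name": "id", "type": "SERIAL"},
                 {"name": "email", "type": "VARCHAR"}]},
    {"table_name": "departments",
     "columns": [{"name": "id", "type": "SERIAL"}]},
]

# Map each table name to its column count.
column_counts = {t["table_name"]: len(t["columns"]) for t in tables}
print(column_counts)  # {'employees': 2, 'departments': 1}
```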

Advanced Configuration

from simple_ddl_parser import DDLParser

ddl_content = """
CREATE TABLE "user accounts" (
    "id" INTEGER PRIMARY KEY,
    "user name" VARCHAR(50)
);
"""

# Configure parser with custom settings
parser = DDLParser(
    ddl_content,
    silent=False,  # Raise exceptions on parse errors
    debug=True,    # Enable debug output
    normalize_names=True,  # Strip identifier quotes/brackets
    log_file='parser.log',
    log_level='DEBUG'
)

# Parse with specific output format
result = parser.run(
    output_mode='postgres',
    group_by_type=True,
    json_dump=False
)

Parsed Structure Format

The parser returns a list of dictionaries, each representing a database object:

# Example parsed table structure
{
    "table_name": "employees",
    "schema": "public",
    "columns": [
        {
            "name": "id",
            "type": "SERIAL",
            "primary_key": True,
            "nullable": False
        },
        {
            "name": "first_name", 
            "type": "VARCHAR",
            "size": 50,
            "nullable": False
        }
    ],
    "primary_key": ["id"],
    "foreign_keys": [
        {
            "columns": ["department_id"],
            "references": {
                "table": "departments",
                "columns": ["id"]
            }
        }
    ],
    "indexes": [],
    "constraints": []
}
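Because the results are plain dictionaries, downstream tooling can traverse them directly. A minimal sketch that extracts foreign-key relationships from a table dict of the shape shown above (the `foreign_key_pairs` helper is ours, not part of the library):

```python
def foreign_key_pairs(table):
    """Yield 'local_column -> referenced_table.referenced_column' strings."""
    for fk in table.get("foreign_keys", []):
        ref = fk["references"]
        for local, remote in zip(fk["columns"], ref["columns"]):
            yield f"{local} -> {ref['table']}.{remote}"

table = {
    "table_name": "employees",
    "foreign_keys": [
        {"columns": ["department_id"],
         "references": {"table": "departments", "columns": ["id"]}},
    ],
}
print(list(foreign_key_pairs(table)))  # ['department_id -> departments.id']
```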

Install with Tessl CLI

npx tessl i tessl/pypi-simple-ddl-parser
