A modern, enterprise-ready business intelligence web application
—
Interactive SQL editor with syntax highlighting, auto-completion, query history, and asynchronous execution. Provides a comprehensive SQL development environment with result management, saved queries, and collaborative features for data exploration and analysis.
Core SQL execution engine with synchronous and asynchronous processing capabilities.
def get_sql_results(ctask, query_id, rendered_query, return_results=True, store_results=False, user_name=None, start_time=None):
"""
Celery task for asynchronous SQL execution.
Main entry point for background query processing with comprehensive
error handling, progress tracking, and result storage.
Parameters:
- ctask: Celery task instance for progress updates
- query_id: str, unique identifier for query tracking
- rendered_query: str, final SQL text after template processing
- return_results: bool, whether to return data in response
- store_results: bool, whether to persist results for large datasets
- user_name: str, optional username for audit logging
- start_time: datetime, query initiation timestamp
Returns:
dict with query results, metadata, and execution statistics
"""
def execute_sql(ctask, query_id, rendered_query, return_results, store_results, user_name, session, start_time):
"""
Core SQL execution logic with database interaction.
Handles query parsing, security validation, execution,
and result processing with comprehensive error handling.
Parameters:
- ctask: Celery task instance
- query_id: str, query identifier
- rendered_query: str, executable SQL statement
- return_results: bool, include data in response
- store_results: bool, persist results to backend storage
- user_name: str, executing user identification
- session: SQLAlchemy session for database operations
- start_time: datetime, execution start timestamp
Returns:
dict with execution results, query metadata, and performance metrics
"""
def has_table_query(sql):
"""
Check if query creates or modifies tables.
Analyzes SQL for DDL operations requiring special permissions.
Parameters:
- sql: str, SQL query text to analyze
Returns:
bool, True if query contains table creation/modification operations
"""
def get_query(query_id, session, retry_count=5):
"""
Retrieve query object with retry logic.
Handles concurrent access and temporary database unavailability.
Parameters:
- query_id: str, query identifier to retrieve
- session: SQLAlchemy session for database access
- retry_count: int, maximum retry attempts for retrieval
Returns:
Query object or None if not found after retries
"""
def session_scope(nullpool):
"""
Context manager for database session management.
Ensures proper session cleanup and transaction handling.
Parameters:
- nullpool: bool, whether to use null connection pooling
Yields:
SQLAlchemy session configured for query operations
"""Web interface endpoints for SQL Lab functionality and user interaction.
class SqlLabView:
"""
SQL Lab web interface controller.
Provides REST API endpoints for SQL development environment
including query execution, result management, and metadata browsing.
"""
def index(self):
"""
SQL Lab main interface.
Returns:
HTML interface for SQL Lab with editor, results, and navigation
"""
def format_sql(self):
"""
Format SQL query text.
Applies standard formatting rules for improved readability.
HTTP Method: POST
Request Body:
- sql: str, SQL text to format
Returns:
JSON response with formatted SQL text
"""
def runsql(self):
"""
Execute SQL query synchronously.
Immediate execution for small queries with direct result return.
HTTP Method: POST
Request Body:
- database_id: int, target database identifier
- sql: str, SQL query text to execute
- schema: str, optional schema context
- tmp_table_name: str, optional temporary table name
- select_as_cta: bool, CREATE TABLE AS SELECT flag
- client_id: str, client session identifier
Returns:
JSON response with query results and execution metadata
"""
def schemas(self):
"""
List database schemas.
HTTP Method: GET
Query Parameters:
- database_id: int, database identifier
Returns:
JSON array of available schema names
"""
def tables(self, database_id, schema):
"""
List schema tables.
Parameters:
- database_id: int, database identifier
- schema: str, schema name to explore
Returns:
JSON array of table names in specified schema
"""
def table(self, database_id, schema, table):
"""
Get table metadata and column information.
Parameters:
- database_id: int, database identifier
- schema: str, schema name
- table: str, table name
Returns:
JSON object with table metadata, columns, and constraints
"""
def select_star(self, database_id, schema, table):
"""
Generate SELECT * query template.
Parameters:
- database_id: int, database identifier
- schema: str, schema name
- table: str, table name
HTTP Method: POST
Returns:
JSON response with generated SELECT query
"""
def queries(self, last_updated_ms):
"""
Get recent queries for current user.
Parameters:
- last_updated_ms: int, timestamp filter for incremental updates
Returns:
JSON array of recent query objects with status and metadata
"""
def sql_json(self):
"""
Execute SQL and return JSON results.
Asynchronous execution endpoint for complex queries.
HTTP Method: POST
Request Body:
- database_id: int, target database
- sql: str, SQL query text
- async: bool, asynchronous execution flag
- run_async: bool, force async execution
Returns:
JSON response with query_id for async tracking or immediate results
"""
def csv(self, query_id):
"""
Export query results as CSV.
Parameters:
- query_id: str, query identifier for result export
Returns:
CSV file download response with query results
"""
def results(self, query_id):
"""
Get asynchronous query results.
Parameters:
- query_id: str, query identifier to retrieve results for
Returns:
JSON response with query results, status, and metadata
"""
def stop_query(self):
"""
Stop running query execution.
HTTP Method: POST
Request Body:
- client_id: str, client session identifier
Returns:
JSON confirmation of query cancellation
"""
def copy_query(self, query_id):
"""
Copy existing query to new SQL Lab tab.
Parameters:
- query_id: str, source query identifier
HTTP Method: POST
Returns:
JSON response with copied query details
"""Query lifecycle state tracking and status monitoring system.
class QueryStatus:
"""
Query execution status enumeration.
Defines all possible states during query lifecycle.
"""
STOPPED = 'stopped'
"""Query execution was stopped by user or system."""
FAILED = 'failed'
"""Query execution failed due to error."""
PENDING = 'pending'
"""Query is queued for execution."""
RUNNING = 'running'
"""Query is currently executing."""
SCHEDULED = 'scheduled'
"""Query is scheduled for future execution."""
SUCCESS = 'success'
"""Query completed successfully."""
TIMED_OUT = 'timed_out'
"""Query exceeded maximum execution time."""Query analysis and security validation functionality.
class SupersetQuery:
"""
SQL query analysis and validation utilities.
Provides parsing capabilities for security and feature detection.
Properties:
- tables: set, extracted table names from query
- limit: int, query LIMIT clause value if present
"""
def is_select(self):
"""
Check if query is a SELECT statement.
Returns:
bool, True for SELECT queries, False for DDL/DML
"""
def is_explain(self):
"""
Check if query is an EXPLAIN statement.
Returns:
bool, True for EXPLAIN queries
"""
def is_readonly(self):
"""
Check if query is read-only.
Validates that query doesn't modify data or schema.
Returns:
bool, True for read-only queries (SELECT, EXPLAIN, SHOW, etc.)
"""
def stripped(self):
"""
Get SQL text with comments and whitespace removed.
Returns:
str, cleaned SQL text for analysis
"""
def extract_tables_from_sql(sql):
"""
Extract table references from SQL query.
Parameters:
- sql: str, SQL query text to analyze
Returns:
set of table names referenced in the query
"""
def validate_sql(sql):
"""
Basic SQL validation and syntax checking.
Parameters:
- sql: str, SQL query to validate
Returns:
bool, True if SQL appears syntactically valid
Raises:
SupersetException for invalid SQL syntax
"""Query result storage, caching, and export functionality.
# Result Storage Configuration
RESULTS_BACKEND = {
'cache_type': 'redis',
'cache_key_prefix': 'superset_results',
'cache_timeout': 3600
}
"""
Configurable backend for storing large query results.
Supports Redis, S3, and custom storage implementations.
"""
# CSV Export Configuration
CSV_EXPORT = {
'encoding': 'utf-8',
'delimiter': ',',
'line_terminator': '\n',
'quote_char': '"',
'escape_char': '\\',
'max_rows': 100000
}
"""
CSV export format and limits configuration.
"""
# Query Result Caching
QUERY_CACHE_TIMEOUT = 3600
"""
Default timeout for caching query results in seconds.
"""
# Async Query Configuration
SQLLAB_ASYNC_TIME_LIMIT_SEC = 21600 # 6 hours
"""
Maximum execution time for asynchronous queries.
"""
SQL_MAX_ROW = 100000
"""
Maximum number of rows for async SQL results.
"""# Execute small query immediately
response = requests.post('/superset/runsql/', json={
'database_id': 1,
'sql': 'SELECT COUNT(*) FROM users',
'schema': 'public'
})
result = response.json()
if result['status'] == 'success':
    data = result['data']

# Start long-running query
response = requests.post('/superset/sql_json/', json={
'database_id': 1,
'sql': 'SELECT * FROM large_table',
'async': True,
'run_async': True
})
query_id = response.json()['query']['queryId']
# Check query status
status_response = requests.get(f'/superset/results/{query_id}')
status = status_response.json()
if status['status'] == 'success':
results = status['data']
elif status['status'] == 'running':
# Query still executing, check again later
    pass

# Export results as CSV
csv_response = requests.get(f'/superset/csv/{query_id}')
with open('results.csv', 'wb') as f:
    f.write(csv_response.content)

# Get available schemas
schemas = requests.get('/superset/schemas/', params={
'database_id': 1
}).json()
# Get tables in schema
tables = requests.get('/superset/tables/1/public/').json()
# Get table metadata
table_info = requests.get('/superset/table/1/public/users').json()
columns = table_info['columns']

# Maximum execution time for asynchronous queries (seconds)
SQLLAB_ASYNC_TIME_LIMIT_SEC = 21600
# Default database for SQL Lab
SQLLAB_DEFAULT_DBID = 1
# Synchronous query timeout
SQLLAB_TIMEOUT = 30
# Maximum rows for display
DISPLAY_MAX_ROW = 1000
# Query search limit
QUERY_SEARCH_LIMIT = 1000

# SQL Lab permissions
SQLLAB_PERMISSIONS = [
'can_sql_json',
'can_csv',
'can_search_queries'
]
# Reject unsafe database connection strings
PREVENT_UNSAFE_DB_CONNECTIONS = True
# Allow CREATE TABLE AS SELECT without a row limit
SQLLAB_CTAS_NO_LIMIT = True
The SQL Lab provides a comprehensive SQL development environment enabling data analysts and engineers to explore datasets, develop queries, and collaborate on data analysis within the Superset ecosystem.
Install with Tessl CLI
npx tessl i tessl/pypi-superset