A modern, enterprise-ready business intelligence web application
—
Interactive SQL editor with syntax highlighting, auto-completion, query history, and asynchronous execution. Provides a comprehensive SQL development environment with result management, saved queries, and collaborative features for data exploration and analysis.
Core SQL execution engine with synchronous and asynchronous processing capabilities.
def get_sql_results(ctask, query_id, rendered_query, return_results=True, store_results=False, user_name=None, start_time=None):
"""
Celery task for asynchronous SQL execution.
Main entry point for background query processing with comprehensive
error handling, progress tracking, and result storage.
Parameters:
- ctask: Celery task instance for progress updates
- query_id: str, unique identifier for query tracking
- rendered_query: str, final SQL text after template processing
- return_results: bool, whether to return data in response
- store_results: bool, whether to persist results for large datasets
- user_name: str, optional username for audit logging
- start_time: datetime, query initiation timestamp
Returns:
dict with query results, metadata, and execution statistics
"""
def execute_sql(ctask, query_id, rendered_query, return_results, store_results, user_name, session, start_time):
"""
Core SQL execution logic with database interaction.
Handles query parsing, security validation, execution,
and result processing with comprehensive error handling.
Parameters:
- ctask: Celery task instance
- query_id: str, query identifier
- rendered_query: str, executable SQL statement
- return_results: bool, include data in response
- store_results: bool, persist results to backend storage
- user_name: str, executing user identification
- session: SQLAlchemy session for database operations
- start_time: datetime, execution start timestamp
Returns:
dict with execution results, query metadata, and performance metrics
"""
def has_table_query(sql):
"""
Check if query creates or modifies tables.
Analyzes SQL for DDL operations requiring special permissions.
Parameters:
- sql: str, SQL query text to analyze
Returns:
bool, True if query contains table creation/modification operations
"""
def get_query(query_id, session, retry_count=5):
"""
Retrieve query object with retry logic.
Handles concurrent access and temporary database unavailability.
Parameters:
- query_id: str, query identifier to retrieve
- session: SQLAlchemy session for database access
- retry_count: int, maximum retry attempts for retrieval
Returns:
Query object or None if not found after retries
"""
def session_scope(nullpool):
"""
Context manager for database session management.
Ensures proper session cleanup and transaction handling.
Parameters:
- nullpool: bool, whether to use null connection pooling
Yields:
SQLAlchemy session configured for query operations
"""Web interface endpoints for SQL Lab functionality and user interaction.
class SqlLabView:
"""
SQL Lab web interface controller.
Provides REST API endpoints for SQL development environment
including query execution, result management, and metadata browsing.
"""
def index(self):
"""
SQL Lab main interface.
Returns:
HTML interface for SQL Lab with editor, results, and navigation
"""
def format_sql(self):
"""
Format SQL query text.
Applies standard formatting rules for improved readability.
HTTP Method: POST
Request Body:
- sql: str, SQL text to format
Returns:
JSON response with formatted SQL text
"""
def runsql(self):
"""
Execute SQL query synchronously.
Immediate execution for small queries with direct result return.
HTTP Method: POST
Request Body:
- database_id: int, target database identifier
- sql: str, SQL query text to execute
- schema: str, optional schema context
- tmp_table_name: str, optional temporary table name
- select_as_cta: bool, CREATE TABLE AS SELECT flag
- client_id: str, client session identifier
Returns:
JSON response with query results and execution metadata
"""
def schemas(self):
"""
List database schemas.
HTTP Method: GET
Query Parameters:
- database_id: int, database identifier
Returns:
JSON array of available schema names
"""
def tables(self, database_id, schema):
"""
List schema tables.
Parameters:
- database_id: int, database identifier
- schema: str, schema name to explore
Returns:
JSON array of table names in specified schema
"""
def table(self, database_id, schema, table):
"""
Get table metadata and column information.
Parameters:
- database_id: int, database identifier
- schema: str, schema name
- table: str, table name
Returns:
JSON object with table metadata, columns, and constraints
"""
def select_star(self, database_id, schema, table):
"""
Generate SELECT * query template.
Parameters:
- database_id: int, database identifier
- schema: str, schema name
- table: str, table name
HTTP Method: POST
Returns:
JSON response with generated SELECT query
"""
def queries(self, last_updated_ms):
"""
Get recent queries for current user.
Parameters:
- last_updated_ms: int, timestamp filter for incremental updates
Returns:
JSON array of recent query objects with status and metadata
"""
def sql_json(self):
"""
Execute SQL and return JSON results.
Asynchronous execution endpoint for complex queries.
HTTP Method: POST
Request Body:
- database_id: int, target database
- sql: str, SQL query text
- async: bool, asynchronous execution flag
- run_async: bool, force async execution
Returns:
JSON response with query_id for async tracking or immediate results
"""
def csv(self, query_id):
"""
Export query results as CSV.
Parameters:
- query_id: str, query identifier for result export
Returns:
CSV file download response with query results
"""
def results(self, query_id):
"""
Get asynchronous query results.
Parameters:
- query_id: str, query identifier to retrieve results for
Returns:
JSON response with query results, status, and metadata
"""
def stop_query(self):
"""
Stop running query execution.
HTTP Method: POST
Request Body:
- client_id: str, client session identifier
Returns:
JSON confirmation of query cancellation
"""
def copy_query(self, query_id):
"""
Copy existing query to new SQL Lab tab.
Parameters:
- query_id: str, source query identifier
HTTP Method: POST
Returns:
JSON response with copied query details
"""Query lifecycle state tracking and status monitoring system.
class QueryStatus:
"""
Query execution status enumeration.
Defines all possible states during query lifecycle.
"""
STOPPED = 'stopped'
"""Query execution was stopped by user or system."""
FAILED = 'failed'
"""Query execution failed due to error."""
PENDING = 'pending'
"""Query is queued for execution."""
RUNNING = 'running'
"""Query is currently executing."""
SCHEDULED = 'scheduled'
"""Query is scheduled for future execution."""
SUCCESS = 'success'
"""Query completed successfully."""
TIMED_OUT = 'timed_out'
"""Query exceeded maximum execution time."""Query analysis and security validation functionality.
class SupersetQuery:
"""
SQL query analysis and validation utilities.
Provides parsing capabilities for security and feature detection.
Properties:
- tables: set, extracted table names from query
- limit: int, query LIMIT clause value if present
"""
def is_select(self):
"""
Check if query is a SELECT statement.
Returns:
bool, True for SELECT queries, False for DDL/DML
"""
def is_explain(self):
"""
Check if query is an EXPLAIN statement.
Returns:
bool, True for EXPLAIN queries
"""
def is_readonly(self):
"""
Check if query is read-only.
Validates that query doesn't modify data or schema.
Returns:
bool, True for read-only queries (SELECT, EXPLAIN, SHOW, etc.)
"""
def stripped(self):
"""
Get SQL text with comments and whitespace removed.
Returns:
str, cleaned SQL text for analysis
"""
def extract_tables_from_sql(sql):
"""
Extract table references from SQL query.
Parameters:
- sql: str, SQL query text to analyze
Returns:
set of table names referenced in the query
"""
def validate_sql(sql):
"""
Basic SQL validation and syntax checking.
Parameters:
- sql: str, SQL query to validate
Returns:
bool, True if SQL appears syntactically valid
Raises:
SupersetException for invalid SQL syntax
"""Query result storage, caching, and export functionality.
# Result Storage Configuration
RESULTS_BACKEND = {
'cache_type': 'redis',
'cache_key_prefix': 'superset_results',
'cache_timeout': 3600
}
"""
Configurable backend for storing large query results.
Supports Redis, S3, and custom storage implementations.
"""
# CSV Export Configuration
CSV_EXPORT = {
'encoding': 'utf-8',
'delimiter': ',',
'line_terminator': '\n',
'quote_char': '"',
'escape_char': '\\',
'max_rows': 100000
}
"""
CSV export format and limits configuration.
"""
# Query Result Caching
QUERY_CACHE_TIMEOUT = 3600
"""
Default timeout for caching query results in seconds.
"""
# Async Query Configuration
SQLLAB_ASYNC_TIME_LIMIT_SEC = 21600 # 6 hours
"""
Maximum execution time for asynchronous queries.
"""
SQL_MAX_ROW = 100000
"""
Maximum number of rows for async SQL results.
"""# Execute small query immediately
response = requests.post('/superset/runsql/', json={
'database_id': 1,
'sql': 'SELECT COUNT(*) FROM users',
'schema': 'public'
})
result = response.json()
if result['status'] == 'success':
    data = result['data']

# Start long-running query
response = requests.post('/superset/sql_json/', json={
'database_id': 1,
'sql': 'SELECT * FROM large_table',
'async': True,
'run_async': True
})
query_id = response.json()['query']['queryId']
# Check query status
status_response = requests.get(f'/superset/results/{query_id}')
status = status_response.json()
if status['status'] == 'success':
results = status['data']
elif status['status'] == 'running':
# Query still executing, check again later
    pass

# Export results as CSV
csv_response = requests.get(f'/superset/csv/{query_id}')
with open('results.csv', 'wb') as f:
    f.write(csv_response.content)

# Get available schemas
schemas = requests.get('/superset/schemas/', params={
'database_id': 1
}).json()
# Get tables in schema
tables = requests.get('/superset/tables/1/public/').json()
# Get table metadata
table_info = requests.get('/superset/table/1/public/users').json()
columns = table_info['columns']

# Maximum execution time for asynchronous queries (seconds)
SQLLAB_ASYNC_TIME_LIMIT_SEC = 21600
# Default database for SQL Lab
SQLLAB_DEFAULT_DBID = 1
# Synchronous query timeout
SQLLAB_TIMEOUT = 30
# Maximum rows for display
DISPLAY_MAX_ROW = 1000
# Query search limit
QUERY_SEARCH_LIMIT = 1000

# SQL Lab permissions
SQLLAB_PERMISSIONS = [
'can_sql_json',
'can_csv',
'can_search_queries'
]
# Reject unsafe database connection strings
PREVENT_UNSAFE_DB_CONNECTIONS = True
# Allow CREATE TABLE AS SELECT without a row limit
SQLLAB_CTAS_NO_LIMIT = True
The SQL Lab provides a comprehensive SQL development environment enabling data analysts and engineers to explore datasets, develop queries, and collaborate on data analysis within the Superset ecosystem.
Install with Tessl CLI
npx tessl i tessl/pypi-superset