Utilities

Core utilities for data processing, caching, JSON serialization, time handling, database management, and Celery integration. Provides essential functionality used throughout the Superset application for common operations and system integration.

Capabilities

Data Processing Utilities

Core functions for data transformation, serialization, and user interface operations.

def flasher(msg, severity=None):
    """
    Flash message utility for user notifications.
    Integrates with Flask's flash message system for UI feedback.
    
    Parameters:
    - msg: str, message text to display to user
    - severity: str, optional message severity ('info', 'warning', 'error', 'success')
    
    Usage:
    Used throughout the application to provide user feedback
    for operations, errors, and status updates.
    """

def parse_human_datetime(s):
    """
    Parse human-readable datetime strings.
    Supports natural language date expressions and ISO formats.
    
    Parameters:
    - s: str, datetime string in human-readable format
    
    Returns:
    datetime object parsed from input string
    
    Examples:
    - '2023-01-01' -> datetime(2023, 1, 1)
    - 'yesterday' -> datetime for previous day
    - '1 week ago' -> datetime for one week prior
    """

def datetime_f(dttm):
    """
    Format datetime objects for display.
    Standardized datetime formatting for UI consistency.
    
    Parameters:
    - dttm: datetime, datetime object to format
    
    Returns:
    str, formatted datetime string for display
    """

def base_json_conv(obj):
    """
    JSON serialization converter for complex objects.
    Handles datetime, Decimal, and other non-serializable types.
    
    Parameters:
    - obj: any, object to convert for JSON serialization
    
    Returns:
    JSON-serializable representation of object
    
    Usage:
    Used as default converter in json_dumps() for complex data types.
    """

def json_iso_dttm_ser(dttm, pessimistic=False):
    """
    ISO datetime serialization for JSON APIs.
    
    Parameters:
    - dttm: datetime, datetime object to serialize
    - pessimistic: bool, whether to use pessimistic timezone handling
    
    Returns:
    str, ISO 8601 formatted datetime string
    """

def json_int_dttm_ser(dttm):
    """
    Integer timestamp serialization for JavaScript compatibility.
    
    Parameters:
    - dttm: datetime, datetime object to serialize
    
    Returns:
    int, Unix timestamp in milliseconds for JavaScript Date()
    """

def json_dumps(obj, default=None, ignore_nan=False, encoding=None, sort_keys=False):
    """
    Enhanced JSON serialization with Superset-specific handling.
    
    Parameters:
    - obj: any, object to serialize to JSON
    - default: callable, custom serialization function for complex types
    - ignore_nan: bool, whether to ignore NaN values in numeric data
    - encoding: str, character encoding for string data
    - sort_keys: bool, whether to sort dictionary keys in output
    
    Returns:
    str, JSON string representation of object
    
    Features:
    - Handles pandas DataFrames and Series
    - Processes datetime objects with timezone awareness
    - Manages NaN and infinity values appropriately
    - Supports custom serialization handlers
    """

Database Utilities

Database connection management and configuration functions.

def pessimistic_connection_handling(engine):
    """
    Configure pessimistic disconnect handling for database connections.
    Improves connection reliability in unstable network environments.
    
    Parameters:
    - engine: SQLAlchemy Engine, database engine to configure
    
    Side Effects:
    Configures engine event listeners for connection validation
    and automatic reconnection on disconnect detection.
    """

def setup_cache(app, cache_config):
    """
    Initialize application cache configuration.
    Sets up Flask-Caching with specified backend and options.
    
    Parameters:
    - app: Flask application instance
    - cache_config: dict, cache configuration parameters
    
    Returns:
    Cache instance configured for the application
    
    Supported Backends:
    - Redis: High-performance distributed caching
    - Memcached: Memory-based caching system
    - Simple: In-memory Python dictionary cache
    - FileSystem: File-based cache storage
    """

def get_or_create_main_db():
    """
    Get or create main database connection instance.
    Ensures Superset has a configured main database for metadata storage.
    
    Returns:
    Database instance for Superset's main metadata database
    
    Usage:
    Called during application initialization to establish
    the primary database connection for application metadata.
    """

def get_main_database(session):
    """
    Retrieve main database instance from session.
    
    Parameters:
    - session: SQLAlchemy session for database operations
    
    Returns:
    Database instance representing the main Superset database
    """

def get_update_perms_flag():
    """
    Get permission update flag from configuration.
    Controls whether permissions are automatically updated during startup.
    
    Returns:
    bool, True if permissions should be updated automatically
    """

Query Processing Utilities

Functions for processing and manipulating query parameters and filters.

def merge_extra_filters(form_data, extra_filters):
    """
    Merge additional filters into form data.
    Combines dashboard-level filters with chart-specific filters.
    
    Parameters:
    - form_data: dict, chart configuration and existing filters
    - extra_filters: list, additional filters to apply
    
    Returns:
    dict, updated form data with merged filters
    
    Usage:
    Used when dashboard filters need to be applied to individual charts
    for consistent filtering across dashboard components.
    """

def merge_request_params(form_data, params):
    """
    Merge HTTP request parameters into form data.
    Incorporates URL parameters and form submissions into chart configuration.
    
    Parameters:
    - form_data: dict, existing chart configuration
    - params: dict, HTTP request parameters to merge
    
    Returns:
    dict, updated form data with request parameters
    """

def get_since_until(time_range=None, since=None, until=None, time_shift=None, relative_start=None, relative_end=None):
    """
    Parse and process time range parameters for queries.
    Handles various time range specifications and converts to absolute timestamps.
    
    Parameters:
    - time_range: str, natural language time range ('Last week', '30 days ago', etc.)
    - since: str, start time specification
    - until: str, end time specification
    - time_shift: str, time shift offset for comparisons
    - relative_start: str, relative start time specification
    - relative_end: str, relative end time specification
    
    Returns:
    tuple, (since_datetime, until_datetime) with processed time boundaries
    
    Features:
    - Natural language time range parsing
    - Relative time calculations
    - Time zone handling and conversion
    - Support for rolling time windows
    """

def add_ago_to_kwargs(kwargs, time_ago):
    """
    Add time offset to query parameters for temporal comparisons.
    
    Parameters:
    - kwargs: dict, query parameters to modify
    - time_ago: str, time offset specification ('1 week ago', '30 days', etc.)
    
    Returns:
    dict, modified parameters with time offset applied
    
    Usage:
    Used for period-over-period comparisons and trend analysis
    where historical data needs to be queried with time shifts.
    """

Security and Validation Utilities

Functions for data validation, compression, and security operations.

def zlib_compress(data):
    """
    Compress data using zlib compression algorithm.
    
    Parameters:
    - data: bytes or str, data to compress
    
    Returns:
    bytes, compressed data suitable for storage or transmission
    
    Usage:
    Used for compressing large query results and cached data
    to reduce storage requirements and network bandwidth.
    """

def zlib_decompress(data):
    """
    Decompress zlib-compressed data.
    
    Parameters:
    - data: bytes, compressed data to decompress
    
    Returns:
    bytes, original uncompressed data
    
    Usage:
    Companion function to zlib_compress() for retrieving
    compressed cached data and query results.
    """

def validate_json(obj):
    """
    Validate JSON structure and content.
    
    Parameters:
    - obj: any, object to validate for JSON compliance
    
    Raises:
    Exception if the object cannot be parsed as valid JSON
    
    Usage:
    Used throughout the application to validate configuration
    parameters, API inputs, and stored JSON data.
    """

Caching Utilities

Memoization and caching functionality for performance optimization.

class memoized:
    """
    Memoization decorator for function result caching.
    
    Properties:
    - watch: list, instance variables to monitor for cache invalidation
    
    Usage:
    Decorator that caches function results based on arguments.
    Automatically invalidates cache when watched instance variables change.
    
    Example:
    @memoized
    def expensive_calculation(self, param1, param2):
        return complex_computation(param1, param2)
    
    @memoized(watch=('config', 'settings'))  
    def config_dependent_function(self):
        return process_configuration(self.config)
    """

Time and Date Utilities

Constants and functions for time-based operations and calculations.

def now_as_float():
    """
    Get current timestamp as floating point number.
    
    Returns:
    float, current time as Unix timestamp with millisecond precision
    
    Usage:
    Used for performance timing, cache key generation,
    and high-precision timestamp requirements.
    """

DTTM_ALIAS: str = '__timestamp'
"""
Standard alias for datetime columns in queries.
Consistent column name used across visualizations for time-based data.
"""

EPOCH: datetime
"""
Unix epoch datetime object (1970-01-01 00:00:00 UTC).
Reference point for timestamp calculations and conversions.
"""

JS_MAX_INTEGER: int = 9007199254740991  # 2^53-1
"""
Maximum safe integer value for JavaScript compatibility.
Used to prevent precision loss when sending large integers to frontend.
"""

Data Types and Extensions

Custom SQLAlchemy types and database-specific extensions.

class MediumText:
    """
    Extended text column type for MySQL databases.
    Provides larger text storage capacity than standard TEXT type.
    
    Features:
    - Supports up to 16MB of text data
    - MySQL-specific optimization
    - Automatic fallback for other database engines
    """

# Custom SQLAlchemy Types
"""
Various custom column types for specialized data storage:
- JSON columns for configuration data
- Encrypted columns for sensitive information
- Compressed columns for large text data
- Custom numeric types for specialized calculations
"""

Celery Integration

Celery application management for asynchronous task processing.

def get_celery_app(config):
    """
    Get or create Celery application instance.
    
    Parameters:
    - config: dict or object, Celery configuration parameters
    
    Returns:
    Celery application instance configured for Superset tasks
    
    Features:
    - Automatic configuration from Superset settings
    - Task routing and queue management
    - Result backend configuration
    - Worker process management
    
    Usage:
    Used to initialize Celery for asynchronous query processing,
    email notifications, and background task execution.
    """

Query Status and Enumerations

Status tracking and enumeration constants for query lifecycle management.

class QueryStatus:
    """
    Query execution status enumeration.
    Defines standardized status values for tracking query lifecycle.
    """
    
    STOPPED = 'stopped'
    """Query execution was manually stopped or cancelled."""
    
    FAILED = 'failed'
    """Query execution failed due to error or exception."""
    
    PENDING = 'pending'
    """Query is queued and waiting for execution."""
    
    RUNNING = 'running'
    """Query is currently executing on database."""
    
    SCHEDULED = 'scheduled'
    """Query is scheduled for future execution."""
    
    SUCCESS = 'success'
    """Query completed successfully with results available."""
    
    TIMED_OUT = 'timed_out'
    """Query exceeded maximum allowed execution time."""

Adhoc Metrics

Dynamic metric creation and processing utilities.

ADHOC_METRIC_EXPRESSION_TYPES = {
    'SIMPLE': 'SIMPLE',
    'SQL': 'SQL'
}
"""
Adhoc metric expression type constants.

- SIMPLE: Basic aggregation functions (SUM, AVG, COUNT, etc.)
- SQL: Custom SQL expressions for complex calculations
"""

def to_adhoc(fds, metric, label=None):
    """
    Convert metric definition to adhoc metric format.
    
    Parameters:
    - fds: dict, form data structure containing metric context
    - metric: str or dict, metric name or definition to convert
    - label: str, optional custom label for the metric
    
    Returns:
    dict, adhoc metric definition suitable for query processing
    
    Usage:
    Used to standardize metric definitions from various sources
    into a consistent format for query generation and visualization.
    """

Usage Examples

Data Processing

from datetime import datetime

# Parse natural language dates
start_date = parse_human_datetime('30 days ago')
end_date = parse_human_datetime('today')

# Format for display
formatted_date = datetime_f(start_date)

# JSON serialization with complex types
data = {
    'timestamp': datetime.now(),
    'values': [1.5, 2.7, float('nan')],
    'metadata': {'source': 'database'}
}
json_string = json_dumps(data, ignore_nan=True)

Caching and Memoization

class DataProcessor:
    def __init__(self):
        self.config = {}
    
    @memoized(watch=['config'])
    def process_data(self, dataset_id):
        """Expensive data processing with configuration dependency."""
        return expensive_calculation(dataset_id, self.config)
    
    @memoized
    def get_metadata(self, table_name):
        """Cached metadata retrieval."""
        return fetch_table_metadata(table_name)

Query Processing

# Merge dashboard filters with chart filters
chart_data = merge_extra_filters(
    form_data={'metrics': ['count'], 'groupby': ['category']},
    extra_filters=[{'col': 'status', 'op': '==', 'val': 'active'}]
)

# Process time range parameters
since, until = get_since_until(
    time_range='Last 30 days',
    time_shift='1 week ago'
)

Database Operations

from flask import Flask
from sqlalchemy import create_engine

# Setup application cache
app = Flask(__name__)
cache = setup_cache(app, {
    'CACHE_TYPE': 'redis',
    'CACHE_REDIS_URL': 'redis://localhost:6379/0'
})

# Configure database connection
database_url = 'postgresql://localhost/superset_meta'  # example connection string
engine = create_engine(database_url)
pessimistic_connection_handling(engine)

Celery Task Management

# Initialize Celery application
celery_config = {
    'broker_url': 'redis://localhost:6379/0',
    'result_backend': 'redis://localhost:6379/0'
}
celery_app = get_celery_app(celery_config)

# Define async task
@celery_app.task
def process_large_query(query_id):
    return execute_sql_query(query_id)

The utilities module provides functionality used across Superset, from data processing and caching to security and asynchronous task management, and underpins the application's data visualization and exploration features.
