Workspace: tessl
Visibility: Public
Describes: pkg:pypi/connectorx@0.4.x

tessl/pypi-connectorx

tessl install tessl/pypi-connectorx@0.4.0

Load data from databases to dataframes, the fastest way.

Agent Success: 86% (agent success rate when using this tile)

Improvement: 1.05x (agent success rate improvement when using this tile compared to baseline)

Baseline: 82% (agent success rate without this tile)

evals/scenario-6/task.md

Database Export Utility

Build a Python utility that connects to a database and exports query results to multiple dataframe formats based on user requirements.

Requirements

Your utility should provide a function export_query_results() that:

  1. Accepts a database connection string, a SQL query, and a desired output format
  2. Supports exporting to at least three different dataframe formats: standard dataframes, columnar format, and high-performance dataframes
  3. For streaming scenarios, supports batch processing with configurable batch sizes
  4. Returns the data in the requested format

Functional Requirements

Export to Standard Dataframe Format

  • Must return a standard pandas-compatible dataframe
  • Should properly handle nullable integer types
  • Must support setting a custom index column when specified

Export to Columnar Format

  • Must return data in a columnar table format suitable for analytics pipelines
  • Should enable efficient zero-copy operations with other analytics tools

Export to High-Performance Dataframe Format

  • Must return a high-performance dataframe optimized for large datasets
  • Should leverage efficient memory representation

Batch Processing Support

  • For large datasets, must support streaming batch processing
  • Should allow configurable batch size
  • Must return an iterator that yields data in batches
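The batch requirement can be pictured from the consumer's side. The sketch below assumes the streaming result behaves like a pyarrow RecordBatchReader, that is, an iterable whose items expose a `num_rows` attribute. The helper names `iter_batch_row_counts` and `total_rows` are hypothetical and are not part of the task's API.

```python
from typing import Any, Iterable, Iterator


def iter_batch_row_counts(reader: Iterable[Any]) -> Iterator[int]:
    """Yield the row count of each batch as it is pulled from the reader.

    `reader` is assumed to be a pyarrow.RecordBatchReader or any other
    iterable of objects exposing `num_rows` (as pyarrow.RecordBatch does).
    """
    for batch in reader:
        # Only one batch is held at a time; nothing is accumulated here.
        yield batch.num_rows


def total_rows(reader: Iterable[Any]) -> int:
    """Consume the whole stream and return the total number of rows."""
    return sum(iter_batch_row_counts(reader))
```

Because the helpers only iterate, the caller decides whether to process each batch immediately or collect them. A stream whose row count is not a multiple of the batch size would be expected to end with a smaller final batch.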

Implementation

@generates

API

from typing import Any


def export_query_results(
    connection_string: str,
    query: str,
    output_format: str,
    index_col: str | None = None,
    batch_size: int | None = None
) -> Any:
    """
    Execute a SQL query and export results to the specified format.

    Args:
        connection_string: Database connection string (e.g., 'sqlite:///data.db')
        query: SQL query to execute
        output_format: Desired output format ('pandas', 'arrow', 'polars', 'arrow_stream')
        index_col: Optional column name to use as index (pandas only)
        batch_size: Optional batch size for streaming (arrow_stream only)

    Returns:
        Query results in the requested format:
        - 'pandas': pandas DataFrame
        - 'arrow': pyarrow Table
        - 'polars': polars DataFrame
        - 'arrow_stream': pyarrow RecordBatchReader

    Raises:
        ValueError: If output_format is not supported
        ValueError: If index_col is specified for non-pandas formats
        ValueError: If batch_size is specified for non-streaming formats
    """
    pass
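One possible way to fill in the stub is sketched below. It assumes that connectorx 0.4.x accepts each of the four format names directly as the `return_type` of `connectorx.read_sql`, and that `index_col` and `batch_size` are passed through as keyword arguments. Check these assumptions against the installed connectorx version. Validation runs before connectorx is imported, so invalid arguments fail without touching the database.

```python
from __future__ import annotations

from typing import Any

# Format names taken from the task's docstring. Passing them straight
# through as connectorx's return_type is an assumption of this sketch.
SUPPORTED_FORMATS = ("pandas", "arrow", "polars", "arrow_stream")


def export_query_results(
    connection_string: str,
    query: str,
    output_format: str,
    index_col: str | None = None,
    batch_size: int | None = None,
) -> Any:
    """Execute `query` and return the result in `output_format`."""
    # Fail fast on bad arguments, before any database work happens.
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Unsupported output_format {output_format!r}; "
            f"expected one of {SUPPORTED_FORMATS}"
        )
    if index_col is not None and output_format != "pandas":
        raise ValueError("index_col is only supported for output_format='pandas'")
    if batch_size is not None and output_format != "arrow_stream":
        raise ValueError("batch_size is only supported for output_format='arrow_stream'")

    # Imported lazily so the validation above works without connectorx installed.
    import connectorx as cx

    # Forward only the options the caller actually set.
    extra: dict[str, Any] = {}
    if index_col is not None:
        extra["index_col"] = index_col
    if batch_size is not None:
        extra["batch_size"] = batch_size

    return cx.read_sql(connection_string, query, return_type=output_format, **extra)
```

Keeping the format names identical to the library's `return_type` values avoids a translation table, at the cost of coupling the utility's public vocabulary to connectorx.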

Test Cases

Basic Export Tests

  • Given a SQLite database with a simple table, exporting with output_format='pandas' returns a pandas DataFrame with correct data and dtypes @test

  • Given a SQLite database with a table containing integers, exporting with output_format='arrow' returns a pyarrow Table @test

  • Given a SQLite database with a table, exporting with output_format='polars' returns a polars DataFrame @test
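The "given a SQLite database" precondition in these cases needs a fixture. A minimal sketch using only the standard library follows. The helper name `make_sample_db`, the table name `items`, and its columns are hypothetical. The returned connection string assumes the `sqlite://` plus absolute path form that connectorx documents for SQLite on POSIX systems.

```python
import sqlite3
from pathlib import Path


def make_sample_db(path: str, n_rows: int = 10) -> str:
    """Create a small SQLite table and return a connection string for it."""
    db_path = Path(path).resolve()
    conn = sqlite3.connect(str(db_path))
    try:
        conn.execute("DROP TABLE IF EXISTS items")
        conn.execute(
            "CREATE TABLE items ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, qty INTEGER)"
        )
        # Every third row gets a NULL qty to exercise nullable-integer handling.
        rows = [
            (i, f"item-{i}", None if i % 3 == 0 else i * 10)
            for i in range(1, n_rows + 1)
        ]
        conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
        conn.commit()
    finally:
        conn.close()
    # Assumed connectorx form: 'sqlite://' followed by the absolute path.
    return f"sqlite://{db_path.as_posix()}"
```

A test could call this inside a temporary directory and pass the returned string, together with a query such as `SELECT * FROM items`, to the function under test.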

Format-Specific Features

  • Given a SQLite database with a table, exporting with output_format='pandas' and index_col='id' returns a pandas DataFrame with 'id' as the index @test

  • Given a SQLite database with a table, exporting with output_format='arrow_stream' and batch_size=5 returns a RecordBatchReader that yields batches of size 5 @test

Error Handling

  • Attempting to export with an unsupported output_format raises ValueError @test

  • Attempting to specify index_col with output_format='arrow' raises ValueError @test

Dependencies { .dependencies }

connectorx { .dependency }

Provides high-performance database-to-dataframe data loading with support for multiple output formats.

@satisfied-by