chDB is an in-process SQL OLAP Engine powered by ClickHouse
Direct SQL execution functions that form the foundation of chDB's query capabilities. These functions provide immediate SQL execution with various output formats and minimal setup requirements.
Executes SQL queries with configurable output formats, supporting both file-based queries and in-memory operations.
def query(sql: str, output_format: str = "CSV", path: str = "", udf_path: str = ""):
    """
    Execute SQL query with specified output format.

    Parameters:
    - sql: SQL query string to execute
    - output_format: Output format ("CSV", "JSON", "DataFrame", "ArrowTable", "Arrow", "Pretty", etc.)
    - path: Optional database path for stateful operations
    - udf_path: Optional path to user-defined function configurations

    Returns:
    Query result in specified format, or formatted string for text formats

    Raises:
    ChdbError: If query execution fails or has syntax errors
    """

Convenience alias for the main query function with identical functionality.
def sql(sql: str, output_format: str = "CSV", path: str = "", udf_path: str = ""):
    """
    Alias for query() function with identical parameters and behavior.
    """

Convert query results between different data formats for flexible data processing workflows.
def to_df(result):
    """
    Convert query result to pandas DataFrame.

    Parameters:
    - result: Query result object from chdb.query()

    Returns:
    pandas.DataFrame: Converted DataFrame

    Raises:
    ImportError: If pandas or pyarrow not installed
    """
def to_arrowTable(result):
    """
    Convert query result to PyArrow Table.

    Parameters:
    - result: Query result object from chdb.query()

    Returns:
    pyarrow.Table: Converted Arrow Table

    Raises:
    ImportError: If pyarrow not installed
    """

import chdb
# Simple query with default CSV output
result = chdb.query("SELECT 1 as id, 'hello' as message")
print(result) # Outputs CSV format
# JSON output
json_result = chdb.query("SELECT 1 as id, 'hello' as message", "JSON")
print(json_result)
# Pretty formatted output
pretty_result = chdb.query("SELECT version()", "Pretty")
print(pretty_result)

# Query different file formats
parquet_data = chdb.query('SELECT * FROM file("data.parquet", Parquet)', 'DataFrame')
csv_data = chdb.query('SELECT * FROM file("data.csv", CSV)', 'JSON')
json_data = chdb.query('SELECT * FROM file("data.json", JSONEachRow)', 'DataFrame')
# Complex analytical queries
result = chdb.query('''
SELECT
category,
COUNT(*) as count,
AVG(price) as avg_price
FROM file("sales.parquet", Parquet)
GROUP BY category
ORDER BY count DESC
''', 'DataFrame')

import pandas as pd
# Get DataFrame directly
df_result = chdb.query('SELECT * FROM file("data.parquet", Parquet)', 'DataFrame')
# Convert result to DataFrame manually
csv_result = chdb.query('SELECT * FROM file("data.csv", CSV)', 'Arrow')
df = chdb.to_df(csv_result)
# Get PyArrow Table
arrow_result = chdb.query('SELECT * FROM file("data.parquet", Parquet)', 'ArrowTable')

# sql() function works identically to query()
result = chdb.sql("SELECT COUNT(*) FROM file('data.parquet', Parquet)")
df_result = chdb.sql("SELECT * FROM file('data.csv', CSV)", "DataFrame")

from chdb import ChdbError
try:
    result = chdb.query("SELECT * FROM nonexistent_table")
except ChdbError as e:
    print(f"Query failed: {e}")

Files can be queried using the file() function with format specification:
file("data.parquet", Parquet)
file("data.csv", CSV)
file("data.json", JSONEachRow)
file("data.arrow", Arrow)
file("data.orc", ORC)
file("data.tsv", TabSeparated)

Install with Tessl CLI
npx tessl i tessl/pypi-chdb
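
The file() format list above pairs each file extension with a format name. As a minimal sketch, that pairing can be kept in a lookup table so that query strings are built the same way everywhere. `FORMAT_BY_EXTENSION` and `file_table_expr` are hypothetical names introduced here for illustration and are not part of chdb. The mapping covers only the six formats listed above, and the block uses only the standard library, so it runs without chdb installed.

```python
from pathlib import PurePath

# Extension-to-format pairs taken from the file() examples above.
# Both names below are hypothetical helpers, not chdb API.
FORMAT_BY_EXTENSION = {
    ".parquet": "Parquet",
    ".csv": "CSV",
    ".json": "JSONEachRow",
    ".arrow": "Arrow",
    ".orc": "ORC",
    ".tsv": "TabSeparated",
}


def file_table_expr(path: str) -> str:
    """Build a file(...) table expression for a path with a known extension."""
    suffix = PurePath(path).suffix.lower()
    if suffix not in FORMAT_BY_EXTENSION:
        raise ValueError(f"no format known for extension {suffix!r}")
    # Escape backslashes and single quotes so the path stays one SQL string literal.
    quoted = path.replace("\\", "\\\\").replace("'", "\\'")
    return f"file('{quoted}', {FORMAT_BY_EXTENSION[suffix]})"


sql = f"SELECT COUNT(*) FROM {file_table_expr('data.parquet')}"
print(sql)  # SELECT COUNT(*) FROM file('data.parquet', Parquet)
```

The resulting string can then be passed to chdb.query() or chdb.sql() in the same way as the hand-written queries in the examples above.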