# Core Query Functions

Direct SQL execution functions that form the foundation of chDB's query capabilities. These functions provide immediate SQL execution with various output formats and minimal setup requirements.

## Capabilities

### Main Query Function

Executes SQL queries with configurable output formats, supporting both file-based queries and in-memory operations.

```python { .api }
def query(sql: str, output_format: str = "CSV", path: str = "", udf_path: str = ""):
    """
    Execute SQL query with specified output format.

    Parameters:
    - sql: SQL query string to execute
    - output_format: Output format ("CSV", "JSON", "DataFrame", "ArrowTable", "Arrow", "Pretty", etc.)
    - path: Optional database path for stateful operations
    - udf_path: Optional path to user-defined function configurations

    Returns:
    Query result in specified format, or formatted string for text formats

    Raises:
    ChdbError: If query execution fails or has syntax errors
    """
```
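
The `path` parameter keeps table data on disk, so state created by one call is visible to later calls against the same path. A minimal sketch, assuming a writable local directory (the `/tmp/chdb-demo` location and the `events` table are illustrative):

```python
import chdb

DB_PATH = "/tmp/chdb-demo"  # illustrative; any writable directory works

# Create a table that persists on disk under DB_PATH
chdb.query(
    "CREATE TABLE IF NOT EXISTS events (id UInt32, name String) "
    "ENGINE = MergeTree ORDER BY id",
    "CSV",
    path=DB_PATH,
)

# Insert a row in one call ...
chdb.query("INSERT INTO events VALUES (1, 'start')", "CSV", path=DB_PATH)

# ... and read it back in a separate call against the same path
print(chdb.query("SELECT * FROM events", "Pretty", path=DB_PATH))
```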

### SQL Alias Function

Convenience alias for the main query function with identical functionality.

```python { .api }
def sql(sql: str, output_format: str = "CSV", path: str = "", udf_path: str = ""):
    """
    Alias for query() function with identical parameters and behavior.
    """
```

### Result Conversion Functions

Convert query results between data formats for flexible downstream processing.

```python { .api }
def to_df(result):
    """
    Convert query result to pandas DataFrame.

    Parameters:
    - result: Query result object from chdb.query()

    Returns:
    pandas.DataFrame: Converted DataFrame

    Raises:
    ImportError: If pandas or pyarrow is not installed
    """

def to_arrowTable(result):
    """
    Convert query result to PyArrow Table.

    Parameters:
    - result: Query result object from chdb.query()

    Returns:
    pyarrow.Table: Converted Arrow Table

    Raises:
    ImportError: If pyarrow is not installed
    """
```
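
Both converters operate on a query result object, typically one produced with the `Arrow` output format. A minimal sketch; the `numbers()` query is only illustrative:

```python
import chdb

# Produce an Arrow-format result, then convert it two ways
result = chdb.query("SELECT number AS n, number * 2 AS doubled FROM numbers(5)", "Arrow")

df = chdb.to_df(result)             # pandas DataFrame (needs pandas and pyarrow)
table = chdb.to_arrowTable(result)  # pyarrow.Table (needs pyarrow)

print(df)
print(table.schema)
```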

## Usage Examples

### Basic Queries

```python
import chdb

# Simple query with default CSV output
result = chdb.query("SELECT 1 as id, 'hello' as message")
print(result)  # Outputs CSV format

# JSON output
json_result = chdb.query("SELECT 1 as id, 'hello' as message", "JSON")
print(json_result)

# Pretty formatted output
pretty_result = chdb.query("SELECT version()", "Pretty")
print(pretty_result)
```

### File-based Queries

```python
# Query different file formats
parquet_data = chdb.query('SELECT * FROM file("data.parquet", Parquet)', 'DataFrame')
csv_data = chdb.query('SELECT * FROM file("data.csv", CSV)', 'JSON')
json_data = chdb.query('SELECT * FROM file("data.json", JSONEachRow)', 'DataFrame')

# Complex analytical queries
result = chdb.query('''
    SELECT
        category,
        COUNT(*) as count,
        AVG(price) as avg_price
    FROM file("sales.parquet", Parquet)
    GROUP BY category
    ORDER BY count DESC
''', 'DataFrame')
```

### Working with DataFrames

```python
import chdb

# Get a DataFrame directly
df_result = chdb.query('SELECT * FROM file("data.parquet", Parquet)', 'DataFrame')

# Convert an Arrow-format result to a DataFrame manually
csv_result = chdb.query('SELECT * FROM file("data.csv", CSV)', 'Arrow')
df = chdb.to_df(csv_result)

# Get a PyArrow Table
arrow_result = chdb.query('SELECT * FROM file("data.parquet", Parquet)', 'ArrowTable')
```

### Using SQL Alias

```python
# sql() works identically to query()
result = chdb.sql("SELECT COUNT(*) FROM file('data.parquet', Parquet)")
df_result = chdb.sql("SELECT * FROM file('data.csv', CSV)", "DataFrame")
```

### Error Handling

```python
import chdb
from chdb import ChdbError

try:
    result = chdb.query("SELECT * FROM nonexistent_table")
except ChdbError as e:
    print(f"Query failed: {e}")
```

## Supported Output Formats

- **CSV**: Comma-separated values (default)
- **JSON**: JSON format with records
- **JSONEachRow**: JSON with one object per line
- **DataFrame**: pandas DataFrame (requires pandas and pyarrow)
- **ArrowTable**: PyArrow Table (requires pyarrow)
- **Arrow**: Arrow format bytes
- **Pretty**: Human-readable formatted output
- **TabSeparated**: Tab-separated values
- **Parquet**: Parquet format bytes (see the sketch after this list)
- **ORC**: ORC format bytes
- And 50+ more formats supported by ClickHouse
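
Binary formats such as `Parquet`, `ORC`, and `Arrow` return raw bytes rather than text. A minimal sketch that writes query output to a Parquet file, assuming the result object exposes its payload via a `bytes()` method (the query and file name are illustrative):

```python
import chdb

# Request Parquet-encoded output and write the raw bytes to disk
result = chdb.query("SELECT number AS n FROM numbers(100)", "Parquet")

with open("numbers.parquet", "wb") as f:
    f.write(result.bytes())  # bytes() assumed to return the raw encoded payload
```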

## Supported Input Formats

Files can be queried using the `file()` function with a format specification (a short sketch follows the list):

- **Parquet**: `file("data.parquet", Parquet)`
- **CSV**: `file("data.csv", CSV)`
- **JSON**: `file("data.json", JSONEachRow)`
- **Arrow**: `file("data.arrow", Arrow)`
- **ORC**: `file("data.orc", ORC)`
- **TSV**: `file("data.tsv", TabSeparated)`
- And 60+ more formats supported by ClickHouse
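
The `file()` table function also accepts glob patterns and an optional explicit column structure, which helps when many files share a schema or when schema inference is not wanted. A minimal sketch under those assumptions (paths and column definitions are illustrative):

```python
import chdb

# Query every Parquet file matching a glob pattern
total = chdb.query('SELECT count() FROM file("data/part-*.parquet", Parquet)', "Pretty")
print(total)

# Supply an explicit column structure instead of relying on schema inference
typed = chdb.query(
    'SELECT id, name FROM file("data.csv", CSV, "id UInt32, name String")',
    "DataFrame",
)
print(typed)
```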