or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

Files

docs

dataframe.mddbapi.mdindex.mdquery-functions.mdsessions.mdudf.mdutils.md

query-functions.mddocs/

0

# Core Query Functions

1

2

Direct SQL execution functions that form the foundation of chDB's query capabilities. These functions provide immediate SQL execution with various output formats and minimal setup requirements.

3

4

## Capabilities

5

6

### Main Query Function

7

8

Executes SQL queries with configurable output formats, supporting both file-based queries and in-memory operations.

9

10

```python { .api }

11

def query(sql: str, output_format: str = "CSV", path: str = "", udf_path: str = ""):

12

"""

13

Execute SQL query with specified output format.

14

15

Parameters:

16

- sql: SQL query string to execute

17

- output_format: Output format ("CSV", "JSON", "DataFrame", "ArrowTable", "Arrow", "Pretty", etc.)

18

- path: Optional database path for stateful operations

19

- udf_path: Optional path to user-defined function configurations

20

21

Returns:

22

Query result in specified format, or formatted string for text formats

23

24

Raises:

25

ChdbError: If query execution fails or has syntax errors

26

"""

27

```

28

29

### SQL Alias Function

30

31

Convenience alias for the main query function with identical functionality.

32

33

```python { .api }

34

def sql(sql: str, output_format: str = "CSV", path: str = "", udf_path: str = ""):

35

"""

36

Alias for query() function with identical parameters and behavior.

37

"""

38

```

39

40

### Result Conversion Functions

41

42

Convert query results between different data formats for flexible data processing workflows.

43

44

```python { .api }

45

def to_df(result):

46

"""

47

Convert query result to pandas DataFrame.

48

49

Parameters:

50

- result: Query result object from chdb.query()

51

52

Returns:

53

pandas.DataFrame: Converted DataFrame

54

55

Raises:

56

ImportError: If pandas or pyarrow not installed

57

"""

58

59

def to_arrowTable(result):

60

"""

61

Convert query result to PyArrow Table.

62

63

Parameters:

64

- result: Query result object from chdb.query()

65

66

Returns:

67

pyarrow.Table: Converted Arrow Table

68

69

Raises:

70

ImportError: If pyarrow not installed

71

"""

72

```

73

74

## Usage Examples

75

76

### Basic Queries

77

78

```python

79

import chdb

80

81

# Simple query with default CSV output

82

result = chdb.query("SELECT 1 as id, 'hello' as message")

83

print(result) # Outputs CSV format

84

85

# JSON output

86

json_result = chdb.query("SELECT 1 as id, 'hello' as message", "JSON")

87

print(json_result)

88

89

# Pretty formatted output

90

pretty_result = chdb.query("SELECT version()", "Pretty")

91

print(pretty_result)

92

```

93

94

### File-based Queries

95

96

```python

97

# Query different file formats

98

parquet_data = chdb.query('SELECT * FROM file("data.parquet", Parquet)', 'DataFrame')

99

csv_data = chdb.query('SELECT * FROM file("data.csv", CSV)', 'JSON')

100

json_data = chdb.query('SELECT * FROM file("data.json", JSONEachRow)', 'DataFrame')

101

102

# Complex analytical queries

103

result = chdb.query('''

104

SELECT

105

category,

106

COUNT(*) as count,

107

AVG(price) as avg_price

108

FROM file("sales.parquet", Parquet)

109

GROUP BY category

110

ORDER BY count DESC

111

''', 'DataFrame')

112

```

113

114

### Working with DataFrames

115

116

```python

117

import pandas as pd

118

119

# Get DataFrame directly

120

df_result = chdb.query('SELECT * FROM file("data.parquet", Parquet)', 'DataFrame')

121

122

# Convert result to DataFrame manually

123

csv_result = chdb.query('SELECT * FROM file("data.csv", CSV)', 'Arrow')

124

df = chdb.to_df(csv_result)

125

126

# Get PyArrow Table

127

arrow_result = chdb.query('SELECT * FROM file("data.parquet", Parquet)', 'ArrowTable')

128

```

129

130

### Using SQL Alias

131

132

```python

133

# sql() function works identically to query()

134

result = chdb.sql("SELECT COUNT(*) FROM file('data.parquet', Parquet)")

135

df_result = chdb.sql("SELECT * FROM file('data.csv', CSV)", "DataFrame")

136

```

137

138

### Error Handling

139

140

```python

141

from chdb import ChdbError

142

143

try:

144

result = chdb.query("SELECT * FROM nonexistent_table")

145

except ChdbError as e:

146

print(f"Query failed: {e}")

147

```

148

149

## Supported Output Formats

150

151

- **CSV**: Comma-separated values (default)

152

- **JSON**: JSON format with records

153

- **JSONEachRow**: JSON with one object per line

154

- **DataFrame**: pandas DataFrame (requires pandas and pyarrow)

155

- **ArrowTable**: PyArrow Table (requires pyarrow)

156

- **Arrow**: Arrow format bytes

157

- **Pretty**: Human-readable formatted output

158

- **TabSeparated**: Tab-separated values

159

- **Parquet**: Parquet format bytes

160

- **ORC**: ORC format bytes

161

- And 50+ more formats supported by ClickHouse

162

163

## Supported Input Formats

164

165

Files can be queried using the `file()` function with format specification:

166

167

- **Parquet**: `file("data.parquet", Parquet)`

168

- **CSV**: `file("data.csv", CSV)`

169

- **JSON**: `file("data.json", JSONEachRow)`

170

- **Arrow**: `file("data.arrow", Arrow)`

171

- **ORC**: `file("data.orc", ORC)`

172

- **TSV**: `file("data.tsv", TabSeparated)`

173

- And 60+ more formats supported by ClickHouse