0
# Command Line Interface
1
2
Interactive command-line tool for executing SQL queries against Druid with syntax highlighting, autocompletion, and tabular result display. The CLI provides a convenient way to explore Druid data and test queries interactively.
3
4
## Capabilities
5
6
### CLI Entry Point
7
8
The command-line interface is available through the `pydruid` command after installation.
9
10
```python { .api }
11
def main() -> None:
12
"""
13
Main entry point for the PyDruid command-line interface.
14
15
Provides an interactive SQL shell for executing queries against Druid
16
with syntax highlighting, autocompletion, and formatted output.
17
"""
18
```
19
20
### Command Execution
21
22
```bash
23
# Basic usage
24
pydruid http://localhost:8082/druid/v2/sql/
25
26
# With HTTPS
27
pydruid https://localhost:8082/druid/v2/sql/
28
29
# With authentication (will prompt for credentials)
30
pydruid https://user@localhost:8082/druid/v2/sql/
31
```
32
33
### Interactive Features
34
35
The CLI provides several interactive features:
36
37
- **Syntax Highlighting**: SQL keywords, functions, and strings are highlighted
38
- **Autocompletion**: Tab completion for SQL keywords, functions, and table names
39
- **Command History**: Persistent command history across sessions
40
- **Formatted Output**: Results displayed in formatted tables
41
- **Multi-line Support**: Queries can span multiple lines
42
43
### SQL Keywords and Functions
44
45
The CLI provides autocompletion for comprehensive SQL syntax.
46
47
```python { .api }
48
# SQL Keywords
49
keywords: list = [
50
"EXPLAIN PLAN FOR", "WITH", "SELECT", "ALL", "DISTINCT",
51
"FROM", "WHERE", "GROUP BY", "HAVING", "ORDER BY",
52
"ASC", "DESC", "LIMIT", "FILTER", "UNION ALL"
53
]
54
55
# Aggregate Functions
56
aggregate_functions: list = [
57
"COUNT", "SUM", "MIN", "MAX", "AVG",
58
"APPROX_COUNT_DISTINCT", "APPROX_QUANTILE"
59
]
60
61
# Numeric Functions
62
numeric_functions: list = [
63
"ABS", "CEIL", "EXP", "FLOOR", "LN", "LOG10",
64
"POWER", "SQRT", "TRUNCATE", "TRUNC", "MOD"
65
]
66
67
# String Functions
68
string_functions: list = [
69
"LENGTH", "CHAR_LENGTH", "CHARACTER_LENGTH", "STRLEN", "LOOKUP",
70
"LOWER", "REGEXP_EXTRACT", "REPLACE", "STRPOS", "SUBSTRING",
71
"SUBSTR", "TRIM", "BTRIM", "RTRIM", "LTRIM", "UPPER"
72
]
73
74
# Time Functions
75
time_functions: list = [
76
"CURRENT_TIMESTAMP", "CURRENT_DATE", "DATE_TRUNC", "TIME_FLOOR",
77
"TIME_SHIFT", "TIME_EXTRACT", "TIME_PARSE", "TIME_FORMAT",
78
"MILLIS_TO_TIMESTAMP", "TIMESTAMP_TO_MILLIS", "EXTRACT",
79
"FLOOR", "CEIL", "TIMESTAMPADD"
80
]
81
82
# Other Functions
83
other_functions: list = [
84
"CASE", "COALESCE", "NULLIF", "CAST", "SAFE_DIVIDE"
85
]
86
```
87
88
## Usage Examples
89
90
### Basic Usage
91
92
```bash
93
$ pydruid http://localhost:8082/druid/v2/sql/
94
PyDruid CLI - Interactive Druid SQL Client
95
Connected to: http://localhost:8082/druid/v2/sql/
96
Type 'help' for help, 'quit' or 'exit' to quit.
97
98
> SELECT COUNT(*) AS cnt FROM places
99
cnt
100
-----
101
12345
102
103
> SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES;
104
TABLE_NAME
105
----------
106
test_table
107
twitterstream
108
places
109
COLUMNS
110
SCHEMATA
111
TABLES
112
113
> quit
114
GoodBye!
115
```
116
117
### Multi-line Queries
118
119
```sql
120
> SELECT place,
121
CAST(REGEXP_EXTRACT(place, '(.*),', 1) AS FLOAT) AS lat,
122
CAST(REGEXP_EXTRACT(place, ',(.*)', 1) AS FLOAT) AS lon
123
FROM places
124
WHERE place IS NOT NULL
125
LIMIT 5;
126
127
place lat lon
128
--------- ----- -----
129
"40.7128,-74.0060" 40.71 -74.01
130
"34.0522,-118.2437" 34.05 -118.24
131
"41.8781,-87.6298" 41.88 -87.63
132
"29.7604,-95.3698" 29.76 -95.37
133
"33.4484,-112.0740" 33.45 -112.07
134
```
135
136
### Complex Analytical Queries
137
138
```sql
139
> SELECT
140
user_lang,
141
COUNT(*) as tweet_count,
142
AVG(LENGTH(tweet_text)) as avg_length
143
FROM twitterstream
144
WHERE __time >= '2014-03-01'
145
AND __time < '2014-04-01'
146
GROUP BY user_lang
147
ORDER BY tweet_count DESC
148
LIMIT 10;
149
150
user_lang tweet_count avg_length
151
--------- ----------- ----------
152
en 1,234,567 142.5
153
es 234,567 156.2
154
pt 123,456 148.9
155
fr 98,765 151.3
156
ja 87,654 67.8
157
...
158
```
159
160
### Schema Exploration
161
162
```sql
163
> DESCRIBE twitterstream;
164
165
COLUMN_NAME DATA_TYPE IS_NULLABLE
166
----------- --------- -----------
167
__time TIMESTAMP NO
168
tweet_id BIGINT YES
169
user_name VARCHAR YES
170
tweet_text VARCHAR YES
171
user_lang VARCHAR YES
172
retweet_count BIGINT YES
173
favorite_count BIGINT YES
174
175
> SELECT * FROM INFORMATION_SCHEMA.COLUMNS
176
WHERE TABLE_NAME = 'twitterstream'
177
LIMIT 5;
178
179
TABLE_SCHEMA TABLE_NAME COLUMN_NAME DATA_TYPE
180
------------ ---------- ----------- ---------
181
druid twitterstream __time TIMESTAMP
182
druid twitterstream tweet_id BIGINT
183
druid twitterstream user_name VARCHAR
184
druid twitterstream tweet_text VARCHAR
185
druid twitterstream user_lang VARCHAR
186
```
187
188
### Time-based Analysis
189
190
```sql
191
> SELECT
192
FLOOR(__time TO HOUR) as hour,
193
COUNT(*) as tweets_per_hour
194
FROM twitterstream
195
WHERE __time >= '2014-03-02'
196
AND __time < '2014-03-03'
197
GROUP BY FLOOR(__time TO HOUR)
198
ORDER BY hour;
199
200
hour tweets_per_hour
201
---- ---------------
202
2014-03-02T00:00:00Z 15,432
203
2014-03-02T01:00:00Z 12,876
204
2014-03-02T02:00:00Z 8,543
205
2014-03-02T03:00:00Z 6,234
206
...
207
```
208
209
### Query Performance Analysis
210
211
```sql
212
> EXPLAIN PLAN FOR
213
SELECT user_name, COUNT(*) as tweet_count
214
FROM twitterstream
215
WHERE user_lang = 'en'
216
GROUP BY user_name
217
ORDER BY tweet_count DESC
218
LIMIT 10;
219
220
[
221
{
222
"query": {
223
"queryType": "groupBy",
224
"dataSource": "twitterstream",
225
"intervals": ["1000-01-01/3000-01-01"],
226
"granularity": "all",
227
"dimensions": ["user_name"],
228
"filter": {
229
"type": "selector",
230
"dimension": "user_lang",
231
"value": "en"
232
},
233
"aggregations": [
234
{
235
"type": "count",
236
"name": "tweet_count"
237
}
238
],
239
"limitSpec": {
240
"type": "default",
241
"limit": 10,
242
"orderBy": [
243
{
244
"dimension": "tweet_count",
245
"direction": "descending",
246
"dimensionOrder": "numeric"
247
}
248
]
249
}
250
}
251
}
252
]
253
```
254
255
### Special Commands
256
257
```sql
258
> help
259
Available commands:
260
- help: Show this help message
261
- quit, exit, bye: Exit the CLI
262
- \d: List all tables
263
- \d <table>: Describe table structure
264
265
> \d
266
Tables:
267
- twitterstream
268
- places
269
- user_profiles
270
271
> \d twitterstream
272
Table: twitterstream
273
Columns:
274
- __time (TIMESTAMP)
275
- tweet_id (BIGINT)
276
- user_name (VARCHAR)
277
- tweet_text (VARCHAR)
278
- user_lang (VARCHAR)
279
- retweet_count (BIGINT)
280
- favorite_count (BIGINT)
281
```
282
283
## Installation and Dependencies
284
285
The CLI requires additional dependencies for full functionality:
286
287
```bash
288
# Install with CLI dependencies
289
pip install pydruid[cli]
290
291
# Or install dependencies manually
292
pip install pygments prompt_toolkit tabulate
293
```
294
295
Dependencies:
296
- **pygments**: Syntax highlighting for SQL
297
- **prompt_toolkit**: Interactive input with autocompletion and history
298
- **tabulate**: Formatted table output for query results
299
300
## Configuration
301
302
The CLI uses several configuration options:
303
304
### History
305
306
Command history is automatically saved to `~/.pydruid_history` and persists across sessions.
307
308
### Syntax Highlighting
309
310
SQL syntax highlighting is enabled by default and includes:
311
- Keywords (SELECT, FROM, WHERE, etc.)
312
- Functions (COUNT, SUM, CONCAT, etc.)
313
- Strings and numeric literals
314
- Comments
315
316
### Output Formatting
317
318
Query results are automatically formatted as tables using the `tabulate` library, with:
319
- Column headers
320
- Aligned columns
321
- Configurable table styles
322
- Support for wide tables with proper wrapping
323
324
### Connection Options
325
326
The CLI supports the same connection options as the database API:
327
- HTTP and HTTPS protocols
328
- Basic authentication
329
- SSL certificate verification
330
- Custom headers and context parameters
331
332
Example with authentication:
333
```bash
334
pydruid https://username:password@druid.example.com:8082/druid/v2/sql/
335
```
336
337
The CLI provides a convenient and user-friendly way to interact with Druid for data exploration, query development, and ad-hoc analysis tasks.