Profile GKG queries against ClickHouse with the query-profiler CLI. For optimizing query performance, comparing query plans, investigating slow queries, or checking ClickHouse resource usage.
68
83%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Passed
No known issues
Run GKG JSON DSL queries directly against any ClickHouse instance and get back execution stats, EXPLAIN plans, CPU/memory profiling, and instance health. No gRPC server or Rails needed.
mise buildSet connection via env vars or a .env file:
export CLICKHOUSE_URL=http://localhost:8123
export CLICKHOUSE_DATABASE=gkg_graph
export CLICKHOUSE_USER=<user>
export CLICKHOUSE_PASSWORD=<password>Or use CLI flags: --ch-url, --ch-database, --ch-user, --ch-password.
Basic search with stats:
mise query:profile -- \
-t '1/' \
'{"query_type":"search","node":{"id":"p","entity":"Project","columns":["name"]},"limit":5}'With EXPLAIN plans:
mise query:profile -- \
-t '1/' --explain \
'{"query_type":"traversal","nodes":[{"id":"mr","entity":"MergeRequest"},{"id":"p","entity":"Project"}],"relationships":[{"type":"IN_PROJECT","from":"mr","to":"p"}],"limit":10}'With CPU/memory profiling from system.query_log:
mise query:profile -- \
-t '1/' --explain --profile --processors \
'{"query_type":"aggregation","nodes":[{"id":"mr","entity":"MergeRequest"},{"id":"p","entity":"Project"}],"relationships":[{"type":"IN_PROJECT","from":"mr","to":"p"}],"aggregations":[{"function":"count","target":"mr","group_by":"p","alias":"mr_count"}],"limit":10}'With instance health snapshot:
mise query:profile -- -t '1/' --health '{"query_type":"search","node":{"id":"p","entity":"Project"},"limit":1}'Multiple traversal paths:
mise query:profile -- -t '1/2/3/' -t '1/4/5/' '{"query_type":"search","node":{"id":"p","entity":"Project"},"limit":5}'Query from file:
mise query:profile -- -t '1/' --explain @fixtures/queries/my_query.jsonBy default output is pretty-printed JSON to stdout. Use --format json for compact JSON.
Write results to a file with --output (-o). Parent directories are created automatically:
mise query:profile -- -t '1/' --explain -o fixtures/profiling/results.json @fixtures/queries/my_query.jsonWhen --output is used, the profiler writes the JSON file and prints the path to stderr. When omitted, results go to stdout. Progress and errors always go to stderr, so they never mix with JSON output.
Result files go in fixtures/profiling/ (gitignored).
Use --output so results are written to disk when the run finishes, then read the output file to analyze results:
mise query:profile -- -t '1/' --explain --profile \
-o fixtures/profiling/showcase.json \
@fixtures/queries/optimization_showcase.json
# Then read fixtures/profiling/showcase.json to analyzeThe JSON output has these sections:
query -- the original query JSONsecurity_context -- org_id and traversal_pathscompilation -- parameterized SQL, rendered SQL, hydration planexecutions -- per-query metrics (1 base + N hydration queries), each with:
label -- "base" or "hydration:Project", etc.query_id -- ClickHouse query ID for system table correlationstats -- read_rows, read_bytes, result_rows, result_bytes, elapsed_ns, memory_usageexplain_plan -- EXPLAIN PLAN with index usage (with --explain)explain_pipeline -- EXPLAIN PIPELINE processor graph (with --explain)query_log -- CPU times, cache stats, ProfileEvents (with --profile)processors -- per-processor pipeline breakdown (with --processors)summary -- totals across all queriesinstance_health -- server health (with --health)The main optimization targets:
read_rows -- total rows scanned. This is the number to reduce.read_bytes -- data volume read from disk/cache.memory_usage -- RAM consumed per query. Watch for spills.elapsed_ns -- server-side wall clock time.With --profile, you also get:
selected_parts / selected_marks -- how well the primary key prunes datamark_cache_hits / mark_cache_misses -- mark cache hit ratereal_time_us vs user_time_us -- tells you if a query is CPU-bound or I/O-boundos_io_wait_us -- time spent waiting on diskexternal_sort_bytes / external_agg_bytes -- nonzero means memory spilled to diskWith --processors:
input_wait_us means a processor is starved for input (upstream is the bottleneck)output_wait_us means a processor is blocked on output (downstream is the bottleneck)With --health:
active_merges -- background merges compete with queries for I/Otemp_files_* -- active memory spills across the servertable_parts -- high part count per table means fragmentation, which slows readsThe profiler can run all queries from a file where keys are query names and values are query objects. Queries are executed sequentially so profiling numbers are not polluted by concurrent load.
mise query:profile -- -t '1/' --explain @fixtures/queries/optimization_showcase.jsonFilter to a subset by name substring:
mise query:profile -- -t '1/' --explain --filter aggregation @fixtures/queries/optimization_showcase.jsonWrite multi-query results to a file:
mise query:profile -- -t '1/' --explain \
-o fixtures/profiling/showcase.json \
@fixtures/queries/optimization_showcase.jsonProgress is printed to stderr ([3/25] query_name...). Output is a JSON object keyed by query name, each value being the standard profiler output. Queries that fail are recorded with an {"error": "..."} value and the run continues with the rest.
mise query:profile -- -t '1/' --explain -o fixtures/profiling/before.json QUERYmise buildmise query:profile -- -t '1/' --explain -o fixtures/profiling/after.json QUERYmise query:diff -- fixtures/profiling/before.json fixtures/profiling/after.json --labels before,aftermise query:diff compares two or more profiler result files and produces a markdown table.
Two-way comparison (default metric is read_rows):
mise query:diff -- fixtures/profiling/baseline.json fixtures/profiling/dedup.json --labels baseline,dedupCompare memory usage:
mise query:diff -- fixtures/profiling/baseline.json fixtures/profiling/dedup.json --labels baseline,dedup --metric memoryAll metrics in separate tables:
mise query:diff -- fixtures/profiling/baseline.json fixtures/profiling/dedup.json --labels baseline,dedup --all-metricsN-way comparison (3+ files):
mise query:diff -- fixtures/profiling/v1.json fixtures/profiling/v2.json fixtures/profiling/v3.json --labels v1,v2,v3CSV output for spreadsheets:
mise query:diff -- fixtures/profiling/baseline.json fixtures/profiling/dedup.json --format csvAvailable metrics: read_rows (default), read_bytes, memory, elapsed_ms.
--profile runs SYSTEM FLUSH LOGS before querying system.query_log, which adds about 100ms.--profile and --health require the ClickHouse user to have SELECT on system tables.json (compact). Use --format pretty for human-readable output.f5efc36
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.