debug-clickhouse-queries

Investigate query evaluation failures in the Knowledge Graph synthetic data pipeline. Use when queries fail or return unexpected results after running the evaluate binary.

Quality

80%

Does it follow best practices?

Impact

84%

1.10x

Average score across 3 eval scenarios

Security

Passed

No known issues

Fix and improve this skill with Tessl

tessl review fix ./.claude/skills/debug-clickhouse-queries/SKILL.md

Quality

Content

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The content is a strong, actionable debugging guide with executable commands, real source paths, and explicit checkpoints via the numbered checklist. Its main weakness is length: several near-duplicate SQL examples and some explanatory prose could be trimmed for token efficiency.

Suggestions

Consolidate the repeated GROUP BY / count(*) diagnostic SQL snippets, which appear in several sections, into one 'data diagnostics' block and reference it from those sections, trimming the ~250-line body toward a leaner form.

Add an explicit validation checkpoint after the regenerate sequence (e.g., 'Re-run evaluate and confirm the previously empty query now returns rows before considering the fix complete') to make the feedback loop fully explicit.
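
The two suggestions above could be sketched roughly as follows. This is a minimal illustration, not the skill's actual content: the helper names `diag_sql` and `require_rows`, the `LIMIT 20` cutoff, and any table or column names passed in are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the LIMIT value are assumptions,
# not taken from the reviewed SKILL.md.

# Print the single shared "value distribution" diagnostic for a
# table/column pair, so sections can reference it instead of repeating SQL.
diag_sql() {
  local table="$1" column="$2"
  printf 'SELECT %s, count(*) AS n FROM %s GROUP BY %s ORDER BY n DESC LIMIT 20\n' \
    "$column" "$table" "$column"
}

# Explicit checkpoint: succeed only when the row count is a positive integer.
require_rows() {
  local count="$1"
  case "$count" in
    ''|*[!0-9]*)
      echo "checkpoint: not a row count: '$count'" >&2
      return 2
      ;;
  esac
  if [ "$count" -eq 0 ]; then
    echo "checkpoint: query still returns no rows" >&2
    return 1
  fi
  echo "checkpoint: $count rows returned"
}
```

The generated statement can be passed to `clickhouse-client --query`, and `require_rows` can be given the row count observed after re-running evaluate, so the fix is only treated as complete when the previously empty query returns rows.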

Dimension	Reasoning	Score
Conciseness	The body is operationally dense and assumes Claude's competence (no basic-concept padding), but at ~250 lines it repeats similar SQL diagnostics and prose that could be tightened, fitting the 'mostly efficient but could be tightened' anchor.	2 / 3
Actionability	Fully executable guidance throughout: copy-paste clickhouse-client queries, concrete 'cargo run -p orbit -- query' invocations with real flags, exact source file paths, and a regenerate command sequence.	3 / 3
Workflow Clarity	A clear debugging flow with decision logic for distinguishing bug types, hypothesis-testing branches, and a numbered 7-item checklist that provides explicit checkpoints. The only gap is that re-validation after the destructive 'synth evaluate' regenerate step is implicit rather than explicit, a gap small enough that the score stays at 3 rather than being capped lower.	3 / 3
Progressive Disclosure	No bundle files exist; the single SKILL.md is well-organized into clearly headed sections with a terminal checklist, which per the rubric's simple-skill guidance earns a 3 without external references.	3 / 3
	Total	11 / 12 Passed

Description

75%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description cleanly answers both what and when with a distinct, niche trigger, but its action list is narrow and trigger terms are tightly bound to one binary name. Strengthening it is mostly a matter of broader action and keyword coverage.

Suggestions

List multiple concrete investigative actions (e.g., inspect generated SQL, compare sampled vs. global data, check enum/edge configuration) to lift specificity to 3.

Add common user-facing variations of trigger terms (e.g., 'evaluate report', 'empty results', 'query returned nothing', 'synthetic data pipeline') beyond the single 'evaluate binary' phrasing.
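
One way to act on the trigger-term suggestion is to check a candidate description against the phrases users are likely to type. The sketch below assumes a hypothetical helper named `missing_terms`; it is not part of Tessl or the reviewed skill.

```shell
#!/usr/bin/env bash
# Sketch only: missing_terms is a hypothetical helper.

# Print every trigger term that does NOT occur in the description.
# Matching is case-insensitive and literal (no regex interpretation).
missing_terms() {
  local description="$1"
  shift
  local term
  for term in "$@"; do
    if ! grep -qiF -- "$term" <<<"$description"; then
      printf '%s\n' "$term"
    fi
  done
}

# The description as currently published, checked against some suggested terms.
current='Use when queries fail or return unexpected results after running the evaluate binary.'
missing_terms "$current" 'evaluate binary' 'empty results' 'query returned nothing'
```

Any phrase the helper prints is absent from the description and is a candidate to add.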

Dimension	Reasoning	Score
Specificity	Names the domain and a concrete action ('Investigate query evaluation failures in the Knowledge Graph synthetic data pipeline'), but lists only one investigative action rather than a comprehensive set of distinct concrete actions, matching the score-2 anchor.	2 / 3
Completeness	Explicitly states what it does (investigate query evaluation failures) and when to use it ('Use when queries fail or return unexpected results after running the evaluate binary'), clearly answering both what and when with an explicit trigger.	3 / 3
Trigger Term Quality	Includes natural phrases a user would say ('queries fail', 'return unexpected results', 'after running the evaluate binary'), but coverage is tied to one binary name and lacks common variations, fitting the 'some relevant keywords but missing common variations' anchor.	2 / 3
Distinctiveness Conflict Risk	The 'Knowledge Graph synthetic data pipeline' niche plus the 'evaluate binary' trigger is a clear, narrow domain unlikely to activate for unrelated skills.	3 / 3
	Total	10 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 16 / 16 Passed

Validation for skill structure

No warnings or errors.

Repository: gitlabhq/orbit-knowledge-graph
Commit: 78fa8a1

Reviewed: about 11 hours ago
