CLI tools execution specification (gemini/claude/codex/qwen/opencode) with unified prompt template, mode options, and auto-invoke triggers for code analysis and implementation tasks. Supports configurable CLI endpoints for analysis, write, and review modes.
Quality: 46% — Does it follow best practices?
Impact: Pending — no eval scenarios have been run.
Validation: Passed — no known issues.

To optimize this skill with Tessl:

    npx tessl skill review --optimize ./.codex/skills/ccw-cli-tools/SKILL.md

Quality
Discovery — 50%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is moderately informative but leans heavily on implementation jargon rather than user-facing language. It lacks an explicit 'Use when...' clause, which caps completeness. The specific CLI tool names provide some distinctiveness, but the broad scope of 'code analysis and implementation tasks' creates overlap risk with other coding skills.
Suggestions
- Add an explicit 'Use when...' clause, e.g., 'Use when the user wants to run gemini, claude, codex, qwen, or opencode CLI tools for code analysis, writing, or review.'
- Replace jargon like 'unified prompt template', 'auto-invoke triggers', and 'configurable CLI endpoints' with concrete user-facing actions such as 'Runs external AI CLI tools to analyze code, generate implementations, or review changes.'
- Include natural trigger terms a user might say, such as 'run gemini on this code', 'use codex to review', 'invoke claude CLI', or 'external AI tool'.
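Taken together, these suggestions could yield frontmatter along the following lines. This is a hypothetical sketch: the skill's actual frontmatter keys are not reproduced in this report, and the wording is assembled from the suggestions above.

```
# Hypothetical rewrite of the skill description — field names assumed.
name: ccw-cli-tools
description: >
  Runs external AI CLI tools (gemini, claude, codex, qwen, opencode) to
  analyze code, generate implementations, or review changes. Use when the
  user asks to "run gemini on this code", "use codex to review", "invoke
  claude CLI", or otherwise wants an external AI tool for analysis,
  writing, or review.
```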
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (CLI tools execution) and lists some actions (code analysis, implementation tasks, analysis/write/review modes), but the description is heavy on architectural jargon ('unified prompt template', 'configurable CLI endpoints', 'auto-invoke triggers') rather than concrete user-facing actions. | 2 / 3 |
| Completeness | The 'what' is partially addressed (CLI tools execution with modes), but there is no explicit 'Use when...' clause or equivalent trigger guidance. The 'when' is only vaguely implied through 'auto-invoke triggers for code analysis and implementation tasks', which does not constitute explicit trigger guidance. | 2 / 3 |
| Trigger Term Quality | Includes some relevant tool names (gemini, claude, codex, qwen, opencode) and terms like 'code analysis' and 'review modes', but many terms are implementation-specific jargon ('unified prompt template', 'auto-invoke triggers', 'configurable CLI endpoints') rather than natural phrases a user would say. | 2 / 3 |
| Distinctiveness / Conflict Risk | The specific CLI tool names (gemini, claude, codex, qwen, opencode) provide some distinctiveness, but 'code analysis and implementation tasks' is extremely broad and could overlap with many coding-related skills. The modes (analysis, write, review) are also generic enough to conflict with other skills. | 2 / 3 |
| Total | | 8 / 12 — Passed |
Implementation — 42%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides highly actionable, concrete CLI commands and a well-structured prompt template system, which is its primary strength. However, it suffers severely from redundancy (configuration loading repeated 4+ times, fields documented twice) and monolithic structure that could easily be split into overview + reference files. The document is roughly 3-4x longer than necessary due to repetition and inline reference material.
Suggestions
- Eliminate redundant sections: consolidate configuration loading into a single brief mention in Initialization, removing duplicates from Process Flow, Configuration Reference, and Tool Selection Strategy.
- Extract reference material into separate files: move the full rule templates list to RULES.md, configuration details to CONFIG.md, and detailed examples to EXAMPLES.md, with one-line links from the main skill.
- Remove the duplicated configuration fields listing (it appears as both a table and a bullet list in the same Configuration Reference section).
- Add explicit validation/verification steps after CLI execution: how to check if output is valid, what error patterns trigger fallback, and how to verify the fallback succeeded.
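The last suggestion — explicit validation and fallback verification after CLI execution — could be sketched as follows. This is a minimal sketch only: the tool names, invocation syntax, and failure criteria are assumptions, not the skill's actual interface, which may need richer error-pattern checks.

```shell
# Minimal sketch of post-execution validation with a fallback chain.
# A run "fails" here on non-zero exit status or empty output.
run_with_fallback() {
  prompt="$1"; shift
  for tool in "$@"; do                      # tools in preferred order
    if output=$("$tool" "$prompt" 2>/dev/null) && [ -n "$output" ]; then
      printf '%s\n' "$output"               # output looks valid: stop here
      return 0
    fi
  done
  echo "all CLI tools in the fallback chain failed" >&2
  return 1
}
```

Called as, say, `run_with_fallback "review this diff" gemini codex opencode`, it also verifies that the fallback itself succeeded, returning a non-zero status when every tool in the chain fails.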
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose with significant redundancy. The configuration loading instructions are repeated at least 4 times (Initialization, Process Flow, Configuration Reference, Tool Selection Strategy). The configuration fields table is duplicated. Many sections explain things Claude would already understand (e.g., what JSON fields mean, basic fallback logic). The document is well over 300 lines when it could be under 100. | 1 / 3 |
| Actionability | Provides fully executable CLI commands with concrete examples for each mode (security analysis, feature implementation, bug diagnosis, code review). The 6-field prompt template is specific and copy-paste ready, with good/bad examples for each field. Command options are clearly documented with examples. | 3 / 3 |
| Workflow Clarity | The process flow diagram and decision tree provide clear sequencing, and the fallback chain is well-defined. However, there are no explicit validation checkpoints after CLI execution (e.g., how to verify output quality, what constitutes a failure that triggers fallback). The auto-invoke triggers lack verification steps for whether the invocation succeeded or produced useful results. | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of text with no references to external files for detailed content. The rule templates section lists 15+ templates with no links to their definitions. The configuration structure, examples, and all reference material are inlined, making this extremely long. Content like the full template list, detailed examples, and configuration reference should be in separate files. | 1 / 3 |
| Total | | 7 / 12 — Passed |
Validation — 81% (9 / 11 checks passed)

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (560 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 Passed |
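The two warnings above echo the progressive-disclosure suggestions from the Implementation section. One way the split could look — file names taken from those suggestions, the layout itself hypothetical:

```
.codex/skills/ccw-cli-tools/
├── SKILL.md          # overview and links, well under the current 560 lines
└── references/
    ├── RULES.md      # full rule-template list
    ├── CONFIG.md     # configuration fields and loading
    └── EXAMPLES.md   # detailed per-mode examples
```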
Version: 0f8e801
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.