CLI tools execution specification (gemini/claude/codex/qwen/opencode) with unified prompt template, mode options, and auto-invoke triggers for code analysis and implementation tasks. Supports configurable CLI endpoints for analysis, write, and review modes.
Overall quality score: 46%

Eval scenarios: Pending — no eval scenarios have been run.
Issues: Passed — no known issues.

Optimize this skill with Tessl: npx tessl skill review --optimize ./.codex/skills/ccw-cli-tools/SKILL.md

Quality
Discovery
50%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description identifies a specific domain (multi-LLM CLI tool orchestration) and names concrete tools, which helps with identification. However, it reads more like an internal architecture spec than a skill description—heavy on implementation details ('unified prompt template', 'configurable CLI endpoints') and lacking an explicit 'Use when...' clause that would help Claude know when to select it.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when the user asks to run gemini, claude, codex, qwen, or opencode CLI commands for code analysis, writing, or review.'
Replace internal jargon ('unified prompt template', 'auto-invoke triggers', 'configurable CLI endpoints') with user-facing language describing what the skill actually does for the user.
Include natural trigger terms users might say, such as 'run codex', 'ask gemini', 'code review with claude', 'multi-model comparison', etc.
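Folding these suggestions together, a revised description might look like the following. This is a hypothetical sketch, not the skill's actual frontmatter; the skill name is inferred from the path shown in the optimize command.

```yaml
---
name: ccw-cli-tools
description: >
  Run external LLM CLI tools (gemini, claude, codex, qwen, opencode) for code
  analysis, writing, and review tasks. Use when the user asks to run codex,
  ask gemini, get a code review from claude, or compare answers across models.
---
```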
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (CLI tools execution) and lists some actions (code analysis, implementation tasks, analysis/write/review modes), but the description is heavy on architectural jargon ('unified prompt template', 'configurable CLI endpoints') rather than concrete user-facing actions. | 2 / 3 |
| Completeness | The 'what' is partially addressed (CLI tools execution with modes), but there is no explicit 'Use when...' clause or equivalent trigger guidance telling Claude when to select this skill. The 'auto-invoke triggers' phrase hints at when but doesn't spell it out. | 2 / 3 |
| Trigger Term Quality | Includes some natural keywords like 'gemini', 'claude', 'codex', 'qwen', 'opencode', 'code analysis', and 'review', but many terms are technical/internal ('auto-invoke triggers', 'unified prompt template', 'configurable CLI endpoints') rather than what users would naturally say. | 2 / 3 |
| Distinctiveness / Conflict Risk | The mention of specific CLI tool names (gemini, claude, codex, qwen, opencode) provides some distinctiveness, but 'code analysis and implementation tasks' is broad enough to overlap with many coding-related skills. | 2 / 3 |
| Total | | 8 / 12 (Passed) |
Implementation
42%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides highly actionable, concrete CLI guidance with executable examples and a well-structured prompt template system. However, it is severely undermined by extreme verbosity and redundancy—configuration loading is explained three times, mode permissions are restated repeatedly, and the entire document could be cut by 50%+ without losing information. The lack of bundle files means everything is crammed into one massive document with poor progressive disclosure.
Suggestions
Eliminate redundancy: consolidate the configuration loading instructions into a single section instead of repeating them in Initialization, Configuration Reference, and Tool Selection Strategy.
Extract the detailed prompt field specifications (PURPOSE, TASK, MODE, CONTEXT, EXPECTED, CONSTRAINTS with good/bad examples) into a separate PROMPT-TEMPLATE.md reference file.
Move the full task-type examples and rule template catalog into separate bundle files (e.g., EXAMPLES.md, RULES.md) and reference them from the main skill.
Add explicit validation checkpoints: define what constitutes execution failure, how to verify output quality, and when to proceed vs. retry in the fallback chain.
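The last suggestion can be made concrete with an explicit checkpoint in the fallback chain. The sketch below is illustrative only: the tool names come from the skill's own list, but the `-p` prompt flag and the failure criteria (non-zero exit or empty output) are assumptions, not the skill's documented behavior.

```shell
#!/usr/bin/env bash
# Sketch of a fallback chain with an explicit failure checkpoint:
# a tool "fails" if it is missing, exits non-zero, or prints nothing.
run_with_fallback() {
  local prompt="$1"; shift
  local tool output
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || continue   # not installed: skip
    if output="$("$tool" -p "$prompt" 2>/dev/null)" && [ -n "$output" ]; then
      printf '%s\n' "$output"
      return 0
    fi
  done
  echo "error: all CLI tools failed for this prompt" >&2
  return 1
}

# Example (hypothetical invocation):
# run_with_fallback "Analyze the auth module for race conditions" gemini claude codex
```

Defining "failure" this explicitly removes the ambiguity the review flags: the agent knows exactly when to retry with the next tool and when to surface an error instead of acting on empty output.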
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~400+ lines with significant redundancy. The configuration loading instructions are repeated at least 3 times (Initialization, Configuration Reference, Tool Selection Strategy). The Configuration Fields section duplicates the table above it. Concepts like mode permissions are restated multiple times. Many sections explain things Claude would already understand (e.g., what read-only means, basic fallback logic). | 1 / 3 |
| Actionability | Provides fully executable CLI commands with concrete examples for each use case. The 6-field prompt template is well-specified with good/bad examples. Command options are documented with specific flags and values. The task-type examples are copy-paste ready with realistic parameters. | 3 / 3 |
| Workflow Clarity | The process flow diagram and decision tree provide clear sequencing, and the fallback chain is well-defined. However, there are no explicit validation checkpoints after CLI execution (e.g., how to verify output quality, what constitutes a 'failure' triggering fallback). The auto-invoke triggers lack verification steps to confirm the analysis results before acting on them. | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of text with no bundle files to offload content to. The rule templates list, configuration reference, all examples, and the full prompt template specification are all inline. Content like the detailed prompt field specifications, the full examples section, and the rule template catalog should be in separate referenced files. References to external files (cli-tools.json, CLAUDE.md, cli-tools-usage.md) exist but no bundle files support them. | 1 / 3 |
| Total | | 7 / 12 (Passed) |
Validation
81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (560 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 (Passed) |
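The frontmatter warning is typically resolved by nesting non-standard keys under a metadata block rather than leaving them at the top level. A sketch of the shape, with placeholder keys, since the report does not say which keys triggered the warning:

```yaml
---
name: ccw-cli-tools
description: Run gemini, claude, codex, qwen, or opencode CLIs for code analysis, writing, and review.
metadata:
  auto_invoke: true           # hypothetical key, moved down from the top level
  config_file: cli-tools.json # hypothetical key referencing the skill's config
---
```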
Commit: 227244f