llm-council

Orchestrate a configurable, multi-member CLI planning council (Codex, Claude Code, Gemini, OpenCode, or custom) to produce independent implementation plans, anonymize and randomize them, then judge and merge into one final plan. Use when you need a robust, bias-resistant planning workflow, structured JSON outputs, retries, and failure handling across multiple CLI agents.

Quality

71%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Advisory

Suggest reviewing before use

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/llm-council/SKILL.md

Quality

Discovery

85%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong description that clearly articulates a unique, specific capability with explicit trigger guidance. It covers both what the skill does and when to use it, and its niche is distinctive enough to avoid conflicts. The main weakness is that trigger terms lean toward technical jargon that users may not naturally use when requesting this kind of workflow.

Suggestions

Add more natural trigger terms users might say, such as 'multi-agent planning', 'compare plans from different AI models', 'blind evaluation of proposals', or 'council of experts'.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: orchestrate a multi-member CLI planning council, produce independent implementation plans, anonymize and randomize them, judge and merge into one final plan. Also mentions structured JSON outputs, retries, and failure handling.	3 / 3
Completeness	Clearly answers both what ('orchestrate a configurable, multi-member CLI planning council to produce independent implementation plans, anonymize and randomize them, then judge and merge into one final plan') and when ('Use when you need a robust, bias-resistant planning workflow, structured JSON outputs, retries, and failure handling across multiple CLI agents').	3 / 3
Trigger Term Quality	Includes some relevant keywords like 'planning council', 'Codex', 'Claude Code', 'Gemini', 'OpenCode', 'CLI agents', 'implementation plans', but these are fairly specialized terms. Users might more naturally say 'plan comparison', 'multi-agent planning', or 'council of experts' — some common variations are missing.	2 / 3
Distinctiveness Conflict Risk	This is a very specific niche — multi-agent CLI planning councils with anonymization and merging. It is highly unlikely to conflict with other skills due to its unique combination of features and specific tool names.	3 / 3
	Total	11 / 12 Passed

Implementation

57%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides a reasonable overview of a complex multi-agent planning workflow with good structural organization and clear references to supporting files. Its main weaknesses are redundancy (session management and intake questions stated twice) and insufficient actionability in the core workflow steps — the actual mechanics of prompt building, anonymization, and judging lack concrete executable examples. The workflow would benefit from explicit validation checkpoints between critical steps.

Suggestions

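Remove duplicate content: state the 30-minute session management rule once (it currently appears in both Workflow step 7 and Constraints) and the intake question guidance once (it currently appears in both Quick Start and Workflow) to reduce token waste.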

Add concrete, executable examples for key workflow steps — especially prompt template generation, anonymization, and judge invocation — rather than relying entirely on references.

Add explicit validation checkpoints between workflow steps (e.g., 'Verify all planner outputs exist and are valid Markdown before proceeding to anonymization').
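As an illustration of the validation checkpoint suggested above, the following is a minimal sketch in Python, not code from the skill itself. The function name `validate_planner_outputs` and the rule that "valid Markdown" means "non-empty and containing at least one heading" are assumptions made for this example; the skill's actual structure check may differ.

```python
from pathlib import Path


def validate_planner_outputs(paths):
    """Checkpoint before anonymization (hypothetical helper).

    Returns a list of (path, reason) tuples. An empty list means every
    planner output exists, is non-empty, and has at least one Markdown
    heading, which is the assumed stand-in for "valid Markdown" here.
    """
    problems = []
    for path in map(Path, paths):
        if not path.is_file():
            problems.append((str(path), "missing"))
            continue
        text = path.read_text(encoding="utf-8").strip()
        if not text:
            problems.append((str(path), "empty"))
        elif not any(line.lstrip().startswith("#") for line in text.splitlines()):
            problems.append((str(path), "no Markdown heading"))
    return problems
```

A workflow step could call this helper and stop, or retry the affected planners, whenever the returned list is non-empty, so that the judge never receives a missing or malformed plan.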

Dimension	Reasoning	Score
Conciseness	The content is mostly efficient but has some redundancy — the 30-minute session management rule is stated twice (in Workflow step 7 and in Constraints), and the intake question guidance is repeated across Quick Start and Workflow. Some explanatory notes could be tightened, but overall it doesn't over-explain concepts Claude already knows.	2 / 3
Actionability	Provides concrete CLI commands (`python3 scripts/llm_council.py run --spec /path/to/spec.json`) and a full JSON config example, which is good. However, the workflow steps are more procedural descriptions than fully executable guidance — there's no concrete code for building planner prompts, collecting outputs, anonymizing, or running the judge. Key operational details are deferred to reference files without inline examples.	2 / 3
Workflow Clarity	The 7-step workflow provides a clear sequence, and step 4 includes a retry mechanism with failure handling. However, validation checkpoints are weak — there's no explicit 'verify X before proceeding to Y' pattern beyond step 4's Markdown structure check. The session management instruction in step 7 is important but mixes operational concerns with workflow steps, reducing clarity.	2 / 3
Progressive Disclosure	The skill has a clean structure with Quick Start, Workflow, Agent Configuration, References, and Constraints sections. References are clearly signaled and one level deep (architecture.md, prompts.md, templates/*.md, cli-notes.md). Content is appropriately split between the overview and reference files.	3 / 3
	Total	9 / 12 Passed
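
The Actionability row notes that the skill has no concrete code for anonymizing plans. A minimal sketch of what an inline example might look like follows. It assumes the plans are held in a dictionary keyed by council member name; the function name `anonymize_plans` and the `Plan N` labels are hypothetical and are not taken from `scripts/llm_council.py`.

```python
import random


def anonymize_plans(plans, seed=None):
    """Shuffle plans and relabel them so the judge cannot see authorship.

    plans: dict mapping member name -> plan text (assumed shape).
    Returns (anonymized, key):
      anonymized maps a neutral label -> plan text, in shuffled order;
      key maps the same label -> member name and is withheld from the judge.
    """
    rng = random.Random(seed)  # a seed makes the ordering reproducible
    members = list(plans)
    rng.shuffle(members)

    anonymized, key = {}, {}
    for index, member in enumerate(members, start=1):
        label = f"Plan {index}"
        anonymized[label] = plans[member]
        key[label] = member
    return anonymized, key
```

Only `anonymized` would be passed to the judge; `key` lets the orchestrator map the verdict back to members afterwards. This sketch relabels and reorders the plans but does not scrub member names that appear inside the plan text itself.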

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: am-will/codex-skills
Commit: d3983b1

Reviewed: 24 days ago
