llm-council

Orchestrate a configurable, multi-member CLI planning council (Codex, Claude Code, Gemini, OpenCode, or custom) to produce independent implementation plans, anonymize and randomize them, then judge and merge into one final plan. Use when you need a robust, bias-resistant planning workflow, structured JSON outputs, retries, and failure handling across multiple CLI agents.
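
As an illustration of the "anonymize and randomize" step in the description above, here is a minimal sketch. It is not taken from the skill's scripts: the function name `anonymize_plans`, the "Plan A" style labels, and the dictionary shapes are all assumptions made for this example.

```python
import random


def anonymize_plans(plans, seed=None):
    """Shuffle plans and replace agent names with neutral labels.

    `plans` maps a (hypothetical) agent name to its plan text.
    Returns (labeled, key): `labeled` maps a label such as "Plan A" to the
    plan text, and `key` maps the same label back to the agent name so the
    final result can be de-anonymized after judging.
    Supports up to 26 plans because labels use the letters A-Z.
    """
    rng = random.Random(seed)  # seed only so that runs can be reproduced
    agents = list(plans)
    rng.shuffle(agents)  # randomize order so position reveals nothing

    labeled = {}
    key = {}
    for index, agent in enumerate(agents):
        label = f"Plan {chr(ord('A') + index)}"
        labeled[label] = plans[agent]
        key[label] = agent
    return labeled, key
```

Under these assumptions, a judge would see only `labeled`, while `key` stays with the orchestrator until a final plan has been chosen.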

Quality

—

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Advisory

Suggest reviewing before use

Quality

Content

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

A well-organized, actionable skill body with a sequenced workflow, validation/retry checkpoints, and a clean one-level-deep reference structure backed by real bundle files. The only weakness is duplicated, verbose session-management instructions that pad the token budget.

Suggestions

State the 30-minute session-management rule once (e.g. only in Constraints) and trim the verbose 'Note on Session Management' to a single line, removing the duplicate in workflow step 7.

Collapse the repeated 'DO NOT yield/finish the response until a full 30-minute timer has completed' phrasing into one concise checkpoint.

Dimension	Reasoning	Score
Conciseness	The body is mostly efficient and assumes Claude's competence, but the 30-minute session-management instruction is stated twice (workflow step 7 and the Constraints bullet) and repeated again in a verbose 'Note on Session Management'. This repetition could be tightened, so the body falls short of the top anchor, where every token earns its place.	2 / 3
Actionability	Provides concrete copy-paste-ready commands ('python3 scripts/llm_council.py run --spec /path/to/spec.json', 'python3 scripts/llm_council.py configure') and a complete JSON agent-configuration example, matching the fully-executable anchor.	3 / 3
Workflow Clarity	A clearly sequenced 7-step workflow with explicit validation checkpoints ('validate Markdown structure, and retry up to 2 times on failure. If any agent fails, yield and alert the user') and a retry feedback loop, matching the clear-sequence-with-validation anchor.	3 / 3
Progressive Disclosure	The body is an overview with a dedicated References section pointing one level deep to real bundle files (architecture.md, prompts.md, cli-notes.md, templates/*.md, task-spec.example.json) and the scripts, all confirmed present; content is appropriately split with easy navigation.	3 / 3
	Total	11 / 12 Passed
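
The validation-and-retry checkpoint quoted in the Workflow Clarity row can be sketched as follows. This is a hedged illustration, not the skill's implementation: `run_with_retries`, `has_required_headings`, and the required headings are hypothetical, and "retry up to 2 times" is read here as one initial attempt plus at most two retries.

```python
def run_with_retries(generate, validate, max_retries=2):
    """Call generate(attempt) and retry if validation fails.

    `generate` and `validate` are hypothetical callables supplied by the
    caller. `validate` returns (ok, error_message). Raises RuntimeError once
    the initial attempt and all retries have failed, which is the point at
    which the caller would alert the user.
    """
    attempts = max_retries + 1  # assumption: first try plus two retries
    last_error = None
    for attempt in range(1, attempts + 1):
        output = generate(attempt)
        ok, error = validate(output)
        if ok:
            return output
        last_error = error
    raise RuntimeError(f"agent failed after {attempts} attempts: {last_error}")


def has_required_headings(markdown, required=("# Plan", "## Steps")):
    """Check that each required heading appears on its own line.

    The heading names are placeholders; the skill's real template may differ.
    """
    lines = markdown.splitlines()
    missing = [heading for heading in required if heading not in lines]
    if missing:
        return False, f"missing headings: {missing}"
    return True, None
```

Passing the attempt number to `generate` lets a caller include the previous validation error in a retry prompt, which is one way to build the feedback loop the review mentions.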

Description

85%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A specific, third-person description that clearly states both capabilities and an explicit use-trigger with a distinct niche. The main weakness is trigger-term quality: the 'Use when' phrasing leans on internal jargon rather than language users naturally say.

Suggestions

Reword the 'Use when' clause with natural user phrases, e.g. 'Use when you want to compare implementation plans from multiple AI coding agents (Codex, Claude Code, Gemini) before picking one'.

Add common trigger variations users would actually say, such as 'planning', 'compare plans', 'multiple AI agents', 'second opinion on a plan'.

Dimension	Reasoning	Score
Specificity	Names multiple concrete actions — 'produce independent implementation plans, anonymize and randomize them, then judge and merge into one final plan' — alongside a concrete tool set (Codex, Claude Code, Gemini, OpenCode), matching the 'lists multiple specific concrete actions' anchor.	3 / 3
Completeness	Explicitly answers both what (orchestrate/produce/anonymize/judge/merge) and when ('Use when you need a robust, bias-resistant planning workflow...'), satisfying the clearly-answers-both-what-and-when anchor.	3 / 3
Trigger Term Quality	The 'Use when' clause relies on internal jargon like 'bias-resistant planning workflow' and 'CLI agents' rather than phrases a user would naturally say (e.g. 'compare plans from multiple AI models'). It has some relevant keywords but misses common variations, so it falls short of the level-3 anchor for natural trigger-term coverage.	2 / 3
Distinctiveness / Conflict Risk	The 'configurable, multi-member CLI planning council' niche with named agents is distinctive and unlikely to trigger for unrelated skills, matching the clear-niche anchor.	3 / 3
	Total	11 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 16 / 16 Passed

Validation for skill structure

No warnings or errors.

Repository: am-will/codex-skills
Commit: e343715

Reviewed: 14 days ago
