codex

Use when the user asks to run Codex CLI (codex exec, codex resume) or references OpenAI Codex for code analysis, refactoring, or automated editing. Uses GPT-5.2 by default for state-of-the-art software engineering.

Quality

83%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Quality

Discovery

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a solid description with excellent trigger terms and completeness. It specifies when Claude should select this skill through explicit command names and a 'Use when' clause. The main weakness is that the capability description stays at the generic level of 'code analysis, refactoring, or automated editing' rather than naming the concrete actions the skill performs. The mention of GPT-5.2 adds useful context.

Suggestions

Add more specific concrete actions the skill performs, e.g., 'Runs Codex CLI to generate code patches, apply multi-file edits, analyze codebases, and refactor functions' instead of the generic 'code analysis, refactoring, or automated editing'.

Dimension	Reasoning	Score
Specificity	Names the domain (Codex CLI) and some actions (code analysis, refactoring, automated editing), but doesn't list multiple concrete specific actions—'code analysis, refactoring, or automated editing' is somewhat generic and not comprehensive about what the skill actually does step-by-step.	2 / 3
Completeness	Explicitly answers both 'what' (runs Codex CLI for code analysis, refactoring, automated editing using GPT-5.2) and 'when' (when user asks to run Codex CLI commands or references OpenAI Codex) with a clear 'Use when...' clause.	3 / 3
Trigger Term Quality	Includes strong natural trigger terms: 'Codex CLI', 'codex exec', 'codex resume', 'OpenAI Codex', 'code analysis', 'refactoring', 'automated editing'. These cover specific commands and natural language variations a user would say.	3 / 3
Distinctiveness Conflict Risk	Highly distinctive—references specific tool names (Codex CLI, codex exec, codex resume, OpenAI Codex) that are unlikely to conflict with other skills. The niche is clearly defined around a specific CLI tool.	3 / 3
	Total	11 / 12 Passed

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, actionable skill with clear workflow sequencing, concrete commands, and good safety boundaries. Its main weakness is including model benchmark data, context window sizes, and pricing guidance that add bulk without being essential to the core task execution. The structure is good but could benefit from splitting reference material into separate files.

Suggestions

Move the Model Options table and benchmark/pricing details into a separate MODELS.md reference file, keeping only the default model recommendation and a link in the main skill.

Remove or condense the reasoning effort level descriptions—Claude can infer what 'high' vs 'low' means from context without explicit definitions for each level.

Dimension	Reasoning	Score
Conciseness	The skill includes some unnecessary information like the model comparison table with context windows, SWE-bench scores, and pricing notes that Claude doesn't need to memorize. The reasoning effort descriptions are somewhat redundant. However, the core workflow instructions are reasonably tight.	2 / 3
Actionability	The skill provides concrete, executable commands with specific flags, a clear quick reference table mapping use cases to exact command patterns, and precise syntax for resume operations including stdin piping. Commands are copy-paste ready.	3 / 3
Workflow Clarity	The multi-step process is clearly sequenced (select model → choose sandbox → assemble command → run → summarize → offer resume). Safety boundaries serve as validation checkpoints, error handling includes explicit stop-and-report behavior, and the follow-up section creates a feedback loop with AskUserQuestion after every command.	3 / 3
Progressive Disclosure	The content is well-structured with clear sections and a useful quick reference table, but the model options table and detailed pricing/benchmark information could be split into a separate reference file. The skill is somewhat long for a single file with no external references.	2 / 3
	Total	10 / 12 Passed
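
The workflow the review credits (select model, choose sandbox, assemble command, run) can be sketched as a small argument builder. This is a minimal illustration, not the skill's actual implementation: `build_codex_command` is a hypothetical helper, the model identifier `gpt-5.2` is inferred from the skill description, and the `--model`, `--sandbox`, and `-c model_reasoning_effort=...` flags are assumptions that should be checked against `codex exec --help`.

```python
import shlex

# Assumed sandbox mode names; verify against the installed Codex CLI.
ALLOWED_SANDBOXES = {"read-only", "workspace-write", "danger-full-access"}


def build_codex_command(prompt, model, sandbox="read-only", reasoning_effort=None):
    """Assemble an argv list for a one-shot `codex exec` run (hypothetical helper)."""
    if sandbox not in ALLOWED_SANDBOXES:
        # Fail early instead of passing an unknown mode to the CLI.
        raise ValueError(f"unknown sandbox mode: {sandbox!r}")
    argv = ["codex", "exec", "--model", model, "--sandbox", sandbox]
    if reasoning_effort is not None:
        # Assumed config override syntax for reasoning effort.
        argv += ["-c", f"model_reasoning_effort={reasoning_effort}"]
    argv.append(prompt)
    return argv


if __name__ == "__main__":
    argv = build_codex_command("Summarize the TODOs in this repo", model="gpt-5.2")
    # Print a shell-quoted form for review before running anything.
    print(shlex.join(argv))
```

Building a list rather than a single string keeps the prompt as one argument regardless of quoting, and defaulting to the most restrictive sandbox mirrors the safety boundaries the review highlights.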

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 10 / 11 Passed

Validation for skill structure

Criteria	Description	Result
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning
	Total	10 / 11 Passed
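
A check like `frontmatter_unknown_keys` could work roughly as follows. This is a sketch under stated assumptions: the review does not say which key triggered the warning or which keys the spec allows, so `ALLOWED_KEYS` and the `version` key in the sample are hypothetical, and the parser only handles simple top-level `key: value` lines rather than full YAML.

```python
# Hypothetical allow-list; the real spec's permitted keys may differ.
ALLOWED_KEYS = {"name", "description", "license", "metadata"}


def frontmatter_keys(text):
    """Return top-level keys from a leading '---' delimited frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no frontmatter present
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        # Keep only unindented `key: value` lines; skip comments and list items.
        if line and not line[0].isspace() and not line.startswith(("#", "-")) and ":" in line:
            keys.append(line.split(":", 1)[0].strip())
    return keys


def unknown_keys(text, allowed=ALLOWED_KEYS):
    """List frontmatter keys that are not in the allow-list."""
    return [key for key in frontmatter_keys(text) if key not in allowed]


if __name__ == "__main__":
    sample = "---\nname: codex\ndescription: Run Codex CLI\nversion: 1.0\n---\n# Body\n"
    print(unknown_keys(sample))  # prints ['version']
```

Following the warning's own advice, a flagged key would either be removed or nested under `metadata`, which this sketch treats as an allowed top-level key.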

Repository: jdrhyne/agent-skills
Commit: 6768672

Reviewed: about 2 months ago
