
codex

Use when the user asks to run Codex CLI (codex exec, codex resume) or references OpenAI Codex for code analysis, refactoring, or automated editing. Uses GPT-5.2 by default for state-of-the-art software engineering.

68

Quality

83%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, actionable skill with clear workflow sequencing, explicit safety boundaries, and concrete command examples. Its main weakness is moderate verbosity in the model options section, which includes benchmark scores, context window details, and pricing notes that add token cost without proportional value. The error handling and follow-up sections demonstrate good validation practices with AskUserQuestion checkpoints.

Suggestions

Trim the Model Options section significantly: remove SWE-bench scores, context window sizes, and pricing notes, and keep only model names with brief 'best for' descriptions, since Claude doesn't need to memorize benchmarks.

Extract the detailed model comparison table into a separate MODELS.md reference file and link to it from the main skill.

Dimension | Reasoning | Score

Conciseness

The skill includes some unnecessary detail like the model comparison table with context windows, SWE-bench scores, and pricing notes that Claude doesn't need to memorize. The reasoning effort descriptions are somewhat redundant. However, the core workflow instructions and quick reference table are efficient.

2 / 3

Actionability

The skill provides concrete, copy-paste-ready commands with specific flags, a clear quick reference table mapping use cases to exact command patterns, and explicit syntax for resume operations including stdin piping. The guidance is specific and executable throughout.

3 / 3

Workflow Clarity

The multi-step process is clearly sequenced (select model → choose sandbox → assemble command → run → summarize → offer resume). Safety boundaries are explicit, error handling includes stop-and-report behavior, and there are validation checkpoints like asking user permission before high-impact flags and confirming next steps after every command via AskUserQuestion.

3 / 3

Progressive Disclosure

The content is well-organized with clear sections and a useful quick reference table, but the model options table and detailed pricing/benchmark information could be split into a separate reference file. For a standalone skill with no bundle files, the inline content is somewhat heavy, though the section headers provide reasonable navigation.

2 / 3

Total: 10 / 12

Passed

Description

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a solid description with excellent trigger terms and completeness, clearly specifying both when to use the skill and what it does. The main weakness is that the capability descriptions ('code analysis, refactoring, or automated editing') are somewhat generic and could be more concrete about specific operations the skill performs. Overall it would perform well in a multi-skill selection scenario due to its distinctive tool-specific triggers.

Suggestions

Add concrete actions beyond the generic 'code analysis, refactoring, or automated editing' (e.g., 'generates patches, applies multi-file edits, reviews pull requests') to improve specificity.

Dimension | Reasoning | Score

Specificity

Names the domain (Codex CLI) and some actions (code analysis, refactoring, automated editing), but these actions are generic and the description does not list the concrete operations the skill actually performs.

2 / 3

Completeness

Explicitly answers both 'what' (runs Codex CLI for code analysis, refactoring, automated editing using GPT-5.2) and 'when' (when user asks to run codex exec/resume or references OpenAI Codex) with a clear 'Use when...' clause.

3 / 3

Trigger Term Quality

Includes strong natural trigger terms: 'Codex CLI', 'codex exec', 'codex resume', 'OpenAI Codex', 'code analysis', 'refactoring', 'automated editing'. These cover both command-specific and conceptual terms a user would naturally say.

3 / 3

Distinctiveness Conflict Risk

Highly distinctive with specific tool references (Codex CLI, codex exec, codex resume, OpenAI Codex, GPT-5.2) that clearly distinguish it from generic coding or refactoring skills. Unlikely to conflict with other skills.

3 / 3

Total: 11 / 12

Passed

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

Criteria | Description | Result

frontmatter_unknown_keys

Unknown frontmatter key(s) found; consider removing or moving to metadata

Warning

Total: 10 / 11

Passed
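
The frontmatter_unknown_keys warning above can be reproduced locally with a small check. The sketch below is a rough illustration, not the validator's actual implementation: the allow-list `ALLOWED_KEYS`, the helper names, and the example `version` key are all assumptions, since this page does not say which keys the spec accepts or which key triggered the warning.

```python
# Hypothetical allow-list; the keys the real spec accepts may differ.
ALLOWED_KEYS = {"name", "description", "license", "allowed-tools", "metadata"}


def frontmatter_keys(skill_md: str) -> list[str]:
    """Return the top-level keys of the leading '---' frontmatter block."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no frontmatter block at the top of the file
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter: stop before the document body
        # Top-level keys start in column 0; indented lines, list items,
        # and comments belong to nested values and are skipped.
        if (line and not line[0].isspace()
                and not line.startswith(("#", "-")) and ":" in line):
            keys.append(line.split(":", 1)[0].strip())
    return keys


def unknown_keys(skill_md: str, allowed=ALLOWED_KEYS) -> list[str]:
    """List frontmatter keys that are not in the allow-list."""
    return [k for k in frontmatter_keys(skill_md) if k not in allowed]


# Illustrative SKILL.md content; 'version' is a made-up offending key.
EXAMPLE = """---
name: codex
description: Use when the user asks to run Codex CLI.
version: 1.0.0
metadata:
  author: example
---
# Body
"""

print(unknown_keys(EXAMPLE))
```

Under these assumptions, moving the flagged key beneath `metadata:` (as the warning suggests) makes it an indented, nested entry, so the check no longer reports it.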

Repository
jdrhyne/agent-skills
Reviewed
