Use when the user wants a local second-opinion code review via OpenAI Codex CLI — on the current branch, staged changes, a single file, or a piped diff. Triggers include "codex review", "review with codex", "run codex on this diff", "second opinion from codex", or pre-commit / pre-PR review requests that ask for codex specifically.
90
97%
Does it follow best practices?
Impact
97%
2.25xAverage score across 2 eval scenarios
Passed
No known issues
Run a local Codex CLI review of changes in this repo with a prompt that encodes the multi-tenant, auth, and migration invariants Codex won't infer on its own. Output is a severity-grouped report (CRITICAL → LOW) with file:line — problem — fix lines.
Do NOT use for: PR-level review on GitHub (that's the Codex GitHub app, not the CLI), or when the user wants Claude to do the review itself.
1. Preflight → Codex installed? Authenticated? Diff non-empty?
2. Scope → choose git diff (branch / staged / file / area)
3. Tailor → append "Focus areas" block when high-risk files touched
4. Run → cat diff | codex exec ... > /tmp/codex-review.out
5. Triage → surface CRITICAL + HIGH to user; verify each before actingSkip any step only if it's a no-op for the scope (e.g. no high-risk files → no Focus areas).
codex --version # else: npm install -g @openai/codex && codex login
tessl whoami 2>/dev/null # unrelated; for codex use: codex loginIf codex --version fails, install + login first. Login is interactive — must run in a real terminal, not via Claude !-shell.
Always write to /tmp/codex-review.diff first (avoids shell-quoting issues with large diffs). Pick scope:
| Scope | git command |
|---|---|
| Current branch vs main | git diff main...HEAD |
| Staged only (pre-commit) | git diff --staged |
| Uncommitted (staged + unstaged) | git diff HEAD |
| Backend or frontend only | git diff main...HEAD -- backend (or -- frontend) |
| Exclude scratch files | append -- ':(exclude,glob)**/_*' ':(exclude)_tmp_*' |
git diff main...HEAD > /tmp/codex-review.diff
test -s /tmp/codex-review.diff || { echo "empty diff — nothing to review"; exit 0; }The test -s guard is mandatory: an empty diff makes Codex hallucinate findings about whatever it reads from the repo.
For 30+ file diffs, split before running — output quality degrades. Two parallel passes on backend/frontend beat one big pass.
Skim git diff --stat. If any of these patterns are touched, append a Focus areas block to the prompt at the CLI:
backend/src/routes/auth.ts, backend/src/middleware/auth.tsbackend/src/routes/admin/**backend/src/db/migrations/** + backend/src/db/templates/company_schema_template.sqlbackend/src/server.ts (middleware order)backend/src/services/scheduler/schedulerRegistrations.ts (cron strings)BedrockClient consumer (prompt-injection surface)Single canonical invocation:
cat /tmp/codex-review.diff \
| codex exec --sandbox read-only "$(cat .claude/skills/codex-review/resources/PROMPT.md)
$FOCUS_AREAS" \
> /tmp/codex-review.outWhere $FOCUS_AREAS is an optional inline block:
## Focus areas in THIS diff
- backend/src/routes/auth.ts — scrutinize token handling and error leakage
- backend/src/db/templates/company_schema_template.sql — confirm matches any new migration tablesNotes:
codex exec has no --file flag — diff goes via stdin.--sandbox read-only lets Codex grep the repo for cross-references without write risk.gpt-5.5 at xhigh reasoning per ~/.codex/config.toml).codex
> review backend/src/routes/auth.ts for auth bypass, timing attacks, and error leakage.
Use the invariants in .claude/skills/codex-review/resources/PROMPT.md.Tell the user to type:
! cat /tmp/codex-review.diff | codex exec "$(cat .claude/skills/codex-review/resources/PROMPT.md)"After the run completes:
/tmp/codex-review.out. Surface every CRITICAL and HIGH to the user — one line each, in plain prose.| Mistake | Fix |
|---|---|
Piping a huge diff inline with codex exec "..." | Write to /tmp/*.diff and pipe via stdin (no --file flag exists) |
Forgetting to pass PROMPT.md | Codex flags style and misses multi-tenant/auth leaks — PROMPT.md is what makes it useful |
| Reviewing 30+ files at once | Split backend/frontend or by service area; quality drops past ~25 files |
| Treating Codex output as authoritative | Second opinion only. Verify CRITICAL/HIGH against the code before acting |
| Running on an empty diff | Codex will invent findings about repo files it reads. Guard with test -s |
| Auto-applying every flagged "fix" | Tests may pin the current behavior on purpose — check the test suite first |