Generate publication-quality academic illustrations through a local Codex app-server bridge that uses Codex native image generation. This is a separate experimental alternative to `paper-illustration`, intended for Claude Code users who want a GPT-image-style renderer without modifying the original skill.
Overall score: 66
Quality: 60% (Does it follow best practices?)
Impact: Pending (no eval scenarios have been run)
Validation: Passed (no known issues)
Optimize this skill with Tessl
`npx tessl skill review --optimize ./skills/paper-illustration-image2/SKILL.md`

Quality
Discovery
57%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description establishes a clear niche and differentiates itself well from a related skill, but it lacks explicit trigger guidance ('Use when...') and concrete action verbs beyond 'generate'. It over-indexes on architectural details and positioning relative to another skill rather than describing what it does and when to use it.
Suggestions
Add an explicit 'Use when...' clause with natural trigger terms like 'academic figure', 'scientific diagram', 'paper illustration', 'research figure', or 'Codex image generation'.
List specific concrete actions beyond 'generate illustrations', such as 'create diagrams, render scientific figures, produce labeled schematics for research papers'.
Reduce architectural jargon ('app-server bridge') and focus on user-facing capabilities and scenarios that would naturally trigger skill selection.
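A revised description incorporating these suggestions might look like the sketch below. The wording and frontmatter keys are illustrative, not the skill's actual frontmatter:

```yaml
---
name: paper-illustration-image2
description: >
  Generate publication-quality academic illustrations: diagrams, scientific
  figures, labeled schematics, and research paper figures, rendered with
  Codex native image generation. Use when the user asks for an academic
  figure, scientific diagram, paper illustration, or research figure and
  wants a GPT-image-style renderer. Experimental alternative to the
  paper-illustration skill.
---
```

This keeps the differentiation from `paper-illustration` to a single closing clause and leads with concrete actions and natural trigger terms.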
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | It names the domain ('publication-quality academic illustrations') and mentions the mechanism ('local Codex app-server bridge', 'Codex native image generation'), but doesn't list multiple concrete actions beyond 'generate illustrations'. The description focuses more on architecture than specific capabilities. | 2 / 3 |
| Completeness | The 'what' is partially addressed (generate academic illustrations via Codex bridge), and there's an implicit 'when' (for Claude Code users wanting GPT-image-style rendering), but there is no explicit 'Use when...' clause with trigger guidance. The description focuses more on differentiating from 'paper-illustration' than on when to use it. | 2 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'academic illustrations', 'publication-quality', 'GPT-image-style', and 'Codex', but misses many natural user terms like 'diagram', 'figure', 'scientific figure', 'paper figure', 'research illustration'. The technical terms like 'app-server bridge' are not things users would naturally say. | 2 / 3 |
| Distinctiveness Conflict Risk | The description explicitly differentiates itself from the 'paper-illustration' skill and specifies a unique mechanism (Codex app-server bridge). It carves out a clear niche as an experimental alternative with a distinct technical approach, making it unlikely to conflict with other skills. | 3 / 3 |
| Total | | 9 / 12 Passed |
Implementation
62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill excels at actionability and workflow clarity, with concrete commands, explicit validation gates, and a well-defined iterative refinement loop. However, it is significantly too verbose: the extensive conference style guide, the ASCII diagram, and the repeated guidance on academic figure aesthetics cover material Claude already knows well. The content would benefit from aggressive trimming and from splitting the style guide into a referenced file.
Suggestions
Remove or drastically condense the CVPR/ICLR/NeurIPS style guide section (~80 lines) to a brief 5-line summary—Claude already knows academic figure conventions. If the detail is truly needed, extract it to a separate STYLE_GUIDE.md.
Remove the large ASCII workflow diagram (30+ lines) and replace with a 3-4 line text summary of the pipeline stages—the detailed Steps 0-7 section already covers this clearly.
Extract the 'Visual Appeal' subsection with mixed Chinese/English text; either keep it English-only for consistency or move it to a referenced file.
Consolidate the repair path section with Step 7 since they contain nearly identical commands, reducing duplication.
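The extraction suggested above could be as simple as replacing the inline style guide with a short summary plus a file reference. The section wording and the `STYLE_GUIDE.md` filename below are illustrative, assuming the skill bundles a companion file:

```markdown
## Style guide

For detailed CVPR/ICLR/NeurIPS figure conventions (fonts, arrow styles,
color palettes), see [STYLE_GUIDE.md](./STYLE_GUIDE.md). In short: use a
consistent sans-serif font, colorblind-safe palettes, high-contrast labels,
and minimal decoration.
```

This keeps the SKILL.md lean while preserving the detail for the cases that genuinely need it.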
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~250+ lines. The large ASCII workflow diagram, the extensive style guide with Chinese text mixed in, repeated lists of do's/don'ts, and the detailed conference style section all consume significant tokens. Much of this (what makes a good academic figure, arrow best practices, font choices) is knowledge Claude already possesses. The repair path section largely duplicates finalize commands shown earlier. | 1 / 3 |
| Actionability | The skill provides fully concrete, executable commands throughout: specific bash commands with flags for preflight/finalize/verify, exact MCP tool call parameters (generate_start with prompt, cwd, outputPath, system, timeoutSeconds), specific file naming conventions (figure_v1.png, figure_v2.png), and a complete LaTeX snippet. Every step has copy-paste ready instructions. | 3 / 3 |
| Workflow Clarity | The multi-step workflow is clearly sequenced (Steps 0-7) with explicit validation checkpoints: preflight must return ok=true before proceeding, visual review scores must reach ≥9, verify must pass before claiming success. There's a clear feedback loop (score < 9 → refine → re-render) and a repair path for when artifacts are missed. The pre-flight checklist is an excellent touch. | 3 / 3 |
| Progressive Disclosure | The content is a monolithic wall of text with no references to external files despite being complex enough to warrant splitting (e.g., the CVPR style guide could be a separate file, the repair path could be separate). The scope table and model summary provide some structure, but the inline style guide alone is ~60 lines that could be extracted. No bundle files are provided to offload content to. | 2 / 3 |
| Total | | 9 / 12 Passed |
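The `generate_start` parameters named in the Actionability row suggest a tool call shaped roughly like the sketch below. The exact request envelope and the argument values are assumptions; only the parameter names come from the review:

```json
{
  "tool": "generate_start",
  "arguments": {
    "prompt": "Render a clean academic architecture diagram with labeled components",
    "cwd": "/path/to/paper",
    "outputPath": "figures/figure_v1.png",
    "system": "Publication-quality figure style: sans-serif labels, high contrast",
    "timeoutSeconds": 300
  }
}
```

Keeping every such call this concrete is what earns the skill its 3/3 here.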
Validation
81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
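Both warnings are frontmatter fixes. A sketch of the repaired shape, assuming a hypothetical unknown key (`renderer`) and a standard tool name, might look like:

```yaml
---
name: paper-illustration-image2
description: Generate publication-quality academic illustrations via Codex native image generation.
allowed-tools:
  - Bash            # use only tool names the validator recognizes
metadata:
  renderer: codex   # former unknown top-level key, moved under metadata
---
```

The actual key and tool names in the skill may differ; the point is that unrecognized top-level keys belong under `metadata` and `allowed-tools` should list canonical tool names.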