Generate publication-quality AI illustrations for academic papers using Gemini image generation. Creates architecture diagrams, method illustrations with Claude-supervised iterative refinement loop. Use when user says "生成图表", "画架构图", "AI绘图", "paper illustration", "generate diagram", or needs visual figures for papers.
Overall score: 70%

Does it follow best practices?

Impact: Pending (no eval scenarios have been run).

Advisory: suggest reviewing before use.
Optimize this skill with Tessl: `npx tessl skill review --optimize ./skills/paper-illustration/SKILL.md`

Quality
Discovery
100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly communicates specific capabilities (AI illustration generation for academic papers), uses a well-defined toolchain (Gemini with Claude-supervised refinement), and provides excellent bilingual trigger terms. The explicit 'Use when' clause with both Chinese and English trigger phrases ensures reliable skill selection. Minor improvement could include mentioning output formats or additional diagram types.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: 'Generate publication-quality AI illustrations', 'Creates architecture diagrams', 'method illustrations', and mentions 'Claude-supervised iterative refinement loop'. These are concrete, specific capabilities. | 3 / 3 |
| Completeness | Clearly answers both 'what' (generate publication-quality AI illustrations, architecture diagrams, method illustrations with iterative refinement) and 'when' (explicit 'Use when' clause with specific trigger phrases and the general condition 'needs visual figures for papers'). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms in both Chinese and English: '生成图表', '画架构图', 'AI绘图', 'paper illustration', 'generate diagram', 'visual figures for papers'. Covers bilingual user queries and common variations. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche: academic paper illustrations using Gemini image generation with Claude-supervised refinement. The combination of academic context, specific tool (Gemini), and bilingual triggers makes it very unlikely to conflict with other skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
39%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill demonstrates excellent workflow design, with a clear multi-stage iterative process and thorough validation checkpoints, but it is severely undermined by extreme verbosity and repetition: the same style guidelines and visual requirements appear three to four times throughout the document. The bash scripts, while concrete, have interpolation issues that would prevent direct execution, and the document could be reduced to roughly one-third of its size by extracting repeated content into referenced files and eliminating redundancy.
Suggestions
- Extract the CVPR/NeurIPS style guide into a separate STYLE_GUIDE.md file and reference it once, instead of repeating it in the main content, the prompt template, the style verification step, and the review checklist.
- Extract the review checklist template into a separate REVIEW_TEMPLATE.md file, keeping only a brief summary of the scoring criteria in the main skill.
- Fix the bash scripts' variable interpolation issues: shell variables inside Python heredocs won't expand correctly. Consider using Python scripts directly or proper escaping mechanisms.
- Remove bilingual repetition: choose one language for each section rather than saying the same thing in both English and Chinese (e.g., the visual appeal DO/DON'T lists appear twice in different languages).
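The heredoc pitfall named in the third suggestion can be shown in a few lines. This is a minimal sketch; the variable name `STYLE_SPEC` and the sample value are illustrative, not taken from the skill's actual scripts:

```shell
# With a quoted delimiter ('EOF'), the shell does NOT expand variables,
# so the Python process receives the literal text "$STYLE_SPEC":
STYLE_SPEC="flat vector, CVPR palette"
python3 <<'EOF'
print("$STYLE_SPEC")   # prints the literal string $STYLE_SPEC
EOF

# One robust fix: pass the value through the environment and read it in Python.
# This survives quotes and newlines in the value:
STYLE_SPEC="flat vector, CVPR palette" python3 <<'EOF'
import os
print(os.environ["STYLE_SPEC"])   # prints: flat vector, CVPR palette
EOF
```

With an unquoted delimiter the shell does expand `$STYLE_SPEC`, but any quote or backslash in the value is pasted straight into the Python source, so the environment-variable route is the safer default.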
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at 500+ lines with massive repetition. The CVPR style guide is repeated nearly verbatim in the prompt template (Step 1), the style verification (Step 3), and the review checklist (Step 5). The visual appeal section appears in at least three places with the same content. The ASCII workflow diagram, while nice, is redundant given the step-by-step instructions that follow. Claude doesn't need explanations of what 'CVPR style' means or bilingual instructions repeated multiple times. | 1 / 3 |
| Actionability | Contains executable bash scripts with actual API calls and concrete code, which is good. However, the scripts have significant issues: they use shell variable interpolation inside heredocs/Python strings in ways that won't work correctly (e.g., `$STYLE_SPEC` inside bash heredocs passed to Python), the prompt template in Step 1 has placeholder brackets that Claude must fill in, and the iteration counter in Step 4 is hardcoded to 1 with a comment saying 'Claude increments this', which isn't actionable. The scripts are more pseudocode-like than truly copy-paste ready. | 2 / 3 |
| Workflow Clarity | The multi-step workflow is exceptionally clear, with explicit sequencing (Steps 0-8), a visual flowchart, explicit validation checkpoints (Step 5 review with detailed checklist), clear decision points (Step 6 with score threshold), and a feedback loop for refinement. The scoring rubric with specific score caps for different failure types provides excellent validation criteria. Error recovery is explicitly addressed. | 3 / 3 |
| Progressive Disclosure | This is a monolithic wall of text with no references to external files. The CVPR style guide, prompt templates, review checklists, and bash scripts are all inlined, making the document extremely long. The style guide alone could be a separate reference file, the review template could be a separate file, and the bash scripts could be referenced rather than fully inlined. Everything is crammed into one massive document. | 1 / 3 |
| Total | | 7 / 12 Passed |
Validation
72%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 8 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (693 lines); consider splitting into references/ and linking | Warning |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 8 / 11 Passed |