Workflow 3: Full paper writing pipeline. Orchestrates paper-plan → paper-figure → figure-spec/paper-illustration/mermaid-diagram → paper-write → paper-compile → auto-paper-improvement-loop to go from a narrative report to a polished PDF. At `— effort: max | beast` (or explicit `— assurance: submission`), Phase 6 gates the Final Report on `tools/verify_paper_audits.sh`; the PDF is labelled `submission-ready` only when the external verifier is green. Use when user says "写论文全流程", "write paper pipeline", "从报告到PDF", "paper writing", or wants the complete paper generation workflow.
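The gating rule in the description can be sketched as follows. This is a minimal illustration, not the skill's actual Phase 6 logic: it assumes the verifier signals "green" with exit code 0 and takes no arguments, and the helper names `label_for_exit_code` and `run_gate` are hypothetical. Only the script path and the `submission-ready` label come from the description.

```python
import subprocess
from pathlib import Path
from typing import Optional

# Path named in the skill description; everything else here is illustrative.
VERIFIER = Path("tools/verify_paper_audits.sh")


def label_for_exit_code(exit_code: int) -> Optional[str]:
    """Map a verifier exit code to a PDF label (assumption: 0 means green)."""
    return "submission-ready" if exit_code == 0 else None


def run_gate(verifier: Path = VERIFIER) -> Optional[str]:
    """Run the verifier and return the label, or None if the gate is not passed."""
    if not verifier.exists():
        # Without an external verifier there is nothing to gate on, so no label.
        return None
    # Assumption: the script is invoked without arguments.
    result = subprocess.run(["bash", str(verifier)], check=False)
    return label_for_exit_code(result.returncode)
```

The point of the design is that the label is derived only from the external verifier's result, never from the pipeline's own judgement of the PDF.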
82
81%
Does it follow best practices?
Impact: Pending. No eval scenarios have been run.
Critical: Do not install without reviewing.
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that thoroughly covers what the skill does (orchestrates a complete paper writing pipeline with specific named phases and quality gating) and when to use it (explicit trigger terms in both English and Chinese). The description is technically detailed without being vague, and its specificity about the pipeline stages and submission-readiness gating makes it highly distinctive from related but narrower skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions and tools in the pipeline: paper-plan, paper-figure, figure-spec, paper-illustration, mermaid-diagram, paper-write, paper-compile, auto-paper-improvement-loop. Also describes specific gating behavior with verify_paper_audits.sh and submission-ready labeling. | 3 / 3 |
| Completeness | Clearly answers both 'what' (orchestrates a full pipeline from narrative report to polished PDF with specific phases and gating) and 'when' (explicit 'Use when' clause with multiple trigger phrases). The when clause is explicit and well-defined. | 3 / 3 |
| Trigger Term Quality | Includes natural trigger terms in both Chinese and English: '写论文全流程', 'write paper pipeline', '从报告到PDF', 'paper writing', 'complete paper generation workflow'. Good coverage of how users would naturally phrase this request across languages. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive as it describes a specific multi-phase orchestration workflow (Workflow 3) with named sub-skills and a unique gating mechanism. The combination of pipeline orchestration, submission-ready gating, and bilingual triggers makes it clearly distinguishable from individual paper-writing or figure-generation skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
62%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill excels at actionability and workflow clarity with precise commands, validation gates, and well-sequenced phases. However, it is severely over-verbose — the content could likely be cut by 50-60% without losing any actionable information. Empirical anecdotes, extensive rationale sections, and deeply detailed assurance resolution logic inflate the token cost far beyond what Claude needs to execute the pipeline.
Suggestions
Cut empirical motivation paragraphs (e.g., 'in our April 2026 NeurIPS run...') — these explain *why* but don't help Claude *execute*; move to a separate design-rationale.md if needed for human readers.
Extract Phase 6.0's submission gate logic (assurance resolution, pre-flight checklist, verifier invocation, optional hardening) into a separate file like `submission-gate.md` and reference it with a one-line link.
Remove explanatory prose that restates what sub-skills do (e.g., 'This invokes GPT-5.4 xhigh to: Verify all proof steps...') — Claude will read the sub-skill's own SKILL.md when invoked.
Consolidate the Phase 2b illustration mode descriptions into a compact table rather than four verbose conditional blocks with repeated output/best-for descriptions.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~500+ lines. Contains extensive explanations of concepts Claude can infer (e.g., what each audit does, empirical motivations from April 2026, detailed explanations of assurance levels). The Phase 6 submission gate alone is massively over-detailed with resolution logic, escape hatches, and optional hardening sections that bloat the token budget significantly. | 1 / 3 |
| Actionability | Highly actionable with specific commands for each phase, executable bash scripts for detection logic, concrete invocation patterns, exact file paths, and copy-paste ready code blocks. Each phase has clear inputs, outputs, and specific tool invocations. | 3 / 3 |
| Workflow Clarity | Excellent multi-step workflow with clearly numbered phases (0-6), explicit validation checkpoints between phases, feedback loops (fix → re-validate → proceed), conditional skip logic with detectors, and a formal submission gate with verifier exit codes. Error recovery paths are well-defined throughout. | 3 / 3 |
| Progressive Disclosure | References shared protocols via links (output-versioning.md, output-manifest.md, etc.) which is good, but the main file itself is monolithic with enormous inline detail that should be split into sub-references. The Phase 6 submission gate content, empirical motivations, and optional hardening sections could easily be in separate files to keep the main skill lean. | 2 / 3 |
| Total | | 9 / 12 Passed |
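The percentages shown on this page are consistent with one simple reading, offered here as an inference rather than the registry's documented formula: each 1-to-3 dimension score is shifted to a 0-to-2 scale before averaging, and results are truncated to whole percent. Under that assumption the Implementation scores (1, 3, 3, 2) give 5/8, the Discovery scores give 8/8, and their mean matches the 81% Quality figure. The function name `rescaled_percent` is hypothetical.

```python
import math


def rescaled_percent(scores, low=1, high=3):
    """Percent of available points after shifting each score so `low` maps to 0."""
    earned = sum(s - low for s in scores)
    available = (high - low) * len(scores)
    return 100 * earned / available


discovery = rescaled_percent([3, 3, 3, 3])       # 8/8 of available points
implementation = rescaled_percent([1, 3, 3, 2])  # 5/8 of available points
quality = (discovery + implementation) / 2       # assumed: plain mean of the two
validation = 100 * 8 / 11                        # 8 of 11 checks passed

print(math.floor(discovery), math.floor(implementation),
      math.floor(quality), math.floor(validation))
# prints: 100 62 81 72
```

Note that a naive 9 / 12 would give 75%, not the 62% shown, which is why the shifted scale is assumed here.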
Validation
72%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 8 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (611 lines); consider splitting into references/ and linking | Warning |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 8 / 11 Passed | |
700fbe2