Full research pipeline: Workflow 1 (idea discovery) → Workflow 1.5 (experiment bridge) → Workflow 2 (auto review loop) → Workflow 3 (paper writing, optional). Goes from a broad research direction all the way to a polished PDF. Use when the user says "全流程" ("entire workflow"), "full pipeline", "从找idea到投稿" ("from finding an idea to submission"), "end-to-end research", or wants the complete autonomous research lifecycle.
71
88%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Advisory
Suggest reviewing before use
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly communicates a multi-stage research pipeline with specific workflow stages, concrete outputs, and explicit trigger terms in both English and Chinese. It effectively distinguishes itself from individual workflow skills by emphasizing the end-to-end nature. The inclusion of bilingual triggers is a notable strength for multilingual user bases.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions organized as a pipeline: idea discovery, experiment bridge, auto review loop, and paper writing. Also specifies the output ('polished PDF') and the progression ('from a broad research direction all the way to a polished PDF'). | 3 / 3 |
| Completeness | Clearly answers both 'what' (full research pipeline from idea discovery through paper writing) and 'when' (explicit 'Use when' clause with specific trigger phrases). Both dimensions are well-covered. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms in both English and Chinese: '全流程', 'full pipeline', '从找idea到投稿', 'end-to-end research', 'complete autonomous research lifecycle'. These cover natural variations a user would actually say. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive as it describes a full end-to-end research pipeline with numbered workflow stages. The specific workflow numbering (1, 1.5, 2, 3) and bilingual trigger terms create a clear niche that is unlikely to conflict with individual workflow skills. | 3 / 3 |
| Total | 12 / 12 Passed | |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured orchestration skill with excellent actionability and workflow clarity — every stage has concrete commands, clear gates, and explicit validation checkpoints. The main weakness is verbosity: the Constants section is heavy, and stage descriptions duplicate sub-skill documentation. Progressive disclosure would benefit from offloading detailed stage internals to the referenced sub-skills rather than re-explaining them inline.
Suggestions
Trim the Stage 2 description — the 7-step breakdown of what /experiment-bridge does internally is likely already documented in that sub-skill; replace with a 1-2 line summary and a reference.
Consider moving the Constants section to a separate PIPELINE_CONFIG.md reference file, keeping only a brief table of flag names and defaults in the main SKILL.md with a link to full descriptions.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is quite long (~200+ lines) with some redundancy; for example, queue routing logic is explained both in the Stage 2 bullet list and in the tip box above it. The Constants section is thorough but verbose, explaining each flag's downstream routing in detail that could be tightened. However, most content is genuinely informative and not explaining things Claude already knows. | 2 / 3 |
| Actionability | Every stage has concrete, copy-paste-ready invocation commands with exact argument syntax. Gate checkpoints include specific output formats. The pipeline is fully executable with clear command templates, parameter passing, and expected outputs at each stage. | 3 / 3 |
| Workflow Clarity | The 5-stage pipeline is clearly sequenced with explicit gates (Gate 1 human checkpoint, Gate 2 writing checkpoint), validation steps (code review before GPU deployment, sanity check before full experiments, auto-review loop with score thresholds), and error recovery (fail gracefully, auto-debug up to 3 attempts, /codex:rescue fallback). The feedback loop in Stage 3 is well-defined with clear termination conditions. | 3 / 3 |
| Progressive Disclosure | The skill references several sub-skills (/idea-discovery, /experiment-bridge, /auto-review-loop, /paper-writing) and shared protocols (output-versioning.md, output-manifest.md, output-language.md), which is good structure. However, no bundle files are provided to verify these references exist, and the SKILL.md itself is quite long; the Constants section and detailed stage descriptions could potentially be split into referenced files. The inline detail for each stage's internals (e.g., the 7-step breakdown of what /experiment-bridge does) duplicates what those sub-skills presumably already document. | 2 / 3 |
| Total | 10 / 12 Passed | |
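The Workflow Clarity row describes sequenced stages, gate checkpoints, and a bounded retry with a fallback. A minimal Python sketch of that control flow follows. It is not the skill's implementation: `run_with_recovery`, `run_pipeline`, the stage names, and the callables are all hypothetical, and it assumes "up to 3 attempts" means three tries in total before the fallback runs once.

```python
from typing import Callable, Dict, List, Tuple

# Taken from the review text ("auto-debug up to 3 attempts"); treating it as
# three total tries is an assumption made for this sketch.
MAX_DEBUG_ATTEMPTS = 3


def run_with_recovery(stage: Callable[[], bool],
                      fallback: Callable[[], bool],
                      max_attempts: int = MAX_DEBUG_ATTEMPTS) -> Tuple[bool, int]:
    """Try a stage up to max_attempts times, then call the fallback once.

    Returns (succeeded, attempts_used). Both callables are hypothetical
    stand-ins for a stage run and a rescue step.
    """
    for attempt in range(1, max_attempts + 1):
        if stage():
            return True, attempt
    return fallback(), max_attempts


def run_pipeline(stages: List[Tuple[str, Callable[[], bool]]],
                 gates: Dict[str, Callable[[], bool]],
                 fallback: Callable[[], bool]) -> List[str]:
    """Run stages in order; stop at the first failed stage or declined gate."""
    completed: List[str] = []
    for name, stage in stages:
        succeeded, _ = run_with_recovery(stage, fallback)
        if not succeeded:
            break  # fail gracefully: report only what finished
        completed.append(name)
        gate = gates.get(name)
        if gate is not None and not gate():
            break  # a checkpoint after this stage declined to continue
    return completed
```

Returning the list of completed stage names, rather than raising, keeps a partial run inspectable, which matches the "fail gracefully" wording in the review.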
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
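To illustrate what a check like `frontmatter_unknown_keys` might look for, here is a rough Python sketch. The validator's actual rules are not shown on this page, so `KNOWN_KEYS` and `unknown_frontmatter_keys` are assumptions for illustration only, and the parser handles only simple top-level `key: value` lines rather than full YAML.

```python
from typing import List

# Assumed set of recognized frontmatter keys; the real validator's list
# is not given in this review.
KNOWN_KEYS = {"name", "description", "license", "allowed-tools", "metadata"}


def unknown_frontmatter_keys(text: str) -> List[str]:
    """Return top-level frontmatter keys that are not in KNOWN_KEYS."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no frontmatter block at the top of the file
    keys: List[str] = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter ends the frontmatter
        # Skip blank lines, comments, list items, and indented (nested) lines.
        if not line or line[0] in " \t#-" or ":" not in line:
            continue
        keys.append(line.split(":", 1)[0].strip())
    return [key for key in keys if key not in KNOWN_KEYS]
```

For a file whose frontmatter contains a hypothetical `argument-hint` key alongside `name` and `description`, the function returns `["argument-hint"]`, which is the kind of key the warning suggests removing or moving under `metadata`.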
a425a71