Turn a vague research direction into a problem-anchored, elegant, frontier-aware, implementation-oriented method plan via iterative GPT-5.4 review. Use when the user says "refine my approach", "帮我细化方案", "decompose this problem", "打磨idea", "refine research plan", "细化研究方案", or wants a concrete research method that stays simple, focused, and top-venue ready instead of a vague or overbuilt idea.
76
72%
Does it follow best practices?
Impact: Pending. No eval scenarios have been run.
Advisory: Suggest reviewing before use.
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/research-refine/SKILL.md
Quality
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description with excellent bilingual trigger-term coverage, explicit 'what' and 'when' clauses, and a distinctive niche. The main weakness is that the concrete actions are obscured by a chain of adjectives ('problem-anchored, elegant, frontier-aware, implementation-oriented') instead of being listed as discrete capabilities. The mention of 'GPT-5.4 review' is an unusual and potentially confusing detail.
Suggestions
Replace the adjective chain with 2-3 concrete actions, e.g., 'decomposes research problems, identifies methodological gaps, drafts implementation-ready method plans' to improve specificity.
Clarify or remove the 'GPT-5.4 review' reference, which may confuse skill selection or seem like an over-claim about the underlying process.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names the domain (research planning) and describes the general action ('Turn a vague research direction into a problem-anchored, elegant, frontier-aware, implementation-oriented method plan via iterative GPT-5.4 review'), but the specific concrete actions are somewhat buried in adjectives rather than listing distinct steps like 'decompose problems, identify gaps, draft methodology sections'. | 2 / 3 |
| Completeness | Clearly answers both 'what' (turn vague research direction into a concrete, implementation-oriented method plan via iterative review) and 'when' (explicit 'Use when...' clause with multiple trigger phrases and a description of the situation: 'wants a concrete research method that stays simple, focused, and top-venue ready instead of a vague or overbuilt idea'). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms in both English and Chinese: 'refine my approach', '帮我细化方案', 'decompose this problem', '打磨idea', 'refine research plan', '细化研究方案'. These are phrases users would naturally say, and the bilingual coverage is a strength. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive niche: research method refinement with specific qualities (frontier-aware, top-venue ready, iterative GPT-5.4 review). The bilingual triggers and specific research planning focus make it unlikely to conflict with general coding, writing, or other skills. | 3 / 3 |
| Total | 11 / 12 Passed | |
Implementation
55%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is remarkably thorough and actionable with excellent workflow clarity, checkpoint recovery, and concrete MCP tool calls. However, it is severely over-long — embedding every template, prompt, and schema inline creates a massive document that wastes context window tokens. The content would score much higher if templates and reviewer prompts were extracted into referenced files, leaving the main skill as a concise orchestration guide.
Suggestions
Extract the proposal template (Step 1.6), reviewer prompt (Phase 2), refinement template (Phase 3), and all final report templates (Phase 5) into separate referenced files (e.g., `templates/proposal-template.md`, `prompts/reviewer-prompt.md`) to dramatically reduce inline length.
Consolidate the repeated principles — 'smallest adequate mechanism', 'one paper one contribution', 'anchor first' appear in the overview, multiple phase descriptions, and Key Rules. State them once in a principles section and reference them.
Move the checkpoint recovery logic and state schema into a separate `CHECKPOINT.md` reference file, keeping only a brief summary inline.
Remove explanatory rationale that Claude already understands (e.g., 'Experiments exist to validate the method, not to dominate the document', 'A long module list is not novelty') — these are judgment calls Claude can make without being told.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | This skill is extremely verbose at ~500+ lines. It over-explains every phase, includes extensive template structures that could be referenced externally, repeats the same principles multiple times (e.g., 'smallest adequate mechanism' appears in principles, Step 1.2, Step 1.3, Step 3.2, and Key Rules), and includes lengthy reviewer prompt templates inline. Much of this content (like the full proposal template, the full reviewer prompt, the checkpoint state schema) could be in referenced files. | 1 / 3 |
| Actionability | The skill provides highly concrete, executable guidance: exact MCP tool calls with specific parameters, complete markdown templates for every output file, specific JSON schemas for state persistence, exact reviewer prompts with scoring dimensions and weights, and clear stop conditions. An implementer could follow this step-by-step. | 3 / 3 |
| Workflow Clarity | The multi-phase workflow is exceptionally well-sequenced with explicit checkpoints after every phase, clear stop conditions (score >= 9, MAX_ROUNDS), checkpoint recovery logic with a detailed resume table, feedback loops (Phase 3-4 iterate until threshold), and validation at each step (anchor check, simplicity check, drift warnings). The initialization section handles edge cases like stale checkpoints. | 3 / 3 |
| Progressive Disclosure | The skill is a monolithic wall of text with everything inline. The full proposal template (~40 lines), the full reviewer prompt (~50 lines), the refinement template, the final report template, the review summary template, and the refinement report template are all embedded directly. These should be in referenced files. The only external references are to shared protocols at the end. The skill would benefit enormously from splitting templates and prompts into separate files. | 1 / 3 |
| Total | 8 / 12 Passed | |
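The repetition flagged under Conciseness can be checked mechanically before restructuring. The following is a minimal sketch, not part of Tessl's tooling: `count_phrase_mentions` and the `sample` text are hypothetical, and the phrases are the ones named in the suggestions above.

```python
import re


def count_phrase_mentions(text: str, phrases: list[str]) -> dict[str, int]:
    """Count case-insensitive, non-overlapping occurrences of each phrase."""
    lowered = text.lower()
    return {p: len(re.findall(re.escape(p.lower()), lowered)) for p in phrases}


# Hypothetical stand-in for the contents of SKILL.md.
sample = (
    "Principles: prefer the smallest adequate mechanism. Anchor first.\n"
    "Step 1.2: choose the Smallest Adequate Mechanism.\n"
    "Key Rules: smallest adequate mechanism; one paper one contribution.\n"
)

counts = count_phrase_mentions(
    sample,
    ["smallest adequate mechanism", "one paper one contribution", "anchor first"],
)

# Phrases stated more than once are candidates for a single principles section.
repeated = {phrase: n for phrase, n in counts.items() if n > 1}
print(repeated)
```

Running the same function over the real SKILL.md would show which principles are worth consolidating and which appear only once.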
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (743 lines); consider splitting into references/ and linking | Warning |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| Total | 9 / 11 Passed | |
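The `skill_md_line_count` warning can be reproduced locally while splitting the file. This is only a sketch: the report does not state the validator's actual limit, so the `max_lines=500` default is an assumption, and `check_line_count` is a hypothetical helper, not a Tessl API.

```python
def check_line_count(text: str, max_lines: int = 500) -> tuple[int, bool]:
    """Return (line_count, is_over_limit) for the given file contents.

    max_lines is an assumed threshold; the real validator's limit is not
    stated in the report.
    """
    line_count = len(text.splitlines())
    return line_count, line_count > max_lines


# Hypothetical stand-in for a 743-line SKILL.md.
long_skill = "line\n" * 743
print(check_line_count(long_skill))
```

In practice the text would come from reading `./skills/research-refine/SKILL.md`, and the check would be rerun after moving templates and prompts into referenced files.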
700fbe2