Systematically evaluate scholarly work using the ScholarEval framework, providing structured assessment across research quality dimensions including problem formulation, methodology, analysis, and writing with quantitative scoring and actionable feedback.
Quality: 35% — Does it follow best practices?
Impact: 92% — 1.67× average score across 3 eval scenarios
Advisory: Suggest reviewing before use
Optimize this skill with Tessl: `npx tessl skill review --optimize ./scientific-skills/scholar-evaluation/SKILL.md`

Quality
Discovery — 57%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description identifies a clear and distinctive niche (scholarly work evaluation with a named framework) and lists assessment dimensions, but lacks an explicit 'Use when...' clause and misses common natural language trigger terms users would employ when seeking paper reviews or academic feedback. The specificity of actions could be improved by listing more concrete outputs or steps rather than abstract quality dimensions.
Suggestions
Add an explicit 'Use when...' clause with natural trigger terms like 'review my paper', 'evaluate my research', 'academic paper feedback', 'peer review', 'grade my thesis', or 'critique my manuscript'.
Include common file types or formats users might mention, such as 'research papers', 'journal articles', 'dissertations', 'conference submissions', or 'academic essays'.
Make actions more concrete by specifying outputs, e.g., 'Generates rubric-based scores, identifies methodological weaknesses, and provides revision recommendations for academic papers.'
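Applied together, the suggestions above might produce a description like the following. This is only a sketch: the frontmatter field names follow the common SKILL.md convention, and the wording of the description is illustrative, not taken from the skill itself.

```yaml
---
name: scholar-evaluation
description: >
  Evaluates research papers, journal articles, dissertations, and conference
  submissions using the ScholarEval framework. Generates rubric-based scores
  (1-5) across problem formulation, methodology, analysis, and writing;
  identifies methodological weaknesses; and provides concrete revision
  recommendations. Use when the user asks to "review my paper", "evaluate my
  research", "peer review my manuscript", "grade my thesis", or otherwise
  requests academic paper feedback or critique.
---
```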
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (scholarly work evaluation) and some actions (structured assessment, quantitative scoring, actionable feedback), but the specific actions are somewhat abstract. Terms like 'problem formulation, methodology, analysis, and writing' describe assessment dimensions rather than concrete actions the skill performs. | 2 / 3 |
| Completeness | The 'what' is reasonably covered (evaluate scholarly work with structured assessment and scoring), but there is no explicit 'Use when...' clause or equivalent trigger guidance. The 'when' is only implied by the nature of the task, which per the rubric caps completeness at 2. | 2 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'scholarly work', 'research quality', 'methodology', and 'scoring', but misses common natural language variations users might say, such as 'review my paper', 'grade my essay', 'academic paper feedback', 'peer review', 'thesis evaluation', or 'research paper critique'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The description carves out a clear niche: scholarly/academic work evaluation using a specific named framework (ScholarEval) with quantitative scoring across defined dimensions. This is unlikely to conflict with general writing feedback or other review skills due to its academic focus and framework specificity. | 3 / 3 |
| Total | | 9 / 12 — Passed |
Implementation — 12%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill is excessively verbose yet lacks concrete, actionable content. It reads like a general description of how to evaluate scholarly work rather than specific, executable guidance that Claude could not already infer. The referenced supporting files do not exist in the bundle, leaving key parts of the skill non-functional, and the large schematics section is an irrelevant tangent promoting another skill.
Suggestions
Remove the entire 'Visual Enhancement with Scientific Schematics' section—it's a promotion for another skill and adds no evaluation-specific value.
Replace the generic dimension descriptions with concrete rubric examples showing what a score of 1 vs 5 looks like for each dimension, ideally with brief sample text excerpts.
Either include the referenced `references/evaluation_framework.md` and `scripts/calculate_scores.py` as bundle files, or inline the essential content from them—dead references make the skill non-functional.
Cut the 'When to Use This Skill' and 'Best Practices' sections entirely, and reduce the 'Notes' section to a single line—these contain information Claude already knows or can infer.
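If the referenced `scripts/calculate_scores.py` is inlined rather than bundled, the essential logic might be a weighted aggregation of per-dimension rubric scores. The sketch below is a hypothetical implementation: the dimension names come from the skill's description, but the weights and the 0–100 scaling are illustrative assumptions, not the skill's actual method.

```python
# Hypothetical score aggregation for a ScholarEval-style rubric.
# Weights are illustrative assumptions, not taken from the skill.
DIMENSIONS = {
    "problem_formulation": 0.25,
    "methodology": 0.30,
    "analysis": 0.25,
    "writing": 0.20,
}

def overall_score(scores: dict) -> float:
    """Weighted average of 1-5 dimension scores, scaled to 0-100."""
    for name in DIMENSIONS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be scored 1-5, got {scores[name]}")
    weighted = sum(DIMENSIONS[d] * scores[d] for d in DIMENSIONS)
    return round(weighted / 5 * 100, 1)

print(overall_score({
    "problem_formulation": 4,
    "methodology": 3,
    "analysis": 4,
    "writing": 5,
}))  # → 78.0
```

Inlining even a small sketch like this would make the scoring step executable instead of a dead reference.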
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose with significant padding. The 'When to Use This Skill' section lists 9 bullet points of obvious use cases. The 'Visual Enhancement with Scientific Schematics' section is a lengthy tangent about a different skill. The 'Best Practices' section states things Claude already knows (be objective, be constructive, provide evidence). The entire document could be reduced to ~30% of its length without losing actionable content. | 1 / 3 |
| Actionability | Despite its length, the skill provides almost no concrete, executable guidance. The evaluation dimensions are generic descriptions (e.g., 'Clarity and specificity of research questions') rather than specific assessment criteria or rubrics. The scoring scale is a basic 1-5 without calibration examples. The skill references `references/evaluation_framework.md` and `scripts/calculate_scores.py`, but no bundle files exist, leaving it hollow. | 1 / 3 |
| Workflow Clarity | The 6-step workflow is clearly sequenced and logically ordered, which is good. However, there are no validation checkpoints or feedback loops — no step says 'verify your assessment against X before proceeding.' The example workflow at the end is a narrative description rather than a concrete demonstration with actual input/output, reducing its utility. | 2 / 3 |
| Progressive Disclosure | The skill references `references/evaluation_framework.md` and `scripts/calculate_scores.py`, but no bundle files exist, making these dead references. The main file is a monolithic wall of text (~250+ lines) with content that should be split out (e.g., the entire schematics section, the detailed dimension descriptions, the integration section). The document tries to be both overview and detailed guide, failing at progressive disclosure. | 1 / 3 |
| Total | | 5 / 12 — Passed |
Validation — 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| Total | 10 / 11 Passed | |
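The single warning could likely be cleared by declaring a version in the skill's metadata. The snippet below is a sketch: the nesting of `version` under a `metadata` key is inferred from the validator's `metadata.version` criterion, and the version number is a placeholder.

```yaml
---
name: scholar-evaluation
metadata:
  version: 1.0.0
---
```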
cbcae7b
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.