Comprehensive citation management for academic research. Search Google Scholar and PubMed for papers, extract accurate metadata, validate citations, and generate properly formatted BibTeX entries. This skill should be used when you need to find papers, verify citation information, convert DOIs to BibTeX, or ensure reference accuracy in scientific writing.
77
73%
Does it follow best practices?
Impact
81%
1.84x
Average score across 3 eval scenarios
Advisory
Suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./scientific-skills/citation-management/SKILL.md
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly communicates its purpose, lists concrete capabilities, and provides explicit trigger guidance. It covers natural user terms well and occupies a distinct niche. The only minor note is the use of second person ('you need to') in the trigger clause, but the description uses third person for the capability statements, so the impact is minimal.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: search Google Scholar and PubMed, extract metadata, validate citations, generate BibTeX entries. These are clear, actionable capabilities. | 3 / 3 |
| Completeness | Clearly answers both what (search databases, extract metadata, validate citations, generate BibTeX) and when ('when you need to find papers, verify citation information, convert DOIs to BibTeX, or ensure reference accuracy'). Explicit trigger guidance is present. | 3 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'Google Scholar', 'PubMed', 'papers', 'citations', 'BibTeX', 'DOIs', 'reference', 'scientific writing', 'academic research'. Good coverage of terms across the domain. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive niche combining academic citation management with specific databases (Google Scholar, PubMed) and output formats (BibTeX). Unlikely to conflict with other skills due to the specialized domain and specific trigger terms. | 3 / 3 |
| Total | 12 / 12 Passed | |
Implementation
47%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill has a well-structured workflow with clear phases and validation checkpoints, and provides concrete CLI examples. However, it is extremely bloated—easily 3-4x longer than necessary—with massive redundancy (scripts documented multiple times), irrelevant sections (scientific schematics), and explanatory content Claude doesn't need (venue quality tiers, what MeSH terms are). The content that should be in referenced sub-files is instead inlined, defeating the progressive disclosure pattern.
Suggestions
Cut the content by 60-70%: remove the entire 'Visual Enhancement with Scientific Schematics' section, eliminate the redundant 'Tools and Scripts' section (already covered in the workflow), and trim the 'Best Practices' and 'Common Pitfalls' sections to bullet points only.
Move detailed search strategy content (Google Scholar operators, PubMed field tags, MeSH guidance, venue quality tiers) into the referenced files (references/google_scholar_search.md, references/pubmed_search.md) instead of inlining them.
Remove explanatory content Claude already knows: what DOIs are, what MeSH terms are, what E-utilities does, what BibTeX entry types mean. Keep only the specific commands and formats needed.
Consolidate the example workflows: keep one complete end-to-end example and remove the other three, which are subsets of the same workflow.
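One way to decide what to cut or move first is to measure how many lines each section of SKILL.md occupies. The following is a minimal sketch, assuming the file uses `#`-style Markdown headings; the function name `section_sizes` is hypothetical and is not part of Tessl or of the skill itself.

```python
import re

# Matches ATX-style headings such as "## Tools and Scripts".
HEADING = re.compile(r"^#{1,6}\s+(.*\S)\s*$")
FENCE = "`" * 3  # Markdown code-fence marker


def section_sizes(markdown_text):
    """Return (heading, line_count) pairs, largest section first.

    Lines before the first heading are counted under "(preamble)".
    Headings inside fenced code blocks are ignored, and sections that
    share the same heading text are merged into one entry.
    """
    sizes = {}
    current = "(preamble)"
    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            current = match.group(1)
        # The heading line itself counts toward the section it opens.
        sizes[current] = sizes.get(current, 0) + 1
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = "# Workflow\nstep one\nstep two\n# Tools and Scripts\nusage\n"
    for heading, count in section_sizes(sample):
        print(f"{count:5d}  {heading}")
```

The largest sections in the resulting list are the natural candidates for moving into the referenced files or trimming to bullet points.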
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Extremely verbose at ~600+ lines. Massive redundancy: every script's usage is documented 2-3 times (once in the workflow, once in the Tools section, once in examples). The 'Visual Enhancement with Scientific Schematics' section is entirely irrelevant to citation management. Tables of citation count thresholds, venue quality tiers, and author reputation indicators are unnecessary context Claude already knows. Best practices sections repeat what the workflow already covers. | 1 / 3 |
| Actionability | Provides concrete CLI commands and BibTeX examples that appear executable, but all scripts referenced (search_google_scholar.py, validate_citations.py, etc.) are assumed to exist without any indication they're real or bundled. The commands look copy-paste ready but may not actually work since the scripts are hypothetical. The BibTeX format examples are genuinely useful and concrete. | 2 / 3 |
| Workflow Clarity | The 5-phase workflow (Discovery → Metadata Extraction → BibTeX Formatting → Validation → Integration) is clearly sequenced with explicit validation steps. Phase 4 includes validation with auto-fix and error recovery. The end-to-end example workflows in the Examples section show complete pipelines with validation checkpoints before final output. | 3 / 3 |
| Progressive Disclosure | References to external files like 'references/google_scholar_search.md' and 'references/bibtex_formatting.md' are well-signaled, but the SKILL.md itself is monolithic with enormous amounts of inline content that should be in those reference files. The Tools section duplicates the workflow section. Script documentation, search strategy details, and best practices could all be in separate referenced files. | 2 / 3 |
| Total | 8 / 12 Passed | |
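The Actionability concern above (scripts referenced but not confirmed to be bundled) can be checked mechanically. The sketch below is illustrative only: it assumes scripts are referenced by their `.py` file names somewhere in the SKILL.md text, and the helper names `referenced_scripts`, `bundled_script_names`, and `missing_scripts` are hypothetical.

```python
import re
from pathlib import Path

# Captures bare file names such as "validate_citations.py", with or
# without a leading directory like "scripts/".
SCRIPT_NAME = re.compile(r"[\w-]+\.py\b")


def referenced_scripts(skill_text):
    """Return the sorted, de-duplicated .py file names mentioned in the text."""
    return sorted(set(SCRIPT_NAME.findall(skill_text)))


def bundled_script_names(skill_dir):
    """Return the names of all .py files found anywhere under skill_dir."""
    return {path.name for path in Path(skill_dir).rglob("*.py")}


def missing_scripts(skill_text, bundled_names):
    """Return referenced script names that are not in bundled_names."""
    bundled = set(bundled_names)
    return [name for name in referenced_scripts(skill_text) if name not in bundled]


if __name__ == "__main__":
    text = (
        "python scripts/search_google_scholar.py --query ...\n"
        "python scripts/validate_citations.py refs.bib\n"
    )
    # Pretend only one of the two scripts ships with the skill.
    print(missing_scripts(text, {"validate_citations.py"}))
```

In practice the second argument would come from `bundled_script_names` pointed at the skill's directory; an empty result means every referenced script has a matching file.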
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (1114 lines); consider splitting into references/ and linking | Warning |
| metadata_version | 'metadata.version' is missing | Warning |
| Total | 9 / 11 Passed | |
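The two warnings above can be approximated locally before resubmitting the skill. This is a rough sketch, not Tessl's actual validator: the 500-line threshold is an assumed placeholder (the real limit is not stated here), and the frontmatter is assumed to be a `---`-delimited block in which `version` is an indented key under a top-level `metadata:` key.

```python
def split_frontmatter(text):
    """Return (frontmatter_lines, body_lines); frontmatter is [] if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return [], lines
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return lines[1:i], lines[i + 1:]
    return [], lines  # unterminated frontmatter is treated as absent


def has_metadata_version(frontmatter_lines):
    """True if a non-empty, indented 'version:' sits under top-level 'metadata:'."""
    in_metadata = False
    for line in frontmatter_lines:
        if not line.strip():
            continue
        if line[0] not in " \t":
            # A new top-level key starts; track whether it is 'metadata:'.
            in_metadata = line.split("#")[0].strip() == "metadata:"
        elif in_metadata and line.strip().startswith("version:"):
            return bool(line.split(":", 1)[1].strip())
    return False


def check_skill(text, max_lines=500):
    """Return warning strings; max_lines is an assumed threshold."""
    warnings = []
    total = len(text.splitlines())
    if total > max_lines:
        warnings.append(f"skill_md_line_count: SKILL.md is long ({total} lines)")
    frontmatter, _ = split_frontmatter(text)
    if not has_metadata_version(frontmatter):
        warnings.append("metadata_version: 'metadata.version' is missing")
    return warnings


if __name__ == "__main__":
    sample = "---\nname: example-skill\nmetadata:\n  version: 1.0.0\n---\n# Title\n"
    print(check_skill(sample))
```

A hand-rolled parser is used only to keep the sketch free of dependencies; a real check would more likely load the frontmatter with a YAML library.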
b58ad7e