Research toolkit for academic papers and GitHub projects: triage papers and tools, reproduce benchmark claims, search Google Scholar, Semantic Scholar, PubMed, or Sci-Hub, and extract structured data from scientific PDFs.
Add a new academic paper to the research repo as a structured reference summary.
Input: an arxiv ID (e.g. `2310.08560`), arxiv URL, or paper PDF path. Check `REVIEWED.md` first. Run `triage-paper` first, then promote to `ANALYSIS-*.md`.

When available, prefer these MCPs over WebFetch for paper discovery and metadata resolution: they return structured data and avoid HTML scraping.
```json
{
  "mcpServers": {
    "semantic-scholar": {
      "type": "stdio",
      "command": "uvx",
      "args": ["semantic-scholar-fastmcp"]
    },
    "google-scholar": {
      "type": "stdio",
      "command": "uvx",
      "args": ["google_scholar_mcp_server"]
    }
  }
}
```

Use `semantic-scholar` as the primary source (open, structured, covers most CS/ML papers). Fall back to `google-scholar` for papers not indexed there. Fall back to WebFetch (arxiv abstract page) only when neither MCP is configured or returns results.
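The fallback order above can be sketched as a small dispatcher. The three resolver functions below are stand-in stubs for the two MCPs and WebFetch, not real commands from this toolkit:

```shell
# Stubs standing in for the real lookups (illustrative only).
semantic_scholar() { return 1; }              # pretend: paper not indexed here
google_scholar()  { echo "title=LongChat"; }  # pretend: found at the fallback
arxiv_webfetch()  { echo "title=from-arxiv"; }

# Try each source in priority order; stop at the first non-empty result.
resolve_metadata() {
  for src in semantic_scholar google_scholar arxiv_webfetch; do
    out=$("$src" "$1") && [ -n "$out" ] && { printf '%s\n' "$out"; return 0; }
  done
  return 1
}

resolve_metadata "2310.08560"   # prints: title=LongChat
```

The dispatcher only moves on when a source fails or returns nothing, so a configured `semantic-scholar` always wins.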
Triage is a quality gate, not a data-entry task. The goal is a scannable, honest record.
- Triage output goes to `REVIEWED.md`; promotion to `ANALYSIS-*.md` requires a deliberate user decision, never automatic.
- Use the `semantic-scholar` MCP to resolve metadata (title, authors, date, abstract, DOI). If it is not configured, fall back to WebFetch on the arxiv abstract page. Prefer the `semantic-scholar` or `google-scholar` MCP over a raw HTTP fetch.
- Name the slug `<firstauthor-surname>-<2-3-word-topic>` (e.g. `jiang-llmlingua`, `press-longchat`).
- Check `REVIEWED.md` and `references/REFERENCE_INDEX.md` for the slug or arxiv ID before creating anything.

Assign one or more tags appropriate to the research domain. Universal tags:
| Tag | Meaning |
|---|---|
| `survey` | Overview paper covering the topic broadly |
| `benchmark` | Evaluation dataset or framework |
| `empirical` | Study with experimental evaluation |
| `theoretical` | Formal or theoretical contribution |
| `system` | System or implementation paper |
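The slug rule (`<firstauthor-surname>-<2-3-word-topic>`) can be sketched as a tiny helper; `make_slug` is a hypothetical name, not part of the toolkit:

```shell
# Build a slug from a surname and a short topic phrase:
# lowercase everything, join with hyphens.
make_slug() {
  printf '%s-%s' "$1" "$2" | tr '[:upper:]' '[:lower:]' | tr ' ' '-'
}

make_slug "Jiang" "llmlingua"   # -> jiang-llmlingua
```

Multi-word topics hyphenate the same way, e.g. `make_slug "Press" "long chat"` gives `press-long-chat`.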
Read assets/templates/REFERENCE-paper.yaml to get the required frontmatter fields and section structure. Create references/<slug>.md with a YAML frontmatter block (all required_fields from the template) followed by the required sections.
Populate every section. If a section does not apply, write `N/A`. Mark numbers taken from the paper `(as reported)`. Keep language precise. Do not pad. Quote all numbers with their source.
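As a rough sketch, the resulting reference file might open like this. The field names here are illustrative; the authoritative list is `required_fields` in `assets/templates/REFERENCE-paper.yaml`:

```markdown
---
slug: <slug>
title: <full paper title>
arxiv: <id>
authors: [<list>]
date: <YYYY-MM-DD>
tags: [<tags>]
disposition: pending
---

## Summary
<2–3 sentences, numbers quoted with their source or marked (as reported)>
```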
Add a row to the summary table at the top (reverse-chronological):
| <today's date> | <slug> | paper | pending | <one-line description> |

Add a detailed section below the table:
## <slug> — <Full paper title>
- **arxiv**: <ID>
- **Authors**: <list>
- **Date**: <YYYY-MM-DD>
- **Tags**: <tags>
- **Summary**: <2–3 sentences>
- **Disposition**: pending — awaiting user decision on promotion

Add a row under the most relevant category table. If no category fits, add it to the closest one and note it for the user.
Summarise what was created. Then ask:
This paper is now in `REVIEWED.md` as pending. Would you like to:
- Promote it to a standalone `ANALYSIS-arxiv-<id>-<slug>.md` deep dive?
- Keep it in `REVIEWED.md` for now?
- Skip it (mark as not promoted with reasoning)?
Read assets/templates/ANALYSIS-paper.yaml to get the required frontmatter fields and section structure. Create analysis/ANALYSIS-arxiv-<id>-<slug>.md with a YAML frontmatter block (all required_fields from the template) followed by the required stage sections. Update the disposition in REVIEWED.md from pending to analysis.
```shell
# Check for duplicate before starting
grep -i "<arxiv-id-or-slug>" REVIEWED.md references/REFERENCE_INDEX.md

# Fetch arxiv abstract (title, authors, date)
curl -s "https://arxiv.org/abs/<id>"

# Create reference file from template
cp templates/REFERENCE-paper.md references/<slug>.md

# Validate the completed files
./scripts/validate-reference-paper.sh references/<slug>.md
./scripts/validate-analysis-paper.sh ANALYSIS-arxiv-<id>-<slug>.md

# Row to add to REVIEWED.md:
# | YYYY-MM-DD | <slug> | paper | pending | <one-line description> |
```

WHY: Files without frontmatter fail schema validation and break indexing tools that rely on structured metadata.
**BAD**: Start the file with `# ANALYSIS: <slug>` followed by bold-text fields. → **GOOD**: Open with a `---` YAML frontmatter block containing all required fields before any prose.
WHY: Fabricated metrics corrupt the research record.
**BAD**: "Achieves 87% recall on LongMemEval." → **GOOD**: "Reports 87% recall on LongMemEval (as reported, Table 3)."
WHY: Re-triaging the same paper wastes effort and creates conflicting entries.
**BAD**: Create a new file without checking `REVIEWED.md`. → **GOOD**: Run `grep -i "<slug>" REVIEWED.md references/REFERENCE_INDEX.md` first.
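The GOOD check can be wrapped as a guard that stops triage on a hit. The sample `REVIEWED.md` row below is made up for illustration:

```shell
# Set up illustrative index files (contents invented for this sketch).
mkdir -p references
printf '| 2026-01-10 | jiang-llmlingua | paper | pending | prompt compression |\n' > REVIEWED.md
: > references/REFERENCE_INDEX.md

# grep -q exits 0 if the slug or arxiv ID appears in either index.
already_triaged() {
  grep -qi "$1" REVIEWED.md references/REFERENCE_INDEX.md
}

if already_triaged "jiang-llmlingua"; then
  echo "duplicate"   # already recorded: stop before creating anything
else
  echo "new"
fi
```

Running this prints `duplicate`, since the slug is already in the sample `REVIEWED.md`.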
WHY: Promotion to `ANALYSIS-*.md` is a quality gate, not automatic.
**BAD**: Create `ANALYSIS-*.md` as part of triage. → **GOOD**: Triage to `REVIEWED.md`, then ask the user.
- `google-scholar-search`
- `pubmed-search`
- `reproduce-benchmark`
- `sci-data-extractor`
- `sci-hub-search`
- `semantic-scholar-search`
- `triage-paper`
- `triage-tool`