
paper-claim-audit

Zero-context verification that every number, comparison, and scope claim in the paper matches raw result files. Uses a fresh cross-model reviewer with NO prior context to prevent confirmation bias. Use when user says "审查论文数据", "check paper claims", "verify numbers", "论文数字核对", or before submission to ensure paper-to-evidence fidelity.

64

Quality

77%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Advisory

Suggest reviewing before use

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/paper-claim-audit/SKILL.md

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly articulates a specific, well-defined capability (verifying paper claims against raw data), explains the unique methodology (fresh cross-model reviewer to prevent confirmation bias), and provides explicit bilingual trigger terms. It scores highly across all dimensions with strong specificity, natural trigger coverage, completeness, and distinctiveness.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: verifying numbers, comparisons, and scope claims against raw result files. Also specifies the mechanism (fresh cross-model reviewer with no prior context) and the purpose (prevent confirmation bias, ensure paper-to-evidence fidelity). | 3 / 3 |
| Completeness | Clearly answers both 'what' (zero-context verification of numbers, comparisons, and scope claims against raw result files using a fresh cross-model reviewer) and 'when' (explicit 'Use when' clause with specific trigger phrases and the scenario 'before submission'). | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms in both English and Chinese: '审查论文数据', 'check paper claims', 'verify numbers', '论文数字核对', and 'before submission'. These cover natural phrases users would actually say when needing this skill. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche: paper data verification against raw evidence files with a cross-model bias-prevention approach. The bilingual trigger terms and specific focus on numerical claim verification make it very unlikely to conflict with general writing, editing, or code review skills. | 3 / 3 |
| Total | | 12 / 12 (Passed) |

Implementation

55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill provides excellent actionability with concrete tools, schemas, and a well-structured workflow, but is severely undermined by verbosity and poor progressive disclosure. It repeats key rules (fresh thread, zero context) multiple times, explains concepts Claude already understands (confirmation bias, why fresh context matters), and inlines extensive schema and protocol details that should be in separate reference files. The content could likely be cut by 50-60% without losing any actionable information.

Suggestions

Remove the 'Why This Exists' section and the comparison table — Claude doesn't need motivation or skill differentiation explanations. A single sentence like 'Fresh zero-context cross-model audit to catch executor confirmation bias' suffices.

Move the full JSON schema, verdict decision table, and path convention details into a referenced file (e.g., PAPER_CLAIM_AUDIT_SCHEMA.md) and keep only a brief summary inline.

Consolidate the repeated 'fresh thread' / 'zero context' / 'never codex-reply' rules into a single 'Key Rules' section instead of restating them in Core Principle, Key Rules, Thread Independence, and the reviewer prompt.

Move the detailed reviewer prompt template to a separate file (e.g., REVIEWER_PROMPT.md) since it's ~40 lines of content that only needs to be referenced, not read every time the skill is loaded.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is extremely verbose at ~300+ lines. It explains why confirmation bias exists (Claude knows this), includes a comparison table with other skills, repeats the 'fresh thread' rule 4+ times across different sections, and has extensive schema documentation that could be in a referenced file. The 'Why This Exists' section and failure mode explanations are largely unnecessary for Claude. | 1 / 3 |
| Actionability | The skill provides highly concrete, executable guidance: specific MCP tool calls with parameters, exact file paths to collect and exclude, a complete JSON schema for the output artifact, a detailed prompt template for the reviewer, and specific examples of failure modes with numeric thresholds (e.g., 84.7% → 85.3% is NOT OK). | 3 / 3 |
| Workflow Clarity | The 4-step workflow is clearly sequenced (Collect → Audit → Report → Summary) with explicit validation built into the audit protocol. The verdict decision table provides clear branching logic. Error recovery is addressed (non-blocking HTML render, ERROR verdict for failed reviewer calls). The feedback loop with /auto-paper-improvement-loop is well-defined. | 3 / 3 |
| Progressive Disclosure | Despite being extremely long, the skill is a monolithic wall of text with no bundle files to offload content to. The full JSON schema, the complete reviewer prompt, the verdict decision table, the path convention details, and the audit protocol could all be in referenced files. References to shared-references/ files exist but no bundle files are provided, and the inline content is not appropriately split. | 1 / 3 |
| Total | | 8 / 12 (Passed) |
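
The numeric failure mode cited under Actionability (84.7% reported as 85.3%) can be illustrated with a minimal sketch. It assumes a reported number is acceptable only when it equals the raw value rounded to the precision the paper prints; the skill's actual rule is not shown on this page, and the function name `claim_matches` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def claim_matches(raw_value: str, reported: str) -> bool:
    """Return True if `reported` equals `raw_value` rounded to the reported precision.

    Assumption: half-up rounding to the number of decimals shown in the paper.
    """
    reported_dec = Decimal(reported)
    # The exponent of the reported number fixes how many decimals the paper shows,
    # e.g. "85.3" has exponent -1, so the rounding step is 0.1.
    quantum = Decimal(1).scaleb(reported_dec.as_tuple().exponent)
    rounded_raw = Decimal(raw_value).quantize(quantum, rounding=ROUND_HALF_UP)
    return rounded_raw == reported_dec


# A raw 84.68 may be printed as 84.7; a raw 84.7 may not be printed as 85.3.
print(claim_matches("84.68", "84.7"))
print(claim_matches("84.7", "85.3"))
```

Values are passed as strings so that `Decimal` keeps the exact digits from the result file and the paper, which avoids binary floating-point surprises at rounding boundaries.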

Validation

81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 9 / 11 (Passed) |
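
A check in the spirit of the frontmatter_unknown_keys warning could look like the sketch below. The `KNOWN_KEYS` set, the function name, and the `argument-hint` key in the sample are assumptions for illustration only; they are not the validator's actual list or this skill's actual frontmatter.

```python
import re

# Assumption: an illustrative set of recognised top-level keys, not the validator's real list.
KNOWN_KEYS = {"name", "description", "license", "compatibility", "metadata", "allowed-tools"}


def unknown_frontmatter_keys(skill_md: str, known=KNOWN_KEYS) -> list[str]:
    """List top-level frontmatter keys in a SKILL.md string that are not in `known`."""
    match = re.match(r"^---\n(.*?)\n---", skill_md, flags=re.DOTALL)
    if match is None:
        return []  # no frontmatter block at the top of the file
    # Top-level keys start at column 0; indented lines belong to nested values.
    keys = re.findall(r"^([A-Za-z0-9_-]+):", match.group(1), flags=re.MULTILINE)
    return [key for key in keys if key not in known]


# Hypothetical SKILL.md content used only to exercise the function.
sample = (
    "---\n"
    "name: paper-claim-audit\n"
    "description: Audit paper claims\n"
    "argument-hint: paper directory\n"
    "metadata:\n"
    "  author: someone\n"
    "---\n"
    "# Body\n"
)
print(unknown_frontmatter_keys(sample))
```

Following the warning's own advice, a key flagged this way would either be removed or moved under `metadata`.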

Repository
wanshuiyin/Auto-claude-code-research-in-sleep
Reviewed
