Rigorous mathematical proof verification and fixing workflow. Reads a LaTeX proof, identifies gaps via cross-model review (Codex GPT-5.4 xhigh), fixes each gap with full derivations, re-reviews, and generates an audit report. Use when user says "检查证明", "verify proof", "proof check", "审证明", "check this proof", or wants rigorous mathematical verification of a theory paper.
No eval scenarios have been run; no known issues.
Optimize this skill with Tessl
`npx tessl skill review --optimize ./skills/proof-checker/SKILL.md`

Quality
Discovery
100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly articulates a specific workflow for mathematical proof verification, includes explicit trigger terms in both English and Chinese, and occupies a distinct niche. The description effectively communicates both what the skill does (multi-step proof verification and fixing) and when to use it (with natural trigger phrases). Minor note: the reference to 'Codex GPT-5.4 xhigh' is an implementation detail that could be confusing but doesn't significantly detract from the description's quality.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: reads a LaTeX proof, identifies gaps via cross-model review, fixes each gap with full derivations, re-reviews, and generates an audit report. This is a detailed workflow description with clear steps. | 3 / 3 |
| Completeness | Clearly answers both 'what' (reads LaTeX proof, identifies gaps, fixes with derivations, re-reviews, generates audit report) and 'when' (explicit 'Use when...' clause with specific trigger phrases and a general use-case description). | 3 / 3 |
| Trigger Term Quality | Includes excellent natural trigger terms in both English and Chinese: '检查证明', 'verify proof', 'proof check', '审证明', 'check this proof', plus contextual triggers like 'rigorous mathematical verification of a theory paper'. Good multilingual coverage of terms users would naturally say. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche: mathematical proof verification in LaTeX with a specific cross-model review workflow. The combination of LaTeX proofs, mathematical rigor, audit reports, and bilingual Chinese/English triggers makes it very unlikely to conflict with other skills. | 3 / 3 |
| Total | | 12 / 12 (Passed) |
Implementation
55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill demonstrates exceptional workflow design and actionability — the multi-phase verification process with adversarial review, counterexample red-teaming, blind re-review, and formal acceptance gates is thorough and well-structured. However, it is severely undermined by its extreme length and monolithic structure: the entire skill is a single massive document that inlines taxonomy tables, JSON schemas, opt-in mode specifications, and failure-mode documentation that should be split across reference files. The token cost is very high, with significant portions covering mathematical knowledge Claude already possesses.
Suggestions
Extract the 20-category issue taxonomy, two-axis severity system, and side-condition checklists into a separate reference file (e.g., TAXONOMY.md) and link to it from the main skill.
Move the detailed JSON schemas for PROOF_AUDIT.json (including deep_fix_plans and restatement_drift optional fields) into a separate SCHEMA.md or the referenced shared-references/assurance-contract.md.
Move the deep-fix and restatement-check opt-in specifications (Phase 3.6 algorithm, failure modes, field semantics) into separate reference files since they are opt-in features that most invocations won't use.
Remove or drastically compress the side-condition checklists for common theorems (DCT, MCT, Fubini, etc.) — Claude already knows these conditions and a brief reminder ('verify all side-conditions for cited theorems') would suffice.
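The suggested split can be sketched as shell commands. All file and directory names below are illustrative placeholders, not names mandated by Tessl or used by the skill itself:

```shell
# Sketch of the proposed restructure (hypothetical file names).
mkdir -p skills/proof-checker/references

# Each inlined section moves into its own reference file:
cat > skills/proof-checker/references/TAXONOMY.md <<'EOF'
# Issue Taxonomy
(20-category issue taxonomy, two-axis severity system, side-condition checklists)
EOF

cat > skills/proof-checker/references/SCHEMA.md <<'EOF'
# Output Schemas
(PROOF_AUDIT.json schema, including deep_fix_plans and restatement_drift)
EOF

# SKILL.md then links out instead of inlining the content:
echo 'See [references/TAXONOMY.md](references/TAXONOMY.md) for the full issue taxonomy.' \
  >> skills/proof-checker/SKILL.md
```

This keeps the main SKILL.md short enough to load cheaply while opt-in detail stays one link away.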
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | This skill is extremely verbose: well over 500 lines with exhaustive taxonomy tables, detailed JSON schemas, multiple opt-in mode specifications, failure-mode documentation, and extensive side-condition checklists. Much of this (e.g., the 20-category issue taxonomy, the side-condition checklists for DCT/MCT/Fubini, the two-axis severity system) is knowledge Claude already possesses. The deep-fix and restatement-check opt-in sections alone consume hundreds of tokens on schema details and failure-mode edge cases that could live in separate reference files. | 1 / 3 |
| Actionability | The skill provides highly concrete, executable guidance: specific MCP tool invocations with exact prompt templates, bash commands for LaTeX compilation, structured JSON schemas for output artifacts, explicit fix-recording templates, and detailed per-phase instructions. The reviewer prompt is copy-paste ready, and output formats are fully specified with examples. | 3 / 3 |
| Workflow Clarity | The multi-phase workflow (0 → 0.5 → 1 → 1.5 → 2 → 3 → 3.5 → 3.6 → 3.9 → 4 → 5) is clearly sequenced with explicit validation checkpoints: acceptance gates, compile checks after fixes, blind re-review for FATAL/CRITICAL fixes, regression proof-audit, and a clear unrecoverable protocol (Phase 3.9) with feedback loops (repeat Phases 2-3 up to MAX_REVIEW_ROUNDS). The workflow handles error recovery thoroughly. | 3 / 3 |
| Progressive Disclosure | This is a monolithic wall of text with everything inlined into a single massive SKILL.md. The 20-category taxonomy, side-condition checklists, detailed JSON schemas, deep-fix specifications, restatement-check algorithm, and submission artifact schemas should all be in separate reference files. There is one reference to `shared-references/reviewer-routing.md` and `shared-references/assurance-contract.md`, but no bundle files are provided, and the vast majority of content that should be split out remains inline. | 1 / 3 |
| Total | | 8 / 12 (Passed) |
Validation
72%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 8 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (711 lines); consider splitting into references/ and linking | Warning |
| allowed_tools_field | 'allowed-tools' contains unusual tool name(s) | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 8 / 11 Passed |
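The three warnings all point at the SKILL.md frontmatter. A cleaned-up frontmatter might look like the sketch below; the field values and the `metadata` nesting are illustrative assumptions, not taken from the skill itself:

```yaml
---
name: proof-checker
description: Rigorous mathematical proof verification and fixing workflow for LaTeX proofs.
# Keep only standard tool names here to clear the allowed_tools_field warning:
allowed-tools: Read, Edit, Bash
# Hypothetical: unknown top-level keys moved under metadata,
# per the frontmatter_unknown_keys suggestion.
metadata:
  reviewer-model: codex-gpt-5.4-xhigh
---
```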