CtrlK
BlogDocsLog inGet started
Tessl Logo

proof-checker

Rigorous mathematical proof verification and fixing workflow. Reads a LaTeX proof, identifies gaps via cross-model review (external reviewer backend, xhigh reasoning), fixes each gap with full derivations, re-reviews, and generates an audit report. Use when user says "检查证明", "verify proof", "proof check", "审证明", "check this proof", or wants rigorous mathematical verification of a theory paper.

63

Quality

77%

Does it follow best practices?

Impact

No eval scenarios have been run

SecuritybySnyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/proof-checker/SKILL.md
SKILL.md
Quality
Evals
Security

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly articulates a specific, multi-step workflow for mathematical proof verification. It provides strong trigger terms in both English and Chinese, explicitly states when to use the skill, and occupies a very distinct niche that minimizes conflict risk with other skills.

DimensionReasoningScore

Specificity

Lists multiple specific concrete actions: reads a LaTeX proof, identifies gaps via cross-model review, fixes each gap with full derivations, re-reviews, and generates an audit report. Very detailed workflow description.

3 / 3

Completeness

Clearly answers both 'what' (reads LaTeX proof, identifies gaps, fixes with derivations, re-reviews, generates audit report) and 'when' (explicit 'Use when' clause with specific trigger phrases and use case description).

3 / 3

Trigger Term Quality

Excellent coverage of natural trigger terms in both English and Chinese: '检查证明', 'verify proof', 'proof check', '审证明', 'check this proof', plus conceptual triggers like 'rigorous mathematical verification' and 'theory paper'.

3 / 3

Distinctiveness Conflict Risk

Highly distinctive niche: mathematical proof verification with LaTeX, cross-model review, and audit reports. The combination of domain (math proofs), format (LaTeX), and workflow (verify-fix-re-review-audit) makes it very unlikely to conflict with other skills.

3 / 3

Total

12

/

12

Passed

Implementation

55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill is extraordinarily thorough and actionable — it provides a complete, rigorous workflow for mathematical proof verification with precise tool calls, validation gates, and error recovery paths. However, it is massively over-length for a SKILL.md body, embedding detailed taxonomies, JSON schemas, opt-in feature specifications, and failure mode documentation that should be split into supporting reference files. The repeated explanations of opt-in flag discipline and failure modes across multiple sections significantly inflate token cost.

Suggestions

Extract the 20-category issue taxonomy, two-axis severity system, and side-condition checklists into a separate reference file (e.g., PROOF_TAXONOMY.md) and link to it from the main skill.

Move the detailed JSON schema definitions (PROOF_AUDIT.json, deep_fix_plans, restatement_drift) into a separate SCHEMAS.md or reference the shared-references/assurance-contract.md more concisely instead of reproducing full schemas inline.

Consolidate the deep-fix and restatement-check opt-in documentation — the rationale, failure modes, and field semantics are each explained 2-3 times across different sections; a single 'Opt-in Features' reference file would eliminate ~40% of the document.

Remove explanatory prose that restates what the structured content already conveys (e.g., 'Why opt-in' section, 'What this phase does NOT do' sections, repeated 'Downstream consumers MUST treat...' statements that appear verbatim in multiple places).

DimensionReasoningScore

Conciseness

This skill is extremely verbose — well over 1000 lines with exhaustive taxonomy tables, JSON schema definitions, failure mode prose, opt-in flag discipline sections, and repeated explanations of the same concepts (e.g., deep-fix opt-in rationale is explained at least 3 times). Much of this content (e.g., the 20-category issue taxonomy, side-condition checklists for common theorems, detailed JSON schemas) could be split into reference files. The skill also over-explains concepts Claude already knows (what DCT/MCT/Fubini are, what counterexamples are).

1 / 3

Actionability

The skill provides highly concrete, executable guidance: exact MCP tool names and calling conventions, specific prompt templates to send to reviewers, exact bash commands for compilation, complete JSON schemas for output artifacts, structured fix recording templates, and precise algorithms for each phase. The guidance is copy-paste ready throughout.

3 / 3

Workflow Clarity

The multi-phase workflow (Phase 0 → 0.5 → 1 → 1.5 → 2 → 3 → 3.5 → 3.6 → 3.9 → 4 → 5) is clearly sequenced with explicit validation checkpoints (acceptance gate, compile checks after fixes, blind re-review for FATAL/CRITICAL fixes, regression proof-audit). Feedback loops are well-defined: fix → re-review → repeat up to MAX_REVIEW_ROUNDS, with an explicit unrecoverable protocol (Phase 3.9) when convergence fails.

3 / 3

Progressive Disclosure

This is a monolithic wall of text with no bundle files to offload content to. The issue taxonomy (20 categories), side-condition checklists, JSON schemas, deep-fix specifications, restatement-check algorithm, and verdict decision tables are all inline in a single massive file. References to external files like `shared-references/reviewer-routing.md` and `shared-references/assurance-contract.md` exist but no bundle files are provided, and the core problem is that content that should be in separate reference files (taxonomy, schemas, checklists) is all embedded.

1 / 3

Total

8

/

12

Passed

Validation

72%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation8 / 11 Passed

Validation for skill structure

CriteriaDescriptionResult

skill_md_line_count

SKILL.md is long (742 lines); consider splitting into references/ and linking

Warning

allowed_tools_field

'allowed-tools' contains unusual tool name(s)

Warning

frontmatter_unknown_keys

Unknown frontmatter key(s) found; consider removing or moving to metadata

Warning

Total

8

/

11

Passed

Repository
wanshuiyin/Auto-claude-code-research-in-sleep
Reviewed

Table of Contents

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.