CtrlK
BlogDocsLog inGet started
Tessl Logo

proof-failure-explainer

Analyze and explain why Isabelle or Coq proofs fail, identifying the root cause such as type mismatches, missing assumptions, incorrect goals, unification failures, or inapplicable tactics. Use when the user encounters proof failures, error messages in formal verification, stuck proof states, or asks why their Isabelle/Coq proof doesn't work.

Install with Tessl CLI

npx tessl i github:ArabelaTso/Skills-4-SE --skill proof-failure-explainer
What are skills?

94

Does it follow best practices?

Validation for skill structure

SKILL.md
Review
Evals

Evaluation results

86%

4%

Coq Proof Failure Analysis Report

Type error and unification failure diagnosis

Criteria
Without context
With context

Proof 1 category

100%

100%

Proof 2 category

50%

100%

Proof 3 category

100%

100%

Type mismatch explanation

100%

100%

Unification term comparison

100%

100%

'Fails because' phrasing

0%

0%

Expected vs actual mismatch

100%

100%

Corrected code examples

100%

100%

Fix explanation

100%

100%

reflexivity limitation

100%

100%

Existential instantiation fix

100%

100%

Debugging tool recommendation

0%

0%

Alternative approach

100%

100%

Without context: $0.2867 · 1m 16s · 14 turns · 20 in / 4,501 out tokens

With context: $0.3574 · 1m 23s · 11 turns · 14 in / 4,648 out tokens

83%

-3%

Isabelle/Coq Proof Debugging Consultation

Tactic failure and induction problem diagnosis

Criteria
Without context
With context

Tactic failure category

100%

100%

Induction failure category

100%

100%

Conjunction vs disjunction distinction

100%

100%

Correct tactic for disjunction

100%

100%

Wrong induction variable explanation

100%

100%

IH weakness explanation

87%

75%

'Fails because' phrasing

37%

75%

Tactic goal structure mismatch

100%

100%

Working corrected proofs

100%

100%

Fix explanation

100%

100%

Isabelle debugging tool

0%

0%

Induction diagnostic question

100%

37%

Without context: $0.2151 · 1m 45s · 8 turns · 13 in / 5,120 out tokens

With context: $0.7985 · 3m 44s · 21 turns · 5,898 in / 12,403 out tokens

96%

1%

Formal Verification Study Group: Proof Review

Missing assumption and incorrect goal diagnosis

Criteria
Without context
With context

Proof 1 category

100%

100%

Proof 2 category

55%

55%

Proof 3 category

100%

100%

Proof 4 category

100%

100%

Proof 5 category

100%

100%

Counterexample for Proof 3

100%

100%

Counterexample for Proof 4 or 5

100%

100%

Missing precondition for Proof 2

100%

100%

Correct tactic for Proof 1

100%

100%

'Fails because' phrasing

85%

100%

Goal correctness verification

100%

100%

Fixed goal for incorrect statements

100%

100%

Without context: $0.3725 · 2m 9s · 19 turns · 25 in / 5,746 out tokens

With context: $0.5329 · 2m 31s · 20 turns · 138 in / 8,489 out tokens

Evaluated
Agent
Claude Code

Table of Contents

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.