CtrlK
BlogDocsLog inGet started
Tessl Logo

counterexample-debugger

Debug proof failures using counterexamples from Nitpick (Isabelle) or QuickChick (Coq) to identify specification errors, missing preconditions, and proof strategy issues. Use when: (1) A proof attempt fails and you need to understand why, (2) Counterexamples are generated by Nitpick or QuickChick, (3) Specifications may be incorrect or incomplete, (4) Theorems need validation before proving, (5) Missing preconditions or lemmas need identification, or (6) Proof failures need explanation and correction suggestions. Supports both Isabelle/HOL and Coq equally.

Install with Tessl CLI

npx tessl i github:ArabelaTso/Skills-4-SE --skill counterexample-debugger
What are skills?

94

Does it follow best practices?

Validation for skill structure

SKILL.md
Review
Evals

Evaluation results

96%

7%

Debug Report: Isabelle/HOL Theorem Failure

Missing precondition identification

Criteria
Without context
With context

Manual verification

100%

100%

Empty list pattern

75%

100%

Undefined hd behavior

30%

80%

Root cause: missing precondition

100%

100%

Corrected theorem syntax

100%

100%

Impact assessment

100%

100%

Step-by-step explanation

100%

100%

Retest recommendation

100%

100%

No counterexample caveat

75%

75%

Completeness check

100%

100%

Root cause not symptom

100%

100%

Without context: $0.1825 · 1m 7s · 10 turns · 14 in / 3,876 out tokens

With context: $0.4430 · 1m 40s · 22 turns · 30 in / 5,115 out tokens

88%

9%

Debug Report: QuickChick Property Failure in Coq

Duplicate element pattern and comparison operator fix

Criteria
Without context
With context

Manual computation

100%

100%

Duplicate element pattern

100%

100%

Strict vs non-strict inequality

100%

100%

Correct root cause category

100%

100%

Corrected definition

100%

100%

Before/after code

100%

100%

Shrinking interpretation

0%

0%

Retest recommendation

100%

100%

Success does not mean proof

37%

50%

Completeness: permutation missing

0%

100%

Step-by-step violation

100%

100%

Without context: $0.2072 · 1m 9s · 9 turns · 14 in / 3,235 out tokens

With context: $0.5849 · 1m 51s · 24 turns · 3,847 in / 6,617 out tokens

97%

11%

Debug Report: Two Failing Isabelle/HOL Theorems

Quantifier order and incomplete specification

Criteria
Without context
With context

Quantifier root cause (A)

100%

100%

∃∀ vs ∀∃ logical difference (A)

80%

100%

Corrected quantifier theorem (A)

100%

100%

Witness for corrected theorem (A)

100%

100%

Theorem A is fundamentally wrong (A)

100%

100%

Manual trace for [1, 2, 0] (B)

100%

100%

sorted [1, 0, 2] is False (B)

100%

100%

Implementation bug root cause (B)

100%

100%

Incomplete spec note (B)

75%

100%

Strengthened spec (B)

62%

100%

Retest recommendation

62%

100%

No counterexample caveat

50%

62%

Without context: $0.3664 · 2m 23s · 11 turns · 14 in / 7,980 out tokens

With context: $0.5136 · 2m 12s · 18 turns · 3,122 in / 6,896 out tokens

Evaluated
Agent
Claude Code

Table of Contents

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.