Analyze and explain why Isabelle or Coq proofs fail, identifying the root cause such as type mismatches, missing assumptions, incorrect goals, unification failures, or inapplicable tactics. Use when the user encounters proof failures, error messages in formal verification, stuck proof states, or asks why their Isabelle/Coq proof doesn't work.
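For a sense of the failures this skill targets, here is a minimal illustrative Coq snippet (not taken from the skill itself) showing a unification failure of the kind the description lists, with the diagnosis as comments:

```coq
(* Illustrative only: the lemma's conclusion mentions (n + 0),
   but the goal has (0 + n), so [apply] cannot unify them. *)
Lemma add_zero : forall n : nat, n + 0 = n.
Proof.
  intros n. induction n; simpl.
  - reflexivity.
  - rewrite IHn. reflexivity.
Qed.

Lemma add_zero_left : forall n : nat, 0 + n = n.
Proof.
  intros n.
  Fail apply add_zero.
  (* Coq reports (paraphrased):
     Unable to unify "n + 0 = n" with "0 + n = n".
     Root cause: [plus] recurses on its first argument, so
     (0 + n) reduces to n by computation, but (n + 0) does not. *)
  simpl. reflexivity.
Qed.
```

The `Fail` vernacular succeeds exactly when the enclosed command fails, which makes the failing `apply` reproducible without breaking the script.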
Install with Tessl CLI
`npx tessl i github:ArabelaTso/Skills-4-SE --skill proof-failure-explainer94`
Does it follow best practices?
Validation for skill structure
Discovery
100%: Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly defines a specific niche (formal verification proof debugging in Isabelle/Coq), lists concrete failure types it can diagnose, and provides explicit trigger conditions. The description uses proper third-person voice and includes both technical terms experts would use and natural language phrases beginners might say.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: 'Analyze and explain why proofs fail, identifying the root cause' with explicit examples including 'type mismatches, missing assumptions, incorrect goals, unification failures, or inapplicable tactics.' | 3 / 3 |
| Completeness | Clearly answers both what ('Analyze and explain why Isabelle or Coq proofs fail, identifying root causes...') AND when ('Use when the user encounters proof failures, error messages in formal verification, stuck proof states, or asks why their Isabelle/Coq proof doesn't work'). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'Isabelle', 'Coq', 'proof fails', 'error messages', 'formal verification', 'stuck proof states', 'proof doesn't work'. These match how users naturally describe their problems. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche targeting the Isabelle and Coq formal verification systems specifically, with domain-specific terminology like 'unification failures', 'tactics', and 'proof states'. Unlikely to conflict with general debugging or other programming skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
85%: Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong skill with excellent actionability through concrete, executable examples covering multiple failure types in both Isabelle and Coq. The workflow is well-structured and the progressive disclosure is appropriate. The main weakness is some verbosity in sections like 'Best Practices' and 'Diagnostic Questions' that restate obvious guidance or duplicate the workflow content.
Suggestions
- Remove or significantly condense the 'Best Practices' section: items like 'Read error messages carefully' and 'Check types first' are obvious to Claude and already covered in the workflow.
- Consolidate 'Diagnostic Questions' into the Step 3 analysis section rather than repeating similar content separately.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably efficient but includes some unnecessary verbosity, such as the extensive 'Best Practices' section with generic advice Claude already knows (e.g., 'Read error messages carefully') and the 'Diagnostic Questions' section that largely restates the analysis workflow. | 2 / 3 |
| Actionability | Provides fully executable code examples for both Coq and Isabelle, with concrete failing proofs, exact error messages, clear explanations, and working solutions. The examples are copy-paste ready and demonstrate real failure scenarios. | 3 / 3 |
| Workflow Clarity | The 5-step analysis workflow is clearly sequenced (Gather Context → Identify Category → Analyze Root Cause → Explain → Suggest Solutions) with explicit checkpoints. The quick diagnosis table provides efficient triage for common issues. | 3 / 3 |
| Progressive Disclosure | Well-structured with a clear overview, detailed examples inline where needed, and appropriate references to external files (failure_patterns.md) and documentation. Navigation is clear, with a quick diagnosis table for fast lookup. | 3 / 3 |
| Total | | 11 / 12 Passed |
Validation
100%: Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
No warnings or errors.
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.