counterexample-debugger

Debug proof failures using counterexamples from Nitpick (Isabelle) or QuickChick (Coq) to identify specification errors, missing preconditions, and proof strategy issues. Use when: (1) A proof attempt fails and you need to understand why, (2) Counterexamples are generated by Nitpick or QuickChick, (3) Specifications may be incorrect or incomplete, (4) Theorems need validation before proving, (5) Missing preconditions or lemmas need identification, or (6) Proof failures need explanation and correction suggestions. Supports both Isabelle/HOL and Coq equally.

1.10x

Quality

92%

Does it follow best practices?

Impact

93%

1.10x

Average score across 3 eval scenarios

Securityby

Passed

No known issues

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly defines a specialized niche in formal verification and theorem proving. It provides comprehensive coverage of capabilities, explicit trigger conditions, and domain-specific terminology that users would naturally employ. The description is well-structured, uses proper third-person voice, and would allow Claude to confidently select this skill when appropriate.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: 'Debug proof failures', 'identify specification errors, missing preconditions, and proof strategy issues', 'validation before proving', 'identification of missing preconditions or lemmas', 'explanation and correction suggestions'.	3 / 3
Completeness	Clearly answers both what (debug proof failures, identify errors, explain and suggest corrections) AND when with explicit numbered trigger conditions covering six specific scenarios. The 'Use when:' clause is comprehensive and explicit.	3 / 3
Trigger Term Quality	Excellent coverage of natural terms users would say: 'proof failures', 'counterexamples', 'Nitpick', 'QuickChick', 'Isabelle', 'Coq', 'specification errors', 'missing preconditions', 'theorems', 'lemmas'. These are precisely the terms a formal verification user would use.	3 / 3
Distinctiveness Conflict Risk	Highly distinctive niche targeting formal theorem proving with specific tools (Nitpick, QuickChick) and proof assistants (Isabelle/HOL, Coq). Very unlikely to conflict with other skills due to the specialized domain and explicit tool names.	3 / 3
	Total	12 / 12 Passed

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured skill for counterexample-guided proof debugging with strong actionability and workflow clarity. The code examples are concrete and executable, and the progressive disclosure to reference files is well-organized. Minor improvements could be made by reducing redundancy between the patterns and root causes sections.

Suggestions

Consolidate 'Counterexample Analysis Patterns' and 'Common Root Causes' sections to eliminate redundancy - they cover overlapping content (e.g., missing preconditions appears in both)

Remove or condense the 'Debugging Checklist' section as it largely repeats the workflow steps already defined in section 6

Dimension	Reasoning	Score
Conciseness	The content is reasonably efficient but includes some redundancy, particularly in the 'Common Root Causes' section which overlaps with 'Counterexample Analysis Patterns'. The checklist at the end repeats workflow steps, and some explanatory text could be tightened.	2 / 3
Actionability	Provides concrete, executable code examples in both Isabelle and Coq syntax showing before/after fixes. The patterns section gives specific symptoms and fixes, and the workflow steps are clear and actionable.	3 / 3
Workflow Clarity	Clear 7-step workflow with explicit validation checkpoints ('Retest with fix', 'Verify manually'). The debugging checklist provides a clear sequence with feedback loops for iterative debugging. Steps are well-sequenced for the multi-step debugging process.	3 / 3
Progressive Disclosure	Excellent structure with clear overview sections and well-signaled one-level-deep references to detailed guides (nitpick_guide.md, quickchick_guide.md, examples.md). Content is appropriately split between the main skill and reference files.	3 / 3
	Total	11 / 12 Passed

Validation

100%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: ArabelaTso/Skills-4-SE
Commit: 0f00a4f

Reviewed: about 2 months ago

Table of Contents

Discovery Implementation Validation

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.