counterexample-generator

Generate concrete counterexamples when formal verification, assertions, or specifications fail. Use this skill when debugging failed proofs, understanding why verification fails, creating minimal reproducing examples, analyzing assertion violations, investigating invariant breaks, or diagnosing specification mismatches. Produces concrete input values, execution traces, and state information that demonstrate the failure.
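As a rough illustration of what "concrete input values" that "demonstrate the failure" can look like, the sketch below searches small inputs for a violation of a simple invariant. The `withdraw` function, the non-negative-balance property, and the search bound are all hypothetical, chosen only for illustration. They are not taken from the skill itself or from its eval scenarios.

```python
from itertools import product


def withdraw(balance: int, amount: int) -> int:
    # Hypothetical buggy function: it never rejects an overdraft.
    return balance - amount


def find_counterexample(max_value: int = 5):
    """Try small inputs in order of increasing size so the first hit is minimal."""
    candidates = sorted(product(range(max_value + 1), repeat=2), key=sum)
    for balance, amount in candidates:
        result = withdraw(balance, amount)
        # Assumed property under test: the balance never goes negative.
        if result < 0:
            return {"balance": balance, "amount": amount, "result": result}
    return None  # no violation found within the bound


if __name__ == "__main__":
    print(find_counterexample())
```

Enumerating candidates from smallest to largest is one simple way to get a minimal reproducing input. Real verifiers typically report counterexamples from a solver model instead of brute-force search.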

Quality: 77% (Does it follow best practices?)

Impact: 85%, 1.10x (average score across 3 eval scenarios)

Security: Passed (no known issues)

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/counterexample-generator/SKILL.md

Evaluation results

100%

16%

Debugging a Failing Bank Account Verification

Structured counterexample report

Criteria (without context / with context)

Failed Property named: 100%
Location identified: 100%
Concrete inputs given: 100%
Expected vs Actual: 100%
Step-by-step trace: 41% / 100%
Root Cause section: 100%
Root cause is correct: 100%
Suggested Fix section: 100%
Fix is correct: 100%
Minimal Example section: 37% / 100%
Regression test cases: 50% / 100%

96%

Investigating a Bounded Queue Specification Failure

5-step workflow and boundary analysis

Criteria (without context / with context)

Step-by-step process documented: 100%
Both bugs found: 100%
Violation types labeled: 100%
Boundary values used: 100%
Concrete inputs for enqueue bug: 100%
Concrete inputs for peek bug: 100%
Execution trace for at least one bug: 100%
Root causes explained: 100%
Suggested fixes included: 100%
Generalization present: 50%
Preconditions identified: 100%

60%

Counterexample Analysis for a String Processing Library

Minimal example and generalization

Criteria (without context / with context)

truncate bug found: 100%
count_words bug found: 100%
pad_center not falsely flagged: 37% / 25%
Minimal example for truncate: 80% / 100%
Minimal example for count_words: 80% / 100%
Minimality verified: 25% / 50%
Root causes explained: 100%
Regression tests provided: no score listed
Tests cover fix verification: no score listed
Generalization section: 20%
Execution trace present: 100%

Repository: ArabelaTso/Skills-4-SE
Commit: 0f00a4f

Evaluated: about 2 months ago
Agent: Claude Code
Model: Claude Sonnet 4.6

