CtrlK
BlogDocsLog inGet started
Tessl Logo

counterexample-generator

Generate concrete counterexamples when formal verification, assertions, or specifications fail. Use this skill when debugging failed proofs, understanding why verification fails, creating minimal reproducing examples, analyzing assertion violations, investigating invariant breaks, or diagnosing specification mismatches. Produces concrete input values, execution traces, and state information that demonstrate the failure.

81

1.10x
Quality

77%

Does it follow best practices?

Impact

85%

1.10x

Average score across 3 eval scenarios

SecuritybySnyk

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/counterexample-generator/SKILL.md
SKILL.md
Quality
Evals
Security

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly defines a specialized niche in formal verification debugging. It provides comprehensive trigger terms covering the domain vocabulary, explicitly states both capabilities and usage conditions, and is highly distinctive from other potential skills.

DimensionReasoningScore

Specificity

Lists multiple specific concrete actions: 'Generate concrete counterexamples', 'debugging failed proofs', 'creating minimal reproducing examples', 'analyzing assertion violations', 'Produces concrete input values, execution traces, and state information'.

3 / 3

Completeness

Clearly answers both what ('Generate concrete counterexamples', 'Produces concrete input values, execution traces, and state information') AND when ('Use this skill when debugging failed proofs, understanding why verification fails...').

3 / 3

Trigger Term Quality

Excellent coverage of natural terms users would say: 'formal verification', 'assertions', 'specifications fail', 'failed proofs', 'verification fails', 'assertion violations', 'invariant breaks', 'specification mismatches', 'counterexamples'.

3 / 3

Distinctiveness Conflict Risk

Very clear niche focused on formal verification and counterexample generation. The specific domain terminology (proofs, invariants, specifications, assertions) makes it highly unlikely to conflict with general debugging or testing skills.

3 / 3

Total

12

/

12

Passed

Implementation

55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides highly actionable, well-structured guidance for counterexample generation with excellent workflow clarity and concrete executable examples. However, it is severely bloated—the content is 5-10x longer than necessary, repeating concepts Claude already understands and including exhaustive pattern catalogs that should be in separate reference files. The token inefficiency significantly undermines its utility.

Suggestions

Reduce content by 70-80%: Keep the 5-step workflow and 2-3 representative patterns, move the rest to a PATTERNS.md reference file

Remove explanatory text about basic concepts (what PDFs are, how assertions work, what race conditions are) - Claude knows these

Create separate files: PATTERNS.md for the 7 patterns, TECHNIQUES.md for generation techniques, TOOLS.md for tool references

Consolidate the counterexample report template and example into a single concise section rather than showing both separately

DimensionReasoningScore

Conciseness

Extremely verbose at ~600+ lines with extensive repetition. Explains basic concepts Claude knows (what counterexamples are, basic Python errors, what assertions do). Many patterns and examples could be consolidated significantly.

1 / 3

Actionability

Provides fully executable Python code examples throughout, concrete input values, step-by-step execution traces, and copy-paste ready counterexample reports. Every pattern includes specific, runnable examples.

3 / 3

Workflow Clarity

Clear 5-step workflow (Identify → Analyze → Generate → Execute → Present) with explicit validation checkpoints. Each step has concrete examples and the workflow includes feedback loops for fixing and re-verifying.

3 / 3

Progressive Disclosure

Monolithic wall of text with no references to external files. All content is inline including 7 detailed patterns, 4 techniques, tools lists, and common scenarios that should be split into separate reference documents.

1 / 3

Total

8

/

12

Passed

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation10 / 11 Passed

Validation for skill structure

CriteriaDescriptionResult

skill_md_line_count

SKILL.md is long (848 lines); consider splitting into references/ and linking

Warning

Total

10

/

11

Passed

Repository
ArabelaTso/Skills-4-SE
Reviewed

Table of Contents

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.