Interprets and explains counterexamples produced by model checkers or property-based testing tools to make them actionable. Use when TLC, NuSMV, CBMC, or a property-based test emits a counterexample the user doesn't understand, when a trace is too long to read, or when mapping a model-level trace back to source code.
Install with Tessl CLI
npx tessl i github:santosomar/general-secure-coding-agent-skills --skill counterexample-debugger100
Quality
100%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Discovery
100%Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly defines a specialized niche in formal verification and model checking. It provides specific capabilities, names concrete tools users would mention, and explicitly states multiple trigger conditions. The description is well-structured with a clear 'Use when...' clause covering various scenarios.
| Dimension | Reasoning | Score |
|---|---|---|
Specificity | Lists multiple concrete actions: 'Interprets and explains counterexamples', 'make them actionable', 'mapping a model-level trace back to source code'. These are specific, tangible capabilities. | 3 / 3 |
Completeness | Clearly answers both what ('Interprets and explains counterexamples...to make them actionable') and when ('Use when TLC, NuSMV, CBMC, or a property-based test emits a counterexample the user doesn't understand, when a trace is too long to read, or when mapping a model-level trace back to source code'). | 3 / 3 |
Trigger Term Quality | Excellent coverage of natural terms users would say: 'TLC', 'NuSMV', 'CBMC', 'property-based testing', 'counterexample', 'trace', 'model checker'. These are the exact terms users working with these tools would use. | 3 / 3 |
Distinctiveness Conflict Risk | Highly distinctive niche targeting formal verification and model checking tools. The specific tool names (TLC, NuSMV, CBMC) and domain-specific terms (counterexample, trace, model-level) make conflicts with other skills extremely unlikely. | 3 / 3 |
Total | 12 / 12 Passed |
Implementation
100%Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
Excellent skill content that demonstrates mastery of the format. The compression algorithm is the core value-add—it transforms an 84-line trace into 4 meaningful steps. The worked example makes the abstract concrete, and the output format ensures consistent, actionable results. Edge cases section anticipates real debugging scenarios.
| Dimension | Reasoning | Score |
|---|---|---|
Conciseness | Every section earns its place. No explanation of what model checkers are or how they work—assumes Claude knows. The compression algorithm and worked example are dense with actionable information, no padding. | 3 / 3 |
Actionability | Provides a concrete compression algorithm with numbered steps, a fully worked TLC example showing before/after, specific output format template, and clear 'do not' constraints. Copy-paste ready output structure. | 3 / 3 |
Workflow Clarity | Clear multi-step workflow: triage by source → compress (4 explicit steps) → narrate (4-part structure) → output. The worked example demonstrates the full pipeline. Edge cases provide validation checkpoints for unusual situations. | 3 / 3 |
Progressive Disclosure | Well-organized with clear sections (triage table, algorithm, worked example, edge cases, do-nots, output format). Single reference to external skill (`counterexample-to-test-generator`) is appropriately signaled. Content is appropriately scoped for a single file. | 3 / 3 |
Total | 12 / 12 Passed |
Validation
100%Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
Table of Contents
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.