counterexample-explainer

Explain why counterexamples violate specifications by analyzing formal specifications (temporal logic, invariants, pre/postconditions, code contracts), informal requirements (user stories, acceptance criteria), and test specifications (assertions, property-based tests), and by providing step-by-step traces showing state changes, comparing expected vs actual behavior, identifying root causes, and assessing violation impact. Use when debugging test failures, understanding model checker output, explaining runtime assertion violations, analyzing static analysis warnings, or teaching specification concepts. Produces structured markdown explanations with traces, comparisons, state diagrams, and cause chains. Triggers when users ask why something failed or want to explain a violation, understand a counterexample, debug a specification, or analyze why a test fails.
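To make the description concrete, here is a minimal sketch of the kind of analysis it names: walking a trace of states, finding the first state that breaks an invariant, and reporting what changed. It is an illustration only and is not taken from the skill's SKILL.md; the names `Violation`, `first_violation`, and `explain`, and the `balance` example, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A state is a flat mapping of variable names to integer values (an assumption
# made for this sketch; real traces may carry richer state).
State = dict[str, int]


@dataclass
class Violation:
    step: int                 # index of the first violating state in the trace
    before: Optional[State]   # state just before the violation, if any
    after: State              # the violating state
    changed: dict[str, tuple[Optional[int], int]]  # variable -> (old, new)


def first_violation(
    trace: list[State], invariant: Callable[[State], bool]
) -> Optional[Violation]:
    """Return details of the first state that breaks the invariant, or None."""
    for step, state in enumerate(trace):
        if not invariant(state):
            before = trace[step - 1] if step > 0 else None
            # Record every variable whose value differs from the previous state.
            changed = {
                name: (before.get(name) if before else None, value)
                for name, value in state.items()
                if before is None or before.get(name) != value
            }
            return Violation(step, before, state, changed)
    return None


def explain(v: Violation, spec: str) -> str:
    """Render a short plain-text explanation of the violation."""
    lines = [f"Specification: {spec}", f"Violated at step {v.step}: {v.after}"]
    for name, (old, new) in v.changed.items():
        lines.append(f"- {name}: {old} -> {new}")
    return "\n".join(lines)


# Hypothetical example: an account balance that must never go negative.
trace = [{"balance": 100}, {"balance": 40}, {"balance": -20}]
violation = first_violation(trace, lambda s: s["balance"] >= 0)
if violation is not None:
    print(explain(violation, "balance >= 0"))
```

The sketch covers only a state invariant; temporal properties, informal requirements, and root-cause analysis, which the description also lists, would need more than a per-state check.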


Quality

92%

Does it follow best practices?

Impact

96%

1.45x

Average score across 3 eval scenarios

Security

Passed

No known issues

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that thoroughly covers what the skill does (analyzing specifications and explaining violations), when to use it (explicit trigger scenarios), and what it produces (structured markdown with traces and diagrams). The description uses proper third-person voice throughout and includes both technical terms for precision and natural language triggers for discoverability.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: 'analyzing formal specifications (temporal logic, invariants, pre/postconditions, code contracts)', 'providing step-by-step traces showing state changes', 'comparing expected vs actual behavior', 'identifying root causes', and specifies output format 'structured markdown explanations with traces, comparisons, state diagrams, and cause chains'.	3 / 3
Completeness	Clearly answers both what (analyzing specifications, providing traces, comparing behavior, identifying root causes) AND when with explicit 'Use when...' and 'Triggers when...' clauses that specify concrete scenarios like 'debugging test failures' and 'when users ask why something failed'.	3 / 3
Trigger Term Quality	Excellent coverage of natural terms users would say: 'why something failed', 'explain a violation', 'understand a counterexample', 'debug a specification', 'why a test fails', 'debugging test failures', 'model checker output', 'runtime assertion violations', 'static analysis warnings'.	3 / 3
Distinctiveness Conflict Risk	Clear niche focused specifically on counterexample explanation and specification violation analysis - distinct from general debugging or testing skills. The combination of formal methods terminology (temporal logic, invariants, model checker) with practical triggers creates a unique, well-defined scope.	3 / 3
	Total	12 / 12 Passed

Implementation

85%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured skill that provides clear, actionable guidance for explaining counterexamples. The workflow is logically sequenced with good examples demonstrating both test failures and temporal logic violations. The main weakness is some verbosity in the trace examples and common patterns sections that could be tightened without losing clarity.

Suggestions

Condense the detailed trace example in section 4: the full markdown template followed by a complete example is redundant; one comprehensive example would suffice.

Move the 'Common Violation Types' and 'Common root causes' sections to a reference file to reduce main skill length while preserving the information.

Dimension	Reasoning	Score
Conciseness	The skill is moderately efficient but includes some redundancy - the detailed trace example in section 4 is quite lengthy, and some explanations (like common root causes) could be more condensed. However, it avoids explaining basic concepts Claude would know.	2 / 3
Actionability	Provides fully executable code examples, concrete bash commands for gathering test failure info, complete markdown templates for explanations, and detailed worked examples showing exact input/output formats. The guidance is copy-paste ready.	3 / 3
Workflow Clarity	Clear 8-step workflow with explicit sequencing from understanding specification through presenting complete explanation. Each step has clear purpose and the workflow includes validation checkpoints (identify violation point, compare expected vs actual). The process is well-structured for a complex analytical task.	3 / 3
Progressive Disclosure	Good structure with overview, workflow steps, examples, and tips sections. Appropriately references external files (specification-types.md, explanation-patterns.md) for detailed catalogs while keeping the main skill focused. Navigation is clear with one-level-deep references.	3 / 3
	Total	11 / 12 Passed

Validation

90%


Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 10 / 11 Passed

Validation for skill structure

Criteria	Description	Result
skill_md_line_count	SKILL.md is long (598 lines); consider splitting into references/ and linking	Warning

	Total	10 / 11 Passed

Repository: ArabelaTso/Skills-4-SE
Commit: 0f00a4f

Reviewed: about 2 months ago
