Proves two program fragments semantically equivalent using symbolic reasoning — stronger than testing, applicable when differential testing is insufficient or impossible. Use when behavior preservation must be proven rather than sampled, when the input space is too large to enumerate, or when a transformation needs a correctness argument.
Install with Tessl CLI
npx tessl i github:santosomar/general-secure-coding-agent-skills --skill semantic-equivalence-verifier94
Quality
92%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Discovery
85%Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong description that clearly articulates a specialized capability (semantic equivalence proofs) with explicit trigger conditions. The main weakness is the technical jargon in trigger terms - users seeking this capability might use more natural language like 'prove my refactoring is correct' rather than 'semantic equivalence'. The description excels at distinctiveness and completeness.
Suggestions
Add more natural user-facing trigger terms like 'prove refactoring is correct', 'verify code transformation', 'formal equivalence', or 'prove optimization preserves behavior'
| Dimension | Reasoning | Score |
|---|---|---|
Specificity | Lists specific concrete actions: 'proves two program fragments semantically equivalent using symbolic reasoning' - clearly describes the technique (symbolic reasoning) and the goal (semantic equivalence proofs). Also mentions 'correctness argument' and 'behavior preservation'. | 3 / 3 |
Completeness | Clearly answers both what ('Proves two program fragments semantically equivalent using symbolic reasoning') and when ('Use when behavior preservation must be proven rather than sampled, when the input space is too large to enumerate, or when a transformation needs a correctness argument'). Explicit 'Use when' clause with multiple trigger conditions. | 3 / 3 |
Trigger Term Quality | Includes some relevant terms like 'semantically equivalent', 'symbolic reasoning', 'differential testing', 'behavior preservation', 'correctness argument', but these are fairly technical. Missing more natural user terms like 'prove code is the same', 'verify refactoring', 'equivalence proof', or 'formal verification'. | 2 / 3 |
Distinctiveness Conflict Risk | Highly distinctive niche - semantic equivalence proofs via symbolic reasoning is a very specific domain. Explicitly contrasts with 'differential testing' to clarify when NOT to use it. Unlikely to conflict with general testing or code analysis skills. | 3 / 3 |
Total | 11 / 12 Passed |
Implementation
100%Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
Excellent skill content that efficiently teaches formal equivalence verification. The worked example with SMT-LIB code is particularly strong, including the nuanced discussion of INT_MIN behavior. Decision tables for strategy selection and failure handling make this immediately actionable for complex verification tasks.
| Dimension | Reasoning | Score |
|---|---|---|
Conciseness | Extremely efficient use of tokens. Tables compress decision logic, no explanation of concepts Claude knows (SMT, bitvectors, induction). Every section earns its place with actionable content. | 3 / 3 |
Actionability | Provides complete, executable SMT-LIB code for the worked example. Clear decision tables for strategy selection. The output format template is copy-paste ready. | 3 / 3 |
Workflow Clarity | Four-step SMT workflow is explicit with clear sequence. Failure modes table provides feedback loops (SAT → check model, timeout → add lemmas). The 'When proof fails' section handles error recovery. | 3 / 3 |
Progressive Disclosure | Well-structured with clear sections progressing from when-to-use → strategies → step-by-step → example → failures → limits → output format. References to related skills (behavior-preservation-checker, python-to-dafny-translator) are one level deep and clearly signaled. | 3 / 3 |
Total | 12 / 12 Passed |
Validation
100%Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.
Table of Contents
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.