
semantic-equivalence-verifier

Proves two program fragments semantically equivalent using symbolic reasoning — stronger than testing, applicable when differential testing is insufficient or impossible. Use when behavior preservation must be proven rather than sampled, when the input space is too large to enumerate, or when a transformation needs a correctness argument.

Install with Tessl CLI

npx tessl i github:santosomar/general-secure-coding-agent-skills --skill semantic-equivalence-verifier
Score: 94

Quality: 92% (Does it follow best practices?)

Impact: Pending (No eval scenarios have been run.)


Discovery: 85%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong description that clearly articulates a specialized capability (semantic equivalence proofs) with explicit trigger conditions. The main weakness is the technical jargon in the trigger terms: users seeking this capability might use more natural language, such as 'prove my refactoring is correct', rather than 'semantic equivalence'. The description excels at distinctiveness and completeness.

Suggestions

Add more natural user-facing trigger terms like 'prove refactoring is correct', 'verify code transformation', 'formal equivalence', or 'prove optimization preserves behavior'

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists specific, concrete actions: 'proves two program fragments semantically equivalent using symbolic reasoning' clearly describes both the technique (symbolic reasoning) and the goal (semantic equivalence proofs). Also mentions 'correctness argument' and 'behavior preservation'. | 3 / 3 |
| Completeness | Clearly answers both what ('Proves two program fragments semantically equivalent using symbolic reasoning') and when ('Use when behavior preservation must be proven rather than sampled, when the input space is too large to enumerate, or when a transformation needs a correctness argument'). Explicit 'Use when' clause with multiple trigger conditions. | 3 / 3 |
| Trigger Term Quality | Includes some relevant terms such as 'semantically equivalent', 'symbolic reasoning', 'differential testing', 'behavior preservation', and 'correctness argument', but these are fairly technical. Missing more natural user terms like 'prove code is the same', 'verify refactoring', 'equivalence proof', or 'formal verification'. | 2 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche: semantic equivalence proofs via symbolic reasoning is a very specific domain. Explicitly contrasts with 'differential testing' to clarify when NOT to use it. Unlikely to conflict with general testing or code analysis skills. | 3 / 3 |
| Total | | 11 / 12 |

Passed

Implementation: 100%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

Excellent skill content that efficiently teaches formal equivalence verification. The worked example with SMT-LIB code is particularly strong, including the nuanced discussion of INT_MIN behavior. Decision tables for strategy selection and failure handling make this immediately actionable for complex verification tasks.
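The review above praises the skill's SMT-LIB worked example. As a hedged illustration of the general shape of such an equivalence query — this sketch is not taken from the skill itself, and the names `abs_branch` and `abs_bithack` are invented for illustration — behavior preservation over all 32-bit inputs can be posed as an unsatisfiability check:

```
; Check: is branchless abs equivalent to branching abs for ALL 2^32 inputs?
(set-logic QF_BV)
(declare-const x (_ BitVec 32))

; Reference version: if x < 0 then -x else x
(define-fun abs_branch ((a (_ BitVec 32))) (_ BitVec 32)
  (ite (bvslt a #x00000000) (bvneg a) a))

; Optimized version: (x ^ (x >> 31)) - (x >> 31), with arithmetic shift
(define-fun abs_bithack ((a (_ BitVec 32))) (_ BitVec 32)
  (bvsub (bvxor a (bvashr a #x0000001F)) (bvashr a #x0000001F)))

; Assert a counterexample exists; unsat means none does
(assert (distinct (abs_branch x) (abs_bithack x)))
(check-sat) ; unsat => the fragments agree on every input, including INT_MIN,
            ; where both wrap to INT_MIN under two's-complement negation
```

An `unsat` answer is a proof over the entire input space — exactly the 'stronger than testing' claim in the skill's description — and the INT_MIN corner case is covered without being enumerated.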

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Extremely efficient use of tokens. Tables compress decision logic; no explanation of concepts Claude already knows (SMT, bitvectors, induction). Every section earns its place with actionable content. | 3 / 3 |
| Actionability | Provides complete, executable SMT-LIB code for the worked example. Clear decision tables for strategy selection. The output format template is copy-paste ready. | 3 / 3 |
| Workflow Clarity | The four-step SMT workflow is explicit, with a clear sequence. The failure modes table provides feedback loops (SAT → check model, timeout → add lemmas). The 'When proof fails' section handles error recovery. | 3 / 3 |
| Progressive Disclosure | Well structured, with clear sections progressing from when-to-use → strategies → step-by-step → example → failures → limits → output format. References to related skills (behavior-preservation-checker, python-to-dafny-translator) are one level deep and clearly signaled. | 3 / 3 |
| Total | | 12 / 12 |

Passed
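The 'SAT → check model' feedback loop noted in the workflow review can be sketched with a deliberately non-equivalent pair (again a hypothetical example, not drawn from the skill): signed division by two is not the same as an arithmetic right shift, and the solver's model pinpoints why:

```
(set-logic QF_BV)
(declare-const x (_ BitVec 32))

; Claim under test: x / 2 == x >> 1 for signed 32-bit x
(assert (distinct (bvsdiv x #x00000002) (bvashr x #x00000001)))
(check-sat)   ; sat: a counterexample exists
(get-model)   ; e.g. x = -1: truncating division gives 0,
              ; while arithmetic shift rounds toward -infinity, giving -1
```

Per the workflow, a SAT result is not a dead end: the model is inspected to decide whether it exposes a genuine behavioral difference or merely an input outside the fragments' preconditions; in the latter case the precondition is added as an assumption and the query is re-run.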

Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation checks: 11 / 11 passed

Validation for skill structure: no warnings or errors.


