behavior-preservation-checker

Compare runtime behavior between original and migrated repositories to detect behavioral differences, regressions, and semantic changes. Use when validating code migrations, refactorings, language ports, framework upgrades, or any transformation that should preserve behavior. Automatically compares test results, execution traces, API responses, and observable outputs between two repository versions. Provides actionable guidance for fixing deviations and ensuring behavioral equivalence.
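The test-result comparison described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function names, the severity rule, and the report layout are assumptions, with key names such as `behavioral_equivalence`, `differences`, and `passed_original_failed_migrated` borrowed from the evaluation criteria listed on this page. It assumes each input is a report in the shape produced by the pytest-json-report plugin (`pytest --json-report --json-report-file=<path>`), that is, a top-level `tests` list whose entries carry `nodeid` and `outcome`.

```python
import json
from pathlib import Path


def load_report(path):
    """Read a JSON test report from disk (hypothetical helper)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def outcomes_by_test(report):
    """Map each test's nodeid to its outcome ("passed", "failed", ...)."""
    return {t["nodeid"]: t["outcome"] for t in report.get("tests", [])}


def compare_outcomes(original_report, migrated_report):
    """Compare two reports and summarize behavioral differences.

    The severity rule is an assumption made for this sketch: a test that
    passed originally but no longer passes is "critical"; any other
    mismatch is "medium".
    """
    original = outcomes_by_test(original_report)
    migrated = outcomes_by_test(migrated_report)
    all_tests = sorted(set(original) | set(migrated))

    differences = []
    for nodeid in all_tests:
        before = original.get(nodeid, "missing")
        after = migrated.get(nodeid, "missing")
        if before == after:
            continue
        regression = before == "passed" and after != "passed"
        differences.append({
            "test": nodeid,
            "original": before,
            "migrated": after,
            "severity": "critical" if regression else "medium",
        })

    matching = len(all_tests) - len(differences)
    equivalence = 100.0 * matching / len(all_tests) if all_tests else 100.0
    return {
        "behavioral_equivalence": round(equivalence, 1),
        "summary": {
            "total_tests": len(all_tests),
            "passed_original_failed_migrated": sum(
                1 for d in differences
                if d["original"] == "passed" and d["migrated"] == "failed"
            ),
        },
        "differences": differences,
    }
```

In use, one would run the test suite in each repository with separate report files, load both with `load_report`, and pass them to `compare_outcomes`. Keeping the two result files separate makes it possible to re-run the comparison without re-running either suite.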

Install with Tessl CLI

npx tessl i github:ArabelaTso/Skills-4-SE --skill behavior-preservation-checker

83

Does it follow best practices?

Validation for skill structure


Evaluation results

95%

70%

Validate a Refactored Statistics Library

Test-based comparison workflow

Criteria                                  Without context   With context
pytest json-report flag                   0%                100%
json-report-file flag                     0%                100%
compare_test_results.py usage             0%                50%
Report behavioral_equivalence             50%               100%
Report summary fields                     0%                100%
Severity classification                   100%              100%
Differences list                          0%                100%
Recommendations list                      0%                100%
Separate result files                     0%                100%
Variance regression identified            100%              100%

Without context: $0.4355 · 5m 54s · 22 turns · 27 in / 6,918 out tokens

With context: $1.1129 · 3m 43s · 39 turns · 4,129 in / 11,614 out tokens

96%

73%

Validate a Ported String Processing Library

behavior_checker.py orchestrator and report format

Criteria                                  Without context   With context
behavior_checker.py invocation            41%               66%
behavior_report.json created              100%              100%
Report summary section                    20%               100%
passed_original_failed_migrated count     0%                100%
Differences array                         0%                100%
Severity on differences                   0%                100%
Critical severity regression              0%                100%
Recommendations array                     0%                100%
Guidance on differences                   0%                100%
behavioral_equivalence percentage         0%                100%
Workflow log                              100%              100%

Without context: $0.6131 · 4m 31s · 25 turns · 32 in / 9,137 out tokens

With context: $1.1942 · 4m 51s · 40 turns · 4,088 in / 15,387 out tokens

100%

35%

Validate a Migrated Sorting and Search Library

Property-based testing and multi-method comparison

Criteria                                  Without context   With context
hypothesis import                         0%                100%
@given decorator used                     0%                100%
strategies used                           0%                100%
Equivalence assertion in property tests   50%               100%
Multiple comparison methods               100%              100%
Tolerance threshold applied               100%              100%
behavior_report.json present              100%              100%
Report differences populated              100%              100%
Report recommendations                    100%              100%
Severity levels present                   100%              100%
comparison_notes.md present               100%              100%

Without context: $1.2700 · 9m 27s · 38 turns · 42 in / 20,108 out tokens

With context: $2.6824 · 12m 26s · 58 turns · 2,199 in / 31,020 out tokens

Evaluated agent: Claude Code
