
# behavior-preservation-checker

Compare runtime behavior between original and migrated repositories to detect behavioral differences, regressions, and semantic changes. Use when validating code migrations, refactorings, language ports, framework upgrades, or any transformation that should preserve behavior. Automatically compares test results, execution traces, API responses, and observable outputs between two repository versions. Provides actionable guidance for fixing deviations and ensuring behavioral equivalence.
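To picture the kind of comparison the skill describes, here is a minimal illustrative sketch (not part of the skill itself): it recursively diffs two JSON-like API responses, one captured from the original service and one from the migrated service, and reports each deviation by path. All names are hypothetical.

```python
def diff_responses(original, migrated, path=""):
    """Recursively compare two JSON-like values and collect behavioral deviations."""
    diffs = []
    if type(original) is not type(migrated):
        diffs.append(f"{path or '.'}: type changed "
                     f"{type(original).__name__} -> {type(migrated).__name__}")
    elif isinstance(original, dict):
        for key in sorted(original.keys() | migrated.keys()):
            if key not in migrated:
                diffs.append(f"{path}.{key}: missing in migrated")
            elif key not in original:
                diffs.append(f"{path}.{key}: added in migrated")
            else:
                diffs.extend(diff_responses(original[key], migrated[key], f"{path}.{key}"))
    elif isinstance(original, list):
        if len(original) != len(migrated):
            diffs.append(f"{path}: list length {len(original)} -> {len(migrated)}")
        for i, (a, b) in enumerate(zip(original, migrated)):
            diffs.extend(diff_responses(a, b, f"{path}[{i}]"))
    elif original != migrated:
        diffs.append(f"{path or '.'}: {original!r} -> {migrated!r}")
    return diffs

# Example: a migrated endpoint drops an item and adds a field.
old = {"status": "ok", "items": [1, 2, 3]}
new = {"status": "ok", "items": [1, 2], "cached": True}
for deviation in diff_responses(old, new):
    print(deviation)
```

A real checker would layer test-suite and trace comparison on top of this, but the principle is the same: structural, path-addressed diffs rather than raw text diffs.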

## Install with Tessl CLI

`npx tessl i github:ArabelaTso/Skills-4-SE --skill behavior-preservation-checker`

## Does it follow best practices?


### Discovery: 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a well-crafted skill description that excels across all dimensions. It clearly articulates specific capabilities (comparing test results, execution traces, API responses), provides explicit trigger guidance with a 'Use when' clause covering multiple scenarios, and carves out a distinct niche around migration validation that minimizes conflict with other skills.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific, concrete actions: "compares test results, execution traces, API responses, and observable outputs", "detect behavioral differences, regressions, and semantic changes", "Provides actionable guidance for fixing deviations". | 3 / 3 |
| Completeness | Clearly answers both what ("Compare runtime behavior... detect behavioral differences") and when ("Use when validating code migrations, refactorings, language ports, framework upgrades, or any transformation that should preserve behavior"). Explicit "Use when" clause present. | 3 / 3 |
| Trigger Term Quality | Includes natural keywords users would say: "migrations", "refactorings", "language ports", "framework upgrades", "behavioral differences", "regressions", "test results", "API responses". Good coverage of terms developers naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clear niche focused on comparing two repository versions for behavioral equivalence during migrations and transformations. Distinct from general testing, code review, or single-repo analysis skills. Specific triggers like "migrated repositories" and "behavioral equivalence" reduce conflict risk. | 3 / 3 |

**Total: 12 / 12 (Passed)**

### Implementation: 57%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides a comprehensive framework for behavior preservation checking with good structure and multiple comparison methods. However, it relies heavily on scripts that aren't provided, making the guidance illustrative rather than truly executable. The workflow could benefit from explicit validation checkpoints to ensure users verify results before acting on them.

#### Suggestions

- Provide actual implementations for the referenced scripts (behavior_checker.py, compare_test_results.py, etc.), or use only standard tools that exist.
- Add explicit validation checkpoints to the core workflow, such as "Verify report accuracy by manually checking 2-3 differences before bulk fixing."
- Reduce verbosity in the "What to check" sections by consolidating the separate per-method bullet lists into concise checklists.
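To make the first suggestion concrete, here is one possible sketch of what the missing compare_test_results.py could contain, assuming it receives test outcomes as simple maps of `{test_id: outcome}`. The input format and all names are hypothetical, not taken from the skill.

```python
def compare_test_results(original, migrated):
    """Compare outcome maps ({test_id: "passed"|"failed"|"skipped"}) from
    two repository versions and classify each behavioral change."""
    regressions, fixes, missing = [], [], []
    for test_id, outcome in original.items():
        new_outcome = migrated.get(test_id)
        if new_outcome is None:
            missing.append(test_id)          # test disappeared in the migration
        elif outcome == "passed" and new_outcome != "passed":
            regressions.append(test_id)      # previously passing, now broken
        elif outcome != "passed" and new_outcome == "passed":
            fixes.append(test_id)            # previously failing, now passing
    return {"regressions": regressions, "fixes": fixes, "missing": missing}

report = compare_test_results(
    {"test_login": "passed", "test_export": "passed", "test_legacy": "failed"},
    {"test_login": "passed", "test_export": "failed"},
)
print(report)
# {'regressions': ['test_export'], 'fixes': [], 'missing': ['test_legacy']}
```

Even a stub at this level of detail would make the skill's guidance executable rather than purely illustrative.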

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Reasonably efficient, but includes some unnecessary verbosity, such as explaining what to check in obvious categories and repeating similar patterns across methods. Some sections could be tightened without losing clarity. | 2 / 3 |
| Actionability | Provides concrete bash commands and code examples, but many reference scripts that don't exist (behavior_checker.py, compare_test_results.py, etc.) without providing their implementations. The examples are illustrative but not truly executable without the missing scripts. | 2 / 3 |
| Workflow Clarity | The core workflow has clear numbered steps but lacks explicit validation checkpoints and feedback loops. For a process involving behavioral comparison (which can have destructive implications if wrong conclusions are drawn), there should be clearer verification steps before acting on results. | 2 / 3 |
| Progressive Disclosure | Well structured, with clear sections, appropriate use of headers, and references to external files (references/comparison_techniques.md, etc.) that are one level deep and clearly signaled. The overview leads naturally into detailed methods. | 3 / 3 |

**Total: 9 / 12 (Passed)**

### Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation checks: 11 / 11 passed

Validation for skill structure

No warnings or errors.
