
multi-version-behavior-comparator

Compare behavior across multiple versions of a program or repository. Use it to analyze how functionality changes between versions, identify regressions, compare outputs and exceptions, or validate upgrades. The skill compares execution behavior, test results, outputs, exceptions, and observable state across versions, and generates detailed reports of behavioral divergences, potential regressions, added or removed functionality, and areas requiring validation. It supports multiple programming languages and can work with test suites or execution traces.
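The eval criteria below reference a bundled compare.py script that takes version directories as positional arguments, a --output flag for the report path, and a --tests flag for a test suite. A minimal sketch of composing such an invocation; the helper function and the directory paths are hypothetical, while the flag spellings come from the eval criteria:

```python
import shlex

def build_compare_cmd(version_dirs, output="report.json", tests=None):
    """Compose a compare.py command line: version dirs are positional,
    the report path goes to --output, an optional test suite to --tests."""
    cmd = ["python", "compare.py", *version_dirs, "--output", output]
    if tests:
        cmd += ["--tests", tests]
    return shlex.join(cmd)

# Two-version comparison, JSON report written to report.json:
print(build_compare_cmd(["./app-v1.0", "./app-v1.1"]))
# Three-version comparison driven by a test suite:
print(build_compare_cmd(["./v1", "./v2", "./v3"], tests="./tests"))
```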

Overall score: 84 (1.58x)
Quality: 76% (does it follow best practices?)
Impact: 100% (1.58x)
Average score across 3 eval scenarios.

Security (by Snyk): Passed, no known issues.

Optimize this skill with Tessl:

    npx tessl skill review --optimize ./skills/multi-version-behavior-comparator/SKILL.md

Evaluation results

Library Audit: Identify What Changed Between Releases
Two-version JSON comparison report (100% / 57%)

| Criteria | Without context | With context |
| --- | --- | --- |
| Uses compare.py script | 58% | 100% |
| Version dirs as positional args | 0% | 100% |
| Uses --output flag | 0% | 100% |
| Report is JSON format | 100% | 100% |
| Behavioral Divergences section | 33% | 100% |
| Potential Regressions section | 30% | 100% |
| Added Features section | 30% | 100% |
| Removed Features section | 30% | 100% |
| Validation Areas section | 30% | 100% |
| Exceptions comparison | 100% | 100% |
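The section criteria above imply a JSON report with named top-level sections. A sketch of that shape; the section names are taken from the eval table, but the key casing and the inner fields are assumptions, not the skill's actual schema:

```python
import json

# Illustrative report skeleton; entries are made-up examples.
report = {
    "Behavioral Divergences": [
        {"component": "parse_config", "old": "returns dict", "new": "returns model object"}
    ],
    "Potential Regressions": [],
    "Added Features": ["strict validation mode"],
    "Removed Features": [],
    "Validation Areas": ["config round-tripping"],
    "Exceptions": {"old": ["KeyError in load()"], "new": []},
}

# The report serializes cleanly to JSON:
print(json.dumps(report, indent=2))
```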

Pinpoint the Regression: Three-Release Text Processor Audit
Multi-version comparison with test suite (100% / 44%)

| Criteria | Without context | With context |
| --- | --- | --- |
| Uses compare.py script | 70% | 100% |
| Three or more version dirs | 25% | 100% |
| Uses --tests flag | 0% | 100% |
| Test Results in report | 100% | 100% |
| Behavioral Divergences section | 87% | 100% |
| Potential Regressions section | 100% | 100% |
| Added Features section | 25% | 100% |
| Removed Features section | 12% | 100% |
| Validation Areas section | 0% | 100% |
| Observable States comparison | 90% | 100% |
| Report is JSON format | 100% | 100% |
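The value of running three or more versions through a test suite is that a regression can be attributed to the release where it first appeared. A minimal sketch of that attribution step; this is illustrative logic, not the skill's actual implementation, and the test names are invented:

```python
def first_failing_version(results):
    """Given ordered {version: {test_name: passed?}} results, return the
    first version at which each previously passing test starts to fail."""
    regressions = {}
    versions = list(results)
    for i in range(1, len(versions)):
        prev, cur = versions[i - 1], versions[i]
        for test, passed in results[cur].items():
            if not passed and results[prev].get(test) and test not in regressions:
                regressions[test] = cur
    return regressions

runs = {
    "v1.0": {"test_wrap": True, "test_strip": True},
    "v1.1": {"test_wrap": True, "test_strip": False},   # test_strip breaks here
    "v1.2": {"test_wrap": False, "test_strip": False},  # test_wrap breaks here
}
print(first_failing_version(runs))
# {'test_strip': 'v1.1', 'test_wrap': 'v1.2'}
```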

Pre-Deployment Change Verification for Config Service Upgrade
Pre-deployment regression analysis (100% / 10%)

| Criteria | Without context | With context |
| --- | --- | --- |
| Uses compare.py script | 80% | 100% |
| Version dirs as positional args | 0% | 100% |
| Uses --output flag | 100% | 100% |
| Report is JSON format | 100% | 100% |
| Behavioral Divergences section | 100% | 100% |
| Potential Regressions section | 100% | 100% |
| Added Features section | 100% | 100% |
| Removed Features section | 100% | 100% |
| Validation Areas section | 100% | 100% |
| Functionality comparison | 100% | 100% |
| Outputs comparison | 100% | 100% |
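The outputs and exceptions comparisons above boil down to running the same inputs through two versions of a component and recording where the observable outcome differs. A sketch of that core idea; the two get_timeout variants stand in for a real config-service function and are entirely hypothetical:

```python
def get_timeout_v1(cfg):
    # Old behavior: missing key falls back to a default.
    return int(cfg.get("timeout", 30))

def get_timeout_v2(cfg):
    # New behavior: missing key now raises KeyError.
    return int(cfg["timeout"])

def compare_behavior(old, new, inputs):
    """Run each input through both versions; record outcomes that differ,
    treating a raised exception as an observable outcome too."""
    divergences = []
    for inp in inputs:
        outcomes = []
        for fn in (old, new):
            try:
                outcomes.append(("ok", fn(inp)))
            except Exception as exc:
                outcomes.append(("exception", type(exc).__name__))
        if outcomes[0] != outcomes[1]:
            divergences.append({"input": inp, "old": outcomes[0], "new": outcomes[1]})
    return divergences

print(compare_behavior(get_timeout_v1, get_timeout_v2, [{"timeout": "5"}, {}]))
# [{'input': {}, 'old': ('ok', 30), 'new': ('exception', 'KeyError')}]
```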

Repository: ArabelaTso/Skills-4-SE
Evaluated with Agent: Claude Code, Model: Claude Sonnet 4.6

