multi-version-behavior-comparator

Compare behavior across multiple versions of programs or repositories. Use when you need to analyze how functionality changes between versions, identify regressions, compare outputs and exceptions, or validate upgrades. The skill compares execution behavior, test results, outputs, exceptions, and observable states across versions, generating detailed reports showing behavioral divergences, potential regressions, added/removed functionality, and areas requiring validation. Supports multiple programming languages and can work with test suites or execution traces.
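As an illustration of the kind of comparison described above, the sketch below runs the same inputs against two versions of a function and diffs the observable behavior (returned value or raised exception). All names here are hypothetical for illustration and are not part of the skill itself:

```python
def observe(fn, *args):
    """Run fn and capture its observable behavior as (status, detail)."""
    try:
        return ("ok", fn(*args))
    except Exception as exc:
        return ("error", type(exc).__name__)

def compare_versions(fn_v1, fn_v2, inputs):
    """Return the inputs where the two versions diverge in output or exception."""
    divergences = []
    for args in inputs:
        v1, v2 = observe(fn_v1, *args), observe(fn_v2, *args)
        if v1 != v2:
            divergences.append({"input": args, "v1": v1, "v2": v2})
    return divergences

# Example: v2 tightens input parsing, introducing a behavioral change.
def parse_v1(s):
    return int(s, 0)   # accepts hex literals like "0x10"

def parse_v2(s):
    return int(s)      # raises ValueError on "0x10"

report = compare_versions(parse_v1, parse_v2, [("42",), ("0x10",)])
# report contains one divergence: "0x10" parses in v1 but errors in v2
```

A real comparator would additionally capture stdout, side effects, and test-suite results, but the core idea is the same: normalize each version's behavior into a comparable record, then diff.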

Install with Tessl CLI

npx tessl i github:ArabelaTso/Skills-4-SE --skill multi-version-behavior-comparator

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a well-crafted skill description that excels across all dimensions. It provides specific concrete actions, includes natural trigger terms developers would use, explicitly states both what the skill does and when to use it with a clear 'Use when...' clause, and carves out a distinct niche around version comparison that minimizes conflict risk with other skills.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: 'analyze how functionality changes', 'identify regressions', 'compare outputs and exceptions', 'validate upgrades', 'compares execution behavior, test results, outputs, exceptions, and observable states', 'generating detailed reports showing behavioral divergences'. | 3 / 3 |
| Completeness | Clearly answers both what (compares behavior across versions, generates reports on divergences) AND when ('Use when you need to analyze how functionality changes between versions, identify regressions, compare outputs and exceptions, or validate upgrades'). | 3 / 3 |
| Trigger Term Quality | Includes natural keywords users would say: 'versions', 'regressions', 'upgrades', 'compare', 'test results', 'outputs', 'exceptions', 'repositories'. These are terms developers naturally use when discussing version comparison tasks. | 3 / 3 |
| Distinctiveness / Conflict Risk | Clear niche focused specifically on cross-version behavioral comparison with distinct triggers like 'versions', 'regressions', 'upgrades'. Unlikely to conflict with general code analysis or testing skills due to the specific version comparison focus. | 3 / 3 |
| **Total** | | 12 / 12 |

Passed

Implementation

52%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill is concise and well-structured with clear CLI examples, but it lacks actual implementation details: the referenced scripts don't exist and aren't explained. The workflow is underdeveloped, missing validation steps for interpreting comparison results and deciding on actions, which is critical for a tool meant to guide safe upgrades.

Suggestions

- Provide the actual implementation of scripts/compare.py, or explain how to create it, rather than referencing a non-existent tool.
- Add a workflow section with explicit steps (run comparison → review divergences → validate regressions → decide on action), with checkpoints for each phase.
- Include an example comparison report output so users understand what to expect and how to interpret results.
- Add references to detailed documentation for advanced usage patterns (e.g., custom comparison rules, language-specific configurations).
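To make the first three suggestions concrete, a minimal core for the referenced scripts/compare.py could look like the sketch below. It compares per-test outcomes between two versions and groups tests into regressions, fixes, and added/removed tests; the function and field names are assumptions, since the actual script is not provided:

```python
def diff_test_results(results_v1, results_v2):
    """Compare per-test outcomes ('pass'/'fail') between two versions.

    Returns a report grouping tests into regressions (pass -> fail),
    fixes (fail -> pass), and tests added or removed between versions.
    """
    v1, v2 = set(results_v1), set(results_v2)
    shared = v1 & v2
    return {
        "regressions": sorted(t for t in shared
                              if results_v1[t] == "pass" and results_v2[t] == "fail"),
        "fixes": sorted(t for t in shared
                        if results_v1[t] == "fail" and results_v2[t] == "pass"),
        "added": sorted(v2 - v1),
        "removed": sorted(v1 - v2),
    }

# Hypothetical example of the report a user might expect:
report = diff_test_results(
    {"test_login": "pass", "test_parse": "pass", "test_legacy": "fail"},
    {"test_login": "pass", "test_parse": "fail", "test_export": "pass"},
)
# report["regressions"] == ["test_parse"]
# report["added"] == ["test_export"]; report["removed"] == ["test_legacy"]
```

In practice the outcome dictionaries would be collected by running each version's test suite (e.g., via a test runner's machine-readable output) before diffing, and the report would be rendered with per-category explanations so users know which divergences require validation.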

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is lean and efficient, with no unnecessary explanations of concepts Claude would already know. Every section serves a clear purpose without padding. | 3 / 3 |
| Actionability | Provides concrete CLI commands that are copy-paste ready, but the commands reference scripts (scripts/compare.py) that aren't provided or explained. No actual implementation or executable code is given, just invocation patterns for a hypothetical tool. | 2 / 3 |
| Workflow Clarity | No multi-step workflow with validation checkpoints is provided. The skill describes what to compare and shows CLI usage, but lacks a clear process for interpreting results, handling failures, or validating findings before acting on them. | 1 / 3 |
| Progressive Disclosure | Content is well-organized with clear sections, but there are no references to additional documentation for advanced features, implementation details, or examples. For a skill of this complexity, external references would improve navigation. | 2 / 3 |
| **Total** | | 8 / 12 |

Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 passed

Validation for skill structure

No warnings or errors.
