best-of

This skill should be used when the user asks to "compare code", "compare worktrees", "compare solutions", "which solution is better", "compare branches", "best of", "diff worktrees", "evaluate solutions", "pick the better implementation", "compare implementations", "review both solutions", or wants a structured, criteria-driven comparison of code across two git worktrees. Also triggered by the /best-of command.

68

Quality

83%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Passed

No known issues

Quality

Discovery

89%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description excels at trigger term coverage and completeness, providing an extensive list of natural phrases that would activate the skill and clearly stating both what it does and when to use it. Its main weakness is that it focuses heavily on trigger terms at the expense of describing the specific actions and outputs of the skill (e.g., what does the comparison produce?). The description could also be more concise by summarizing the trigger terms rather than exhaustively listing them.

Suggestions

Add specific concrete actions describing what the skill produces, e.g., 'Performs structured, criteria-driven comparison of code across two git worktrees, generating scored evaluations across dimensions like correctness, performance, and readability.'

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | The description mentions comparing code across git worktrees and criteria-driven comparison, but it doesn't list specific concrete actions beyond 'compare' and 'evaluate'. It lacks detail on what the comparison produces (e.g., generates a report, scores implementations, produces a summary table). | 2 / 3 |
| Completeness | The description explicitly answers both 'what' (structured, criteria-driven comparison of code across two git worktrees) and 'when' (with a comprehensive list of trigger phrases and the /best-of command). The 'Use when' guidance is clearly present via 'This skill should be used when...'. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms: 'compare code', 'compare worktrees', 'which solution is better', 'compare branches', 'best of', 'diff worktrees', 'evaluate solutions', 'pick the better implementation', 'compare implementations', 'review both solutions', and the /best-of command. These are terms users would naturally say. | 3 / 3 |
| Distinctiveness Conflict Risk | The skill is clearly scoped to comparing code across two git worktrees, which is a distinct niche. The combination of 'worktrees', 'compare implementations', and 'criteria-driven comparison' makes it unlikely to conflict with general code review or diff skills. | 3 / 3 |
| Total | Passed | 11 / 12 |

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-crafted, highly actionable skill with excellent workflow clarity and a sophisticated multi-agent comparison framework. Its main weaknesses are moderate verbosity (the three full agent prompts inline add significant length) and a referenced bundle file that doesn't actually exist. The scoring methodology with weighted criteria and structured output template is a strong design choice.

Suggestions

Extract the three agent prompt templates into a bundled reference file (e.g., `references/agent-prompts.md`) to reduce inline verbosity and improve progressive disclosure.

Provide the referenced `references/evaluation-criteria.md` bundle file, or remove the reference to avoid a phantom reference that could confuse Claude at runtime.
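The phantom-reference problem named in the second suggestion can be caught mechanically before publishing. The following is a minimal sketch, not part of the reviewed skill or of Tessl's tooling: it assumes bundle references appear in `SKILL.md` as backtick-quoted relative paths, and the directory prefixes it scans for (`references`, `scripts`, `assets`) and the function name `find_phantom_references` are hypothetical choices made for illustration.

```python
import re
from pathlib import Path

# Assumption: bundle files are cited as backtick-quoted relative paths under
# one of these directories. The prefixes are illustrative, not a fixed spec.
REF_PATTERN = re.compile(r"`((?:references|scripts|assets)/[^`\s]+)`")


def find_phantom_references(skill_dir):
    """Return paths cited in SKILL.md that do not exist under skill_dir."""
    skill_dir = Path(skill_dir)
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    # De-duplicate and sort so the report is stable across runs.
    referenced = sorted(set(REF_PATTERN.findall(text)))
    return [ref for ref in referenced if not (skill_dir / ref).is_file()]
```

Run against a skill directory, a non-empty result lists every cited file that still needs to be added to the bundle or removed from the text.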

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is thorough but verbose in places: the agent prompts are fully spelled out inline (could be referenced), and some instructions like 'Extract the two worktree paths from the arguments' are obvious to Claude. However, most content is substantive and earns its place given the complexity of the task. | 2 / 3 |
| Actionability | Highly actionable with concrete bash commands, specific file patterns to glob, exact agent prompts, a detailed scoring table with explicit weights, and a complete output template. Nearly every step is copy-paste executable. | 3 / 3 |
| Workflow Clarity | The 6-step workflow is clearly sequenced with explicit validation (Step 1 validates worktrees and stops on error, Step 2 builds a checklist before analysis, Step 3 has three strategies with clear selection criteria, Step 5 has explicit scoring before verdict). The feedback loop of asking the user for missing paths and the truncation warning in Step 3 show good error recovery design. | 3 / 3 |
| Progressive Disclosure | References `references/evaluation-criteria.md`, which is appropriate, but no bundle file is actually provided to support it, so this is a phantom reference in practice. The agent prompts (which are lengthy) could be extracted to a reference file. The inline content is well-structured with headers but the skill is quite long and monolithic. | 2 / 3 |
| Total | Passed | 10 / 12 |

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | Passed | 10 / 11 |
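A check like `frontmatter_unknown_keys` can be approximated locally. The sketch below is an illustration only, not the validator used here: the set of recognized keys in `ASSUMED_KNOWN_KEYS` is an assumption (the review does not say which keys the spec allows or which key triggered the warning), and the function name is hypothetical. It reads only top-level keys from the leading `---` block and does not attempt full YAML parsing.

```python
import re

# Assumption: an illustrative allow-list, not the validator's actual spec.
ASSUMED_KNOWN_KEYS = frozenset(
    {"name", "description", "license", "allowed-tools", "metadata"}
)

FRONTMATTER = re.compile(r"\A---\r?\n(.*?)\r?\n---[ \t]*(?:\r?\n|\Z)", re.DOTALL)
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*)\s*:", re.MULTILINE)


def unknown_frontmatter_keys(skill_md_text, known_keys=ASSUMED_KNOWN_KEYS):
    """Return top-level frontmatter keys that are not in known_keys."""
    match = FRONTMATTER.match(skill_md_text)
    if match is None:
        return []  # No frontmatter block at the top of the file.
    # Indented (nested) keys are skipped because the pattern anchors at column 0.
    keys = TOP_LEVEL_KEY.findall(match.group(1))
    return [key for key in keys if key not in known_keys]
```

Passing a different `known_keys` set adapts the check to whatever the target spec actually permits; keys flagged this way can then be removed or moved under `metadata`, as the warning suggests.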

Repository
pmatos/skills
Reviewed
