uinaf/review-gang

Review existing code, diffs, branches, or pull requests by spawning mandatory concern-specific reviewer subagents, then synthesize a ship-it / needs-review / blocked verdict.

87

1.19x
Quality

90%

Does it follow best practices?

Impact

81%

1.19x

Average score across 4 eval scenarios

Security by Snyk

Passed

No known issues

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, actionable skill that clearly defines a multi-agent code review workflow with concrete steps, prompt templates, and a precise output format. Its main strengths are the executable specificity of the subagent spawning instructions and the clear verdict synthesis process. Minor weaknesses include some redundancy between the Contract and Workflow sections, and the inability to verify referenced bundle files.

Suggestions

Reduce redundancy between the Contract section and the Workflow steps. For example, stale evidence handling and subagent discovery are mentioned in both places; consolidate each into one location.

Provide the referenced bundle files (references/reviewing.md, references/reviewer-selection.md) or note their absence, as the skill depends heavily on them for reviewer persona definitions and evidence handling details.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is mostly efficient and avoids explaining concepts Claude already knows, but there's some redundancy: the contract section and workflow sections overlap in places (e.g., stale evidence handling is mentioned in both Contract and Workflow step 3). The Codex-specific paragraph in step 2 adds environment-specific detail that could be trimmed or moved to a reference file. | 2 / 3 |
| Actionability | Provides concrete prompt templates for subagent spawning, specific git commands for scoping changes, exact output format with labeled fields, and a worked example. The guidance is specific enough to be directly executable: Claude knows exactly what to spawn, what to run, and how to format the output. | 3 / 3 |
| Workflow Clarity | The four-step workflow (scope → spawn → collect evidence → synthesize) is clearly sequenced with explicit validation checkpoints: refresh source of truth before judging, run runtime checks when they change the verdict, mark unverified surfaces explicitly, and block when missing context prevents an honest verdict. The feedback loop of 'if unverified, say so and adjust verdict' is well-defined. | 3 / 3 |
| Progressive Disclosure | References to reviewer-selection.md and reviewing.md are well-signaled and one level deep, which is good. However, no bundle files were provided, so we cannot verify these references actually exist. The SKILL.md itself is moderately long (~100 lines of substantive content) and some sections like the Contract could potentially be moved to a reference file to keep the main skill leaner. | 2 / 3 |
| Total | | 10 / 12 |

Passed
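
The scope → spawn → collect evidence → synthesize flow described under Workflow Clarity can be illustrated with a minimal sketch of the final synthesis step. This is not the skill's implementation, which is not shown on this page. The `Finding` type, the `synthesize` function, and the precedence rules (a blocking finding or missing context yields `blocked`; an unverified surface downgrades to `needs-review`) are assumptions made for illustration. Only the three verdict names come from the skill description.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    SHIP_IT = "ship-it"
    NEEDS_REVIEW = "needs-review"
    BLOCKED = "blocked"


@dataclass
class Finding:
    """One reviewer subagent's report (hypothetical shape, not from the skill)."""
    concern: str                   # e.g. "security" or "tests"
    blocking: bool                 # reviewer found a must-fix issue
    verified: bool                 # evidence was actually checked, not assumed
    missing_context: bool = False  # reviewer could not judge honestly


def synthesize(findings: list[Finding]) -> Verdict:
    """Combine reviewer findings into one verdict under the assumed rules."""
    if not findings:
        # No reviewer ran, so no honest verdict is possible.
        return Verdict.BLOCKED
    if any(f.blocking or f.missing_context for f in findings):
        return Verdict.BLOCKED
    if any(not f.verified for f in findings):
        # Unverified surfaces are reported and downgrade the verdict.
        return Verdict.NEEDS_REVIEW
    return Verdict.SHIP_IT


if __name__ == "__main__":
    reports = [
        Finding(concern="security", blocking=False, verified=True),
        Finding(concern="tests", blocking=False, verified=False),
    ]
    print(synthesize(reports).value)  # prints "needs-review"
```

Checking for blocking findings before unverified ones reflects one reading of the reviewed workflow, in which missing context blocks and unverified surfaces only adjust the verdict. A real skill may weigh these differently.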

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly articulates what the skill does (audit code via reviewer subagents to produce a ship verdict), when to use it (PR triage, evaluating others' changes, post-runtime follow-up), and when NOT to use it (self-checking authored changes). The description is specific, uses natural trigger terms, and carves out a distinct niche that would be hard to confuse with other skills.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: audit code/diffs/branches/PRs, spawn concern-specific reviewer subagents, synthesize evidence into a ship decision, produce a verdict (ship it / needs review / blocked). Very concrete and actionable. | 3 / 3 |
| Completeness | Clearly answers both what ('audit existing code, diffs, branches, or pull requests by spawning reviewer subagents, synthesizing into a ship decision') and when ('Use when triaging PR risk, deciding whether someone else's change is safe to ship, or following up after runtime proof'). Also includes a 'Do not use' clause which adds further clarity on boundaries. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'audit', 'code', 'diffs', 'branches', 'pull requests', 'PR risk', 'ship', 'safe to ship', 'review'. Good coverage of terms a developer would naturally use when requesting a code review. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with its focus on independent code auditing via subagents producing ship/needs-review/blocked verdicts. The explicit exclusion ('Do not use to self-check a change you just authored') further sharpens its niche and reduces conflict with general code review or linting skills. | 3 / 3 |
| Total | | 12 / 12 |

Passed
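
The rubric totals in the Content and Description tables are plain sums of four dimensions scored out of 3 each. The sketch below reproduces that arithmetic from the scores listed on this page. The `tally` helper is a hypothetical name, and the page does not state how these totals map to the percentages it displays (for example 77% for Content), so the sketch does not attempt that conversion.

```python
MAX_PER_DIMENSION = 3  # every dimension on this page is scored out of 3

# Scores copied from the two tables on this page.
content_scores = {
    "Conciseness": 2,
    "Actionability": 3,
    "Workflow Clarity": 3,
    "Progressive Disclosure": 2,
}
description_scores = {
    "Specificity": 3,
    "Completeness": 3,
    "Trigger Term Quality": 3,
    "Distinctiveness / Conflict Risk": 3,
}


def tally(scores: dict[str, int]) -> tuple[int, int]:
    """Return (points earned, points possible) for one rubric table."""
    return sum(scores.values()), MAX_PER_DIMENSION * len(scores)


if __name__ == "__main__":
    for name, scores in [("Content", content_scores),
                         ("Description", description_scores)]:
        earned, possible = tally(scores)
        print(f"{name}: {earned} / {possible}")
    # prints "Content: 10 / 12" then "Description: 12 / 12"
```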

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Reviewed
