Review existing code, diffs, branches, or pull requests by spawning mandatory concern-specific reviewer subagents, then synthesize a ship-it / needs-review / blocked verdict.
92
97%
Does it follow best practices?
Impact
81%
1.22x
Average score across 4 eval scenarios
Passed
No known issues
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly articulates what the skill does (audit code/PRs via subagents, produce a verdict), when to use it (PR triage, evaluating others' changes), and when not to use it (self-review). It uses natural trigger terms, provides specific output formats, and is highly distinctive from other potential skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: audit existing code/diffs/branches/PRs, spawn concern-specific reviewer subagents, synthesize evidence into a ship decision, and produce a specific verdict (ship it / needs review / blocked). | 3 / 3 |
| Completeness | Clearly answers both what ("audit existing code, diffs, branches, or pull requests by spawning reviewer subagents, synthesizing evidence into a ship decision") and when ("Use when triaging PR risk, deciding whether someone else's change is safe to ship, or following up after runtime proof"). Also includes a "Do not use" clause for additional clarity. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: "audit", "code", "diffs", "branches", "pull requests", "PR risk", "safe to ship", "review". These cover common variations of how users would describe code review tasks. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive with a clear niche: independent code review with subagent-based architecture producing a specific verdict format. The explicit exclusion ("Do not use to self-check a change you just authored") further sharpens its boundary against potential overlapping skills like linting or self-review. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
92%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-crafted skill that efficiently communicates a complex multi-agent review workflow. Its strengths are strong actionability (concrete prompt templates, git commands, exact output format with example), excellent conciseness (no wasted tokens explaining concepts Claude knows), and clear workflow sequencing with validation checkpoints. The main weakness is that referenced files (reviewer-selection.md, reviewing.md, persona files) cannot be verified since no bundle was provided, and the persona file paths are only implicitly referenced rather than explicitly listed.
Suggestions
Add the individual reviewer persona file paths (e.g., reviewers/general.md, reviewers/tests.md, reviewers/silent-failures.md) to the References section so all referenced files are discoverable from SKILL.md
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is lean and efficient throughout. It assumes Claude's competence with git, subagents, and code review concepts without explaining them. Every section earns its place: no filler, no unnecessary context about what code review is or why it matters. | 3 / 3 |
| Actionability | Provides concrete prompt templates for subagent spawning, specific git commands for scoping changes, exact output format with labeled fields, and a complete worked example. The workflow steps are specific and executable rather than abstract. | 3 / 3 |
| Workflow Clarity | The four-step workflow (scope → spawn → collect → synthesize) is clearly sequenced with explicit validation checkpoints: confirming base/head is current, treating stale artifacts as unverified, requiring evidence refresh before judging, and including feedback loops for unverified surfaces. The blocked verdict serves as a safety valve when context is missing. | 3 / 3 |
| Progressive Disclosure | References to reviewer-selection.md and reviewing.md are well-signaled and one level deep, which is good. However, no bundle files were provided, so we cannot verify these references exist. The individual reviewer persona files (reviewers/<persona>.md) are referenced in the subagent prompt template but not listed in a references section, making navigation less clear. | 2 / 3 |
| Total | | 11 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.