Use this skill to review code. It supports both local changes (staged or working tree) and remote Pull Requests (by ID or URL). It focuses on correctness, maintainability, and adherence to project standards.
71
56%: Does it follow best practices?
Impact: 99% (1.54x average score across 3 eval scenarios)
Advisory: Suggest reviewing before use
Optimize this skill with Tessl: `npx tessl skill review --optimize ./.gemini/skills/code-reviewer/SKILL.md`

Quality
Discovery: 50%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description communicates the core purpose (code review) and distinguishes between local and remote review modes, which is helpful. However, it lacks an explicit 'Use when...' clause with natural trigger terms, and the review actions described are high-level rather than concrete. It uses second person ('Use this skill') which slightly detracts from the preferred third-person voice.
Suggestions

- Add an explicit 'Use when...' clause with natural trigger terms like 'review my code', 'PR review', 'check my changes', 'review pull request', 'code feedback', 'diff review'.
- List more concrete review actions such as 'identifies bugs, suggests refactoring, checks for security issues, validates naming conventions, and flags performance concerns'.
- Rewrite in third-person voice (e.g., 'Reviews code for correctness...' instead of 'Use this skill to review code').
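Taken together, these suggestions point toward a rewritten frontmatter description along these lines (a hypothetical sketch only; the `name` value and exact wording are illustrative, not the maintainer's):

```yaml
# Illustrative SKILL.md frontmatter; name and wording are hypothetical
name: code-reviewer
description: >-
  Reviews code for correctness, maintainability, and adherence to project
  standards. Identifies bugs, suggests refactoring, checks for security
  issues, validates naming conventions, and flags performance concerns.
  Works on local changes (staged or working tree) and on remote Pull
  Requests by ID or URL. Use when the user says 'review my code',
  'PR review', 'check my changes', 'review pull request', 'code feedback',
  or 'diff review'.
```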
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (code review) and mentions some specific modes (local changes, staged/working tree, remote Pull Requests by ID or URL), but doesn't list concrete review actions beyond general focus areas like 'correctness, maintainability, and adherence to project standards'. | 2 / 3 |
| Completeness | The 'what' is partially addressed (review code for correctness, maintainability, standards) and there's an implicit 'when' via 'Use this skill to review code', but there's no explicit 'Use when...' clause with trigger scenarios or terms that would guide Claude's selection. | 2 / 3 |
| Trigger Term Quality | Includes some natural keywords like 'review code', 'Pull Requests', 'staged', and 'PR' (implied via 'Pull Requests'), but misses common user variations like 'PR review', 'code review', 'diff', 'CR', 'merge request', or file-type triggers. | 2 / 3 |
| Distinctiveness / Conflict Risk | Code review is a reasonably specific niche, and the mention of Pull Requests and local changes helps distinguish it, but 'correctness, maintainability, and adherence to project standards' is generic enough to overlap with linting, static analysis, or general coding assistance skills. | 2 / 3 |
| Total | | 8 / 12 (Passed) |
Implementation: 62%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a competent code review skill with a well-structured workflow and clear branching between remote and local reviews. Its main weaknesses are that it explains review concepts Claude already understands (correctness, maintainability, etc.) and lacks concrete examples of good review feedback output. The actionable git and GitHub CLI commands are helpful, but the core analysis section reads more like a checklist of abstract qualities than executable guidance.
Suggestions

- Replace the verbose analysis-pillar descriptions with a concise checklist — Claude already knows what correctness, maintainability, and security mean. Just list the categories without definitions.
- Add a concrete example of expected review output (e.g., a sample finding with file path, line reference, severity, and explanation) to make the feedback format actionable rather than abstract.
- Include a specific example of a critical finding vs. an improvement vs. a nitpick to calibrate the severity classification.
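As a sketch of what such a calibrated output example could look like (every file path, line number, and finding below is invented for illustration):

```markdown
### Review findings

- **Critical**: `src/auth/session.ts:42`. Session tokens are compared with
  `==`, which can leak timing information. Use a constant-time comparison.
- **Improvement**: `src/utils/retry.ts:17`. The backoff table is rebuilt on
  every call. Hoist it to module scope to avoid repeated allocation.
- **Nitpick**: `src/cli.ts:88`. `parsedArgs` is never reassigned. Prefer
  `const` over `let`.
```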
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill includes some unnecessary elaboration (e.g., explaining what correctness, maintainability, readability mean — Claude already knows these concepts). The analysis pillars section could be significantly condensed. However, it's not egregiously verbose. | 2 / 3 |
| Actionability | The skill provides some concrete commands (gh pr checkout, git diff, npm run preflight) but the core review analysis section is descriptive rather than instructive — it lists abstract qualities to check without concrete examples of what good feedback looks like or specific patterns to flag. | 2 / 3 |
| Workflow Clarity | The workflow is clearly sequenced with numbered steps, branching paths for remote vs. local reviews, and a structured feedback format. The preflight step serves as a validation checkpoint before deeper analysis, and the feedback structure provides a clear output template. | 3 / 3 |
| Progressive Disclosure | The content is reasonably well-organized with clear sections, but the analysis pillars section is quite long and could be extracted to a separate reference file. For a skill of this length (~80 lines of content), some content would benefit from being split out, particularly the detailed review criteria. | 2 / 3 |
| Total | | 9 / 12 (Passed) |
Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.