Comprehensive pull request review using specialized agents
Quality: 38%. Does it follow best practices?
Impact: Pending. No eval scenarios have been run.
Advisory: Suggest reviewing before use.

Optimize this skill with Tessl:
`npx tessl skill review --optimize ./plugins/review/skills/review-pr/SKILL.md`

Quality
Discovery
22%. Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description is too vague and lacks the detail needed for effective skill selection. It mentions the domain (pull request review) but fails to enumerate specific capabilities, omits a 'Use when...' clause, and uses filler words like 'comprehensive' and 'specialized agents' that don't aid in skill matching. It would be difficult for Claude to confidently select this skill over other code-review-related skills.
Suggestions
- List specific concrete actions the skill performs, e.g., 'Reviews code changes for bugs, security issues, style violations, and test coverage in pull requests.'
- Add an explicit 'Use when...' clause with natural trigger terms like 'PR review', 'code review', 'review my pull request', 'merge request', 'diff review'.
- Remove vague filler like 'comprehensive' and 'specialized agents' and replace with actionable details about what the review covers and what output it produces.
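Taken together, the suggestions point toward a rewritten description. A sketch of what the SKILL.md frontmatter could look like (the wording is illustrative, not taken from the skill):

```yaml
---
name: review-pr
description: >
  Reviews pull request changes for bugs, security issues, style violations,
  and missing test coverage, then posts inline comments and a summary review
  via the GitHub API. Use when the user asks for a PR review, code review,
  merge request review, or diff review.
---
```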
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague language like 'comprehensive' and 'specialized agents' without listing any concrete actions. It doesn't specify what aspects of PR review are performed (e.g., checking code style, security issues, test coverage, providing inline comments). | 1 / 3 |
| Completeness | The 'what' is vaguely stated as 'pull request review' without specifics, and there is no 'when' clause or explicit trigger guidance at all. The missing 'Use when...' clause would cap this at 2 regardless, but the weak 'what' brings it to 1. | 1 / 3 |
| Trigger Term Quality | 'Pull request review' is a natural term users would say, but it's missing common variations like 'PR review', 'code review', 'review my PR', 'diff review', or 'merge request'. 'Specialized agents' is not a term users would naturally use. | 2 / 3 |
| Distinctiveness / Conflict Risk | The mention of 'pull request review' provides some specificity to a domain, but 'comprehensive' and 'specialized agents' are generic enough that it could overlap with other code review or PR-related skills without clear differentiation. | 2 / 3 |
| Total | | 6 / 12 Passed |
Implementation
55%. Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill provides an exceptionally detailed and actionable PR review workflow with clear phasing, validation checkpoints, and concrete commands/APIs. However, it is severely over-long and monolithic—much of the content (templates, scoring tables, false positive lists, repeated API instructions) could be condensed or split into referenced files. The verbosity undermines token efficiency significantly despite the high quality of the actual instructions.
Suggestions
- Extract the comment templates, false positive examples, and impact/confidence scoring tables into separate referenced files (e.g., TEMPLATES.md, SCORING.md) to reduce the main skill to an overview with clear navigation links.
- Remove redundant explanations. The inline comment posting approach is described in both Phase 3 step 4 and again in the 'Template for inline comments using GitHub API' section; consolidate into one location.
- Cut the example security/bug issue templates to just one example or move them to a separate file; Claude doesn't need to be taught what SQL injection or null pointer dereference looks like.
- Remove general advice Claude already knows (e.g., 'Use numbers, not words like some, many, few', 'Security First') to tighten the content and respect token budget.
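As a sketch, the suggested split could look like the layout below (TEMPLATES.md and SCORING.md come from the suggestions above; the remaining file name is hypothetical):

```text
review-pr/
├── SKILL.md             # overview + three-phase workflow, links to files below
├── TEMPLATES.md         # inline comment and summary review templates
├── SCORING.md           # impact/confidence rubrics and filtering thresholds
└── FALSE_POSITIVES.md   # examples of issues to suppress
```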
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~300+ lines. It over-explains concepts Claude already knows (what false positives are, how to use gh CLI, what SQL injection is), includes lengthy example templates that could be condensed, and repeats instructions (e.g., the inline comment posting approach is explained multiple times in different sections). The impact/confidence scoring tables and false positive examples add significant bulk. | 1 / 3 |
| Actionability | The skill provides highly concrete, executable guidance: specific git commands, exact API endpoints (`gh api repos/{owner}/{repo}/pulls/{pr_number}/reviews`), detailed agent prompts, scoring rubrics with numeric thresholds, and complete markdown templates with code suggestion syntax. The workflow is copy-paste actionable. | 3 / 3 |
| Workflow Clarity | The three-phase workflow (Preparation → Searching for Issues → Confidence & Impact Scoring) is clearly sequenced with explicit validation checkpoints: eligibility check at start, re-eligibility check before posting, confidence/impact filtering thresholds, and a progressive threshold table for filtering false positives. The feedback loop of score → filter → post is well-defined. | 3 / 3 |
| Progressive Disclosure | The entire skill is a monolithic wall of text with no references to external files. All templates, examples, scoring rubrics, agent definitions, and API instructions are inlined. Content like the comment templates, false positive examples, and impact scoring tables could be split into separate reference files to reduce cognitive load. | 1 / 3 |
| Total | | 8 / 12 Passed |
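The `gh api` reviews endpoint quoted in the table takes a JSON payload. A minimal sketch of assembling one (the helper name and field values are illustrative; `event`, `body`, and the `comments` array with `path`, `line`, `side`, and `body` are fields GitHub's create-review endpoint accepts):

```python
import json

def build_review_payload(summary, comments):
    """Assemble the JSON body for GitHub's
    POST /repos/{owner}/{repo}/pulls/{pr_number}/reviews endpoint."""
    return {
        "event": "COMMENT",  # comment only; don't approve or request changes
        "body": summary,
        "comments": [
            {"path": c["path"], "line": c["line"], "side": "RIGHT", "body": c["body"]}
            for c in comments
        ],
    }

payload = build_review_payload(
    "Automated review: 1 issue found.",
    [{"path": "src/app.py", "line": 10, "body": "Possible null dereference."}],
)
print(json.dumps(payload, indent=2))
```

Written to a file, the payload could then be posted with `gh api repos/{owner}/{repo}/pulls/{pr_number}/reviews --method POST --input payload.json` (an authenticated `gh` CLI is assumed).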
Validation
90%. Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
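The single warning concerns unknown frontmatter keys. A sketch of the fix (the offending key names are hypothetical, since the report doesn't say which keys triggered the warning):

```yaml
---
name: review-pr
description: Comprehensive pull request review using specialized agents
# Unknown top-level keys (e.g. a custom `author` or `version`) trigger the
# warning; nesting them under `metadata` keeps them without failing validation.
metadata:
  author: example-maintainer   # hypothetical key, moved under `metadata`
  version: "1.0"               # hypothetical key, moved under `metadata`
---
```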