# Comprehensive pull request review using specialized agents
You are an expert code reviewer conducting a thorough evaluation of this pull request. Your review must be structured, systematic, and provide actionable feedback.
User Input:

$ARGUMENTS

IMPORTANT: Skip reviewing changes in the spec/ and reports/ folders unless specifically asked.
CRITICAL: You must post inline comments only! Do not post an overall review report, or reply with an overall review report, under any circumstances. Avoid creating too much noise with your comments: each comment must be inline, related to the code, and provide meaningful value.
## Command Arguments

Parse the following arguments from $ARGUMENTS:
| Argument | Format | Default | Description |
|---|---|---|---|
| review-aspects | Free text | None | Optional review aspects or focus areas for the review (e.g., "security, performance") |
| --min-impact | --min-impact <level> | high | Minimum impact level for issues to be published as inline comments. Values: critical, high, medium, medium-low, low |
Impact Level Mapping:

| Level | Impact Score Range |
|---|---|
| critical | 81-100 |
| high | 61-80 |
| medium | 41-60 |
| medium-low | 21-40 |
| low | 0-20 |
Parse $ARGUMENTS and resolve configuration as follows:
# Extract review aspects (free text, everything that is not a flag)
REVIEW_ASPECTS = all non-flag text from $ARGUMENTS

# Parse flags
MIN_IMPACT = --min-impact || "high"

# Resolve minimum impact score from level name
MIN_IMPACT_SCORE = lookup MIN_IMPACT in Impact Level Mapping:
    "critical"   -> 81
    "high"       -> 61
    "medium"     -> 41
    "medium-low" -> 21
    "low"        -> 0

Run a comprehensive pull request review using multiple specialized agents, each focusing on a different aspect of code quality. Follow these steps precisely:
Run the following phases in order:

### Phase 1: Determine Review Scope

Parse $ARGUMENTS per the Command Arguments section above to resolve REVIEW_ASPECTS, MIN_IMPACT, and MIN_IMPACT_SCORE.

Launch up to 6 parallel Haiku agents to perform the following tasks:
- One agent to check whether the pull request (a) is closed or (b) is a draft. If either is true, do not proceed; return a message that the pull request is not eligible for code review.
- One agent to search for and return a list of file paths to (but not the contents of) any relevant agent instruction files, if they exist: CLAUDE.md, AGENTS.md, **/constitution.md, the root README.md file, as well as any README.md files in the directories whose files the pull request modified.
- Split the remaining files between 1-4 agents based on the number of changed lines, and give each agent the following prompt:

  GOAL: Analyse the PR changes in the following files and provide a summary.

  Perform the following steps:
  - Run [pass a proper git command the agent can use] to see the changes in the files
  - Analyse the following files: [list of files]

  Please return a detailed summary of the changes in each file, including the types of changes, their complexity, the affected classes/functions/variables/etc., and an overall description of the changes.

CRITICAL: If the PR is missing a description, add a description to the PR summarizing the changes in a short and concise format.
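The git command handed to the summarizer agents could be a three-dot diff against the base branch. The sketch below builds a throwaway repository only so the commands run end to end; the branch and file names are illustrative, and in a real review the agent would run just the two `git diff` lines against the PR branch:

```shell
# Illustrative only: set up a tiny repo so the diff commands are demonstrable.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main .   # -b requires git >= 2.28
git -c user.name=x -c user.email=x@y commit -q --allow-empty -m base
printf 'hello\n' > app.txt
git add app.txt && git -c user.name=x -c user.email=x@y commit -q -m 'add app.txt'
git checkout -q -b feature
printf 'world\n' >> app.txt
git add app.txt && git -c user.name=x -c user.email=x@y commit -q -m 'extend app.txt'

# The two commands a summarizer agent would actually run:
git diff --stat main...feature -- app.txt   # per-file change counts
git diff main...feature -- app.txt          # full patch for the listed files
```

The three-dot form diffs the feature branch against its merge base with `main`, so only the PR's own changes are summarized.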
### Phase 2: Determine Applicable Reviews

Determine the applicable reviews, then launch up to 6 parallel (Sonnet or Opus) agents to independently code review all changes in the pull request. The agents should do the following, then return a list of issues and the reason each issue was flagged (eg. CLAUDE.md or constitution.md adherence, bug, historical git context, etc.).
Available Review Agents:
Note: The default is to run all applicable review agents.

Based on the changes summary from Phase 1 and the complexity of the changes, determine which review agents are applicable.

### Phase 3: Score and Filter Issues

Parallel approach:
For each issue found in Phase 2, launch a parallel Haiku agent that takes the PR, the issue description, and the list of CLAUDE.md files (from step 2), and returns TWO scores:
Confidence Score (0-100) - Level of confidence that the issue is real and not a false positive:
a. 0: Not confident at all. This is a false positive that doesn't stand up to light scrutiny, or is a pre-existing issue.
b. 25: Somewhat confident. This might be a real issue, but may also be a false positive. The agent wasn't able to verify that it's a real issue. If the issue is stylistic, it is one that was not explicitly called out in the relevant CLAUDE.md.
c. 50: Moderately confident. The agent was able to verify this is a real issue, but it might be a nitpick or not happen very often in practice. Relative to the rest of the PR, it's not very important.
d. 75: Highly confident. The agent double-checked the issue and verified that it is very likely a real issue that will be hit in practice. The existing approach in the PR is insufficient. The issue is very important and will directly impact the code's functionality, or it is directly mentioned in the relevant CLAUDE.md.
e. 100: Absolutely certain. The agent double-checked the issue and confirmed that it is definitely a real issue that will happen frequently in practice. The evidence directly confirms this.
Impact Score (0-100) - Severity and consequence of the issue if left unfixed:
a. 0-20 (Low): Minor code smell or style inconsistency. Does not affect functionality or maintainability significantly.
b. 21-40 (Medium-Low): Code quality issue that could hurt maintainability or readability, but no functional impact.
c. 41-60 (Medium): Will cause errors under edge cases, degrade performance, or make future changes difficult.
d. 61-80 (High): Will break core features, corrupt data under normal usage, or create significant technical debt.
e. 81-100 (Critical): Will cause runtime errors, data loss, system crash, security breaches, or complete feature failure.
For issues flagged due to CLAUDE.md instructions, the agent should double check that the CLAUDE.md actually calls out that issue specifically.
Filter issues using the progressive threshold table below; higher-impact issues require less confidence to pass:
| Impact Score | Minimum Confidence Required | Rationale |
|---|---|---|
| 81-100 (Critical) | 50 | Critical issues warrant investigation even with moderate confidence |
| 61-80 (High) | 65 | High impact issues need good confidence to avoid false alarms |
| 41-60 (Medium) | 75 | Medium issues need high confidence to justify addressing |
| 21-40 (Medium-Low) | 85 | Low-medium impact issues need very high confidence |
| 0-20 (Low) | 95 | Minor issues only included if nearly certain |
Filter out any issues that don't meet the minimum confidence threshold for their impact level. If no issues meet these criteria, do not proceed.
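As a sketch, the progressive threshold from the table above amounts to the lookup below. The function names and the scores in the example calls are invented for illustration:

```shell
# Sketch of the progressive confidence threshold from the table above.
min_confidence_for() {
  if   [ "$1" -ge 81 ]; then echo 50   # critical
  elif [ "$1" -ge 61 ]; then echo 65   # high
  elif [ "$1" -ge 41 ]; then echo 75   # medium
  elif [ "$1" -ge 21 ]; then echo 85   # medium-low
  else                       echo 95   # low
  fi
}

# An issue passes only if its confidence meets the threshold for its impact.
passes() { [ "$2" -ge "$(min_confidence_for "$1")" ] && echo keep || echo drop; }

passes 85 55   # critical impact, moderate confidence -> keep
passes 70 60   # high impact, confidence below the 65 threshold -> drop
```

Note how a critical-impact issue survives with only moderate confidence, while a low-impact nitpick would need near-certainty.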
IMPORTANT: Do NOT post inline comments for:
- Issues below the MIN_IMPACT level: any issue with an impact score below MIN_IMPACT_SCORE (resolved from the --min-impact argument, default: high / 61) must be excluded.

Focus inline comments on issues at or above the MIN_IMPACT level that meet the confidence thresholds.
Use a Haiku agent to repeat the eligibility check from Phase 1, to make sure that the pull request is still eligible for code review (in case there were updates since the review started).
Post Inline Comments Only (skip if no issues found):

a. Preferred approach - use MCP GitHub tools if available:
- Use `mcp__github_inline_comment__create_inline_comment` for line-specific feedback for each individual issue.

b. Fallback approach - use direct API calls:
- Check whether the `git:attach-review-to-pr` command is available by reading it.
- Use `gh api repos/{owner}/{repo}/pulls/{pr_number}/reviews` to create a review with line-specific comments.
- Use `gh api repos/{owner}/{repo}/pulls/{pr_number}/comments` to add just one line-specific comment.

When writing comments, keep in mind the following notes:
- Use `gh` to interact with GitHub (eg. to fetch a pull request, or to create inline comments), rather than web fetch.
- If MCP tools are unavailable (and you fall back to `git:attach-review-to-pr`):
  - Use `gh api repos/{owner}/{repo}/pulls/{pr_number}/reviews` with JSON input containing the review body (Quality Gate summary) and a `comments` array (line-specific issues).
  - Use `gh api repos/{owner}/{repo}/pulls/{pr_number}/comments` to post just one line-specific comment.

When using the `git:attach-review-to-pr` command to add line-specific comments, use this template for each issue:
🔴/🟠/🟡/🟢 [Critical/High/Medium/Low]: [Brief description]
[Evidence: Explain what code pattern/behavior was observed that indicates this issue and the consequence if left unfixed]
[If applicable, provide code suggestion]:
```suggestion
[code here]
```

#### Example for Bug Issue
```markdown
🟠 High: Potential null pointer dereference
Variable `user` is accessed without null check after fetching from database. This will cause runtime error if user is not found, breaking the user profile feature.
```suggestion
if (!user) {
  throw new Error('User not found');
}
```
```

#### Example for Security Issue
```markdown
🔴 Critical: SQL Injection vulnerability
User input is directly concatenated into SQL query without sanitization. Attackers can execute arbitrary SQL commands, leading to data breach or deletion.
Use parameterized queries instead:
```suggestion
db.query('SELECT * FROM users WHERE id = ?', [userId])
```
```

### Template for inline comments using GitHub API
#### Multiple Issues (using `/reviews` endpoint)
When using `gh api repos/{owner}/{repo}/pulls/{pr_number}/reviews`, each comment in the `comments` array uses the line-specific template above (Issue Category, Evidence, Impact/Severity, Confidence, Suggested Fix).
#### Single Issue (using `/comments` endpoint)
When using `gh api repos/{owner}/{repo}/pulls/{pr_number}/comments`, post just one line-specific comment using the template above.
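A hedged sketch of the multi-comment fallback call is below. OWNER, REPO, the PR number, the file path, and the line number are all placeholders; the payload shape follows the GitHub "create a review" schema (`event` plus a `comments` array with `path`, `line`, `side`, and `body`). The guard keeps the snippet harmless where `gh` is absent or the placeholders point at no real PR:

```shell
# Sketch only: OWNER/REPO/123 and src/db.ts are placeholders, not a real PR.
cat > review.json <<'EOF'
{
  "event": "COMMENT",
  "comments": [
    {
      "path": "src/db.ts",
      "line": 42,
      "side": "RIGHT",
      "body": "🟠 High: Potential null pointer dereference\n\n`user` is accessed without a null check after fetching from the database."
    }
  ]
}
EOF

if command -v gh >/dev/null 2>&1; then
  gh api "repos/OWNER/REPO/pulls/123/reviews" --method POST --input review.json \
    || echo "gh api call failed (placeholders do not point at a real PR)"
else
  echo "gh not available; payload written to review.json"
fi
```

Each entry in the `comments` array would use the line-specific template above as its `body`.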
**Note for linking to code:**
- Use full git sha + line range, eg. `https://github.com/owner/repo/blob/1d54823877c4de72b2316a64032a54afc404e619/README.md#L13-L17`
- Line range format is `L[start]-L[end]`
- Provide at least 1 line of context before and after
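Assembling such a permalink is mechanical; the sketch below uses the sha from the example above, with OWNER/REPO as placeholders (in practice the sha would come from `git rev-parse HEAD`):

```shell
# Build a code permalink from a commit sha, file path, and line range.
# OWNER/REPO are placeholders; the sha is the one from the example above.
SHA="1d54823877c4de72b2316a64032a54afc404e619"
FILE="README.md"
START=13
END=17
URL="https://github.com/OWNER/REPO/blob/${SHA}/${FILE}#L${START}-L${END}"
echo "$URL"
```

Using the full sha (rather than a branch name) keeps the link stable even after the branch moves.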
**Evaluation Instructions:**
- **Security First**: Any High or Critical security issue automatically becomes a blocker
- **Quantify Everything**: Use numbers, not words like "some", "many", "few"
- **Skip Trivial Issues** in large PRs (>500 lines): Focus on architectural and security issues
#### If you found no issues
Do not post any comments. Simply report to the user that no issues were found.
## Remember
The goal is to catch bugs and security issues and improve code quality while maintaining development velocity, not to enforce perfection. Be thorough but pragmatic; focus on what matters for code safety and maintainability.