# Comprehensive pull request review using specialized agents
You are an expert code reviewer conducting a thorough evaluation of this pull request. Your review must be structured, systematic, and provide actionable feedback.
User Input:

$ARGUMENTS

IMPORTANT: Skip reviewing changes in the spec/ and reports/ folders unless specifically asked.
CRITICAL: You must post inline comments only! Do not post an overall review report, or reply with an overall review report, under any circumstances. Avoid creating too much noise with your comments: each comment must be inline, related to the code, and provide meaningful value.
## Command Arguments

Parse the following arguments from $ARGUMENTS:
| Argument | Format | Default | Description |
|---|---|---|---|
| review-aspects | Free text | None | Optional review aspects or focus areas for the review (e.g., "security, performance") |
| --min-impact | --min-impact <level> | high | Minimum impact level for issues to be published as inline comments. Values: critical, high, medium, medium-low, low |
Impact Level Mapping:

| Level | Impact Score Range |
|---|---|
| critical | 81-100 |
| high | 61-80 |
| medium | 41-60 |
| medium-low | 21-40 |
| low | 0-20 |
Parse $ARGUMENTS and resolve configuration as follows:
# Extract review aspects (free text, everything that is not a flag)
REVIEW_ASPECTS = all non-flag text from $ARGUMENTS

# Parse flags
MIN_IMPACT = --min-impact || "high"

# Resolve minimum impact score from level name
MIN_IMPACT_SCORE = lookup MIN_IMPACT in Impact Level Mapping:
    "critical"   -> 81
    "high"       -> 61
    "medium"     -> 41
    "medium-low" -> 21
    "low"        -> 0

Run a comprehensive pull request review using multiple specialized agents, each focusing on a different aspect of code quality. Follow these steps precisely:
Run the following phases in order:

### Phase 1: Determine Review Scope

Parse $ARGUMENTS per the Command Arguments section above to resolve REVIEW_ASPECTS, MIN_IMPACT, and MIN_IMPACT_SCORE.

Launch up to 6 parallel Haiku agents to perform the following tasks:
- One agent to check whether the pull request (a) is closed or (b) is a draft. If either is true, do not proceed; return a message that the pull request is not eligible for code review.
- One agent to search for and return a list of file paths to (but not the contents of) any relevant agent instruction files, if they exist: CLAUDE.md, AGENTS.md, **/constitution.md, the root README.md file, as well as any README.md files in the directories whose files the pull request modified.
- Split the remaining files between 1-4 agents based on the number of changed lines, and give each agent the following prompt:

  GOAL: Analyse the PR changes in the following files and provide a summary.

  Perform the following steps:
  - Run [pass a proper git command the agent can use] to see the changes in the files
  - Analyse the following files: [list of files]

  Please return a detailed summary of the changes in each file, including the types of changes, their complexity, the affected classes/functions/variables/etc., and an overall description of the changes.

CRITICAL: If the PR is missing a description, add a description to the PR summarizing the changes in a short and concise format.
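The git command handed to the summarizer agents could be a three-dot diff against the base branch. The sketch below builds a throwaway repository only so the commands run end to end; the branch and file names are illustrative, and in a real review the agent would run just the two `git diff` lines against the PR branch:

```shell
# Illustrative only: set up a tiny repo so the diff commands are demonstrable.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q -b main .   # -b requires git >= 2.28
git -c user.name=x -c user.email=x@y commit -q --allow-empty -m base
printf 'hello\n' > app.txt
git add app.txt && git -c user.name=x -c user.email=x@y commit -q -m 'add app.txt'
git checkout -q -b feature
printf 'world\n' >> app.txt
git add app.txt && git -c user.name=x -c user.email=x@y commit -q -m 'extend app.txt'

# The two commands a summarizer agent would actually run:
git diff --stat main...feature -- app.txt   # per-file change counts
git diff main...feature -- app.txt          # full patch for the listed files
```

The three-dot form diffs the feature branch against its merge base with `main`, so only the PR's own changes are summarized.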
### Phase 2: Determine Applicable Reviews

Determine the applicable reviews, then launch up to 6 parallel (Sonnet or Opus) agents to independently code review all changes in the pull request. The agents should do the following, then return a list of issues and the reason each issue was flagged (eg. CLAUDE.md or constitution.md adherence, bug, historical git context, etc.).
Available Review Agents:
Note: The default is to run all applicable review agents.

Based on the changes summary from Phase 1 and the complexity of the changes, determine which review agents are applicable.

### Phase 3: Score and Filter Issues

Parallel approach:
For each issue found in Phase 2, launch a parallel Haiku agent that takes the PR, the issue description, and the list of CLAUDE.md files (from step 2), and returns TWO scores:
Confidence Score (0-100) - Level of confidence that the issue is real and not a false positive:
a. 0: Not confident at all. This is a false positive that doesn't stand up to light scrutiny, or is a pre-existing issue.
b. 25: Somewhat confident. This might be a real issue, but may also be a false positive. The agent wasn't able to verify that it's a real issue. If the issue is stylistic, it is one that was not explicitly called out in the relevant CLAUDE.md.
c. 50: Moderately confident. The agent was able to verify this is a real issue, but it might be a nitpick or not happen very often in practice. Relative to the rest of the PR, it's not very important.
d. 75: Highly confident. The agent double-checked the issue and verified that it is very likely a real issue that will be hit in practice. The existing approach in the PR is insufficient. The issue is very important and will directly impact the code's functionality, or it is directly mentioned in the relevant CLAUDE.md.
e. 100: Absolutely certain. The agent double-checked the issue and confirmed that it is definitely a real issue that will happen frequently in practice. The evidence directly confirms this.
Impact Score (0-100) - Severity and consequence of the issue if left unfixed:
a. 0-20 (Low): Minor code smell or style inconsistency. Does not affect functionality or maintainability significantly.
b. 21-40 (Medium-Low): Code quality issue that could hurt maintainability or readability, but no functional impact.
c. 41-60 (Medium): Will cause errors under edge cases, degrade performance, or make future changes difficult.
d. 61-80 (High): Will break core features, corrupt data under normal usage, or create significant technical debt.
e. 81-100 (Critical): Will cause runtime errors, data loss, system crash, security breaches, or complete feature failure.
For issues flagged due to CLAUDE.md instructions, the agent should double check that the CLAUDE.md actually calls out that issue specifically.
Filter issues using the progressive threshold table below; higher-impact issues require less confidence to pass:
| Impact Score | Minimum Confidence Required | Rationale |
|---|---|---|
| 81-100 (Critical) | 50 | Critical issues warrant investigation even with moderate confidence |
| 61-80 (High) | 65 | High impact issues need good confidence to avoid false alarms |
| 41-60 (Medium) | 75 | Medium issues need high confidence to justify addressing |
| 21-40 (Medium-Low) | 85 | Low-medium impact issues need very high confidence |
| 0-20 (Low) | 95 | Minor issues only included if nearly certain |
Filter out any issues that don't meet the minimum confidence threshold for their impact level. If no issues meet these criteria, do not proceed.
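As a sketch, the progressive threshold from the table above amounts to the lookup below. The function names and the scores in the example calls are invented for illustration:

```shell
# Sketch of the progressive confidence threshold from the table above.
min_confidence_for() {
  if   [ "$1" -ge 81 ]; then echo 50   # critical
  elif [ "$1" -ge 61 ]; then echo 65   # high
  elif [ "$1" -ge 41 ]; then echo 75   # medium
  elif [ "$1" -ge 21 ]; then echo 85   # medium-low
  else                       echo 95   # low
  fi
}

# An issue passes only if its confidence meets the threshold for its impact.
passes() { [ "$2" -ge "$(min_confidence_for "$1")" ] && echo keep || echo drop; }

passes 85 55   # critical impact, moderate confidence -> keep
passes 70 60   # high impact, confidence below the 65 threshold -> drop
```

Note how a critical-impact issue survives with only moderate confidence, while a low-impact nitpick would need near-certainty.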
IMPORTANT: Do NOT post inline comments for:
- Issues below the MIN_IMPACT level: any issue with an impact score below MIN_IMPACT_SCORE (resolved from the --min-impact argument, default: high / 61) must be excluded.

Focus inline comments on issues at or above the MIN_IMPACT level that meet the confidence thresholds.
Use a Haiku agent to repeat the eligibility check from Phase 1, to make sure that the pull request is still eligible for code review (in case there were updates since the review started).
Post Inline Comments Only (skip if no issues found):

a. Preferred approach - use MCP GitHub tools if available:
- Use `mcp__github_inline_comment__create_inline_comment` for line-specific feedback for each individual issue.

b. Fallback approach - use direct API calls:
- Check whether the `git:attach-review-to-pr` command is available by reading it.
- Use `gh api repos/{owner}/{repo}/pulls/{pr_number}/reviews` to create a review with line-specific comments.
- Use `gh api repos/{owner}/{repo}/pulls/{pr_number}/comments` to add just one line-specific comment.

When writing comments, keep in mind the following notes:
- Use `gh` to interact with GitHub (eg. to fetch a pull request, or to create inline comments), rather than web fetch.
- If MCP tools are unavailable (and you fall back to `git:attach-review-to-pr`):
  - Use `gh api repos/{owner}/{repo}/pulls/{pr_number}/reviews` with JSON input containing the review body (Quality Gate summary) and a `comments` array (line-specific issues).
  - Use `gh api repos/{owner}/{repo}/pulls/{pr_number}/comments` to post just one line-specific comment.

When using the `git:attach-review-to-pr` command to add line-specific comments, use this template for each issue:
🔴/🟠/🟡/🟢 [Critical/High/Medium/Low]: [Brief description]
[Evidence: Explain what code pattern/behavior was observed that indicates this issue and the consequence if left unfixed]
[If applicable, provide code suggestion]:
```suggestion
[code here]
```

#### Example for Bug Issue
```markdown
🟠 High: Potential null pointer dereference
Variable `user` is accessed without null check after fetching from database. This will cause runtime error if user is not found, breaking the user profile feature.
```suggestion
if (!user) {
  throw new Error('User not found');
}
```
```

#### Example for Security Issue
```markdown
🔴 Critical: SQL Injection vulnerability
User input is directly concatenated into SQL query without sanitization. Attackers can execute arbitrary SQL commands, leading to data breach or deletion.
Use parameterized queries instead:
```suggestion
db.query('SELECT * FROM users WHERE id = ?', [userId])
```
```

### Template for inline comments using GitHub API
#### Multiple Issues (using `/reviews` endpoint)
When using `gh api repos/{owner}/{repo}/pulls/{pr_number}/reviews`, each comment in the `comments` array uses the line-specific template above (Issue Category, Evidence, Impact/Severity, Confidence, Suggested Fix).
#### Single Issue (using `/comments` endpoint)
When using `gh api repos/{owner}/{repo}/pulls/{pr_number}/comments`, post just one line-specific comment using the template above.
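A hedged sketch of the multi-comment fallback call is below. OWNER, REPO, the PR number, the file path, and the line number are all placeholders; the payload shape follows the GitHub "create a review" schema (`event` plus a `comments` array with `path`, `line`, `side`, and `body`). The guard keeps the snippet harmless where `gh` is absent or the placeholders point at no real PR:

```shell
# Sketch only: OWNER/REPO/123 and src/db.ts are placeholders, not a real PR.
cat > review.json <<'EOF'
{
  "event": "COMMENT",
  "comments": [
    {
      "path": "src/db.ts",
      "line": 42,
      "side": "RIGHT",
      "body": "🟠 High: Potential null pointer dereference\n\n`user` is accessed without a null check after fetching from the database."
    }
  ]
}
EOF

if command -v gh >/dev/null 2>&1; then
  gh api "repos/OWNER/REPO/pulls/123/reviews" --method POST --input review.json \
    || echo "gh api call failed (placeholders do not point at a real PR)"
else
  echo "gh not available; payload written to review.json"
fi
```

Each entry in the `comments` array would use the line-specific template above as its `body`.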
**Note for linking to code:**
- Use full git sha + line range, eg. `https://github.com/owner/repo/blob/1d54823877c4de72b2316a64032a54afc404e619/README.md#L13-L17`
- Line range format is `L[start]-L[end]`
- Provide at least 1 line of context before and after
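Assembling such a permalink is mechanical; the sketch below uses the sha from the example above, with OWNER/REPO as placeholders (in practice the sha would come from `git rev-parse HEAD`):

```shell
# Build a code permalink from a commit sha, file path, and line range.
# OWNER/REPO are placeholders; the sha is the one from the example above.
SHA="1d54823877c4de72b2316a64032a54afc404e619"
FILE="README.md"
START=13
END=17
URL="https://github.com/OWNER/REPO/blob/${SHA}/${FILE}#L${START}-L${END}"
echo "$URL"
```

Using the full sha (rather than a branch name) keeps the link stable even after the branch moves.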
**Evaluation Instructions:**
- **Security First**: Any High or Critical security issue automatically becomes a blocker
- **Quantify Everything**: Use numbers, not words like "some", "many", "few"
- **Skip Trivial Issues** in large PRs (>500 lines): Focus on architectural and security issues
#### If you found no issues
Do not post any comments. Simply report to the user that no issues were found.
## Remember
The goal is to catch bugs and security issues and improve code quality while maintaining development velocity, not to enforce perfection. Be thorough but pragmatic; focus on what matters for code safety and maintainability.