This skill should be used when the user says "review PR", "review pull request", "check PR comments", "review PR feedback", "review PR 123", "analyze PR comments", "validate PR review", "address PR feedback", "fix PR issues", "what did the reviewer say", "review Bitbucket PR", or wants to validate GitHub or Bitbucket PR review comments, categorize findings, and optionally connect back into the Arness pipeline for fixes. Do NOT use this for creating PRs (use arn-code-ship) or reviewing implementation quality (use arn-code-review-implementation).
68
83%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Advisory
Suggest reviewing before use
Quality
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description excels at trigger term coverage and distinctiveness, with an extensive list of natural user phrases and explicit negative boundaries referencing alternative skills. Its main weakness is that the actual capabilities (what the skill does) are somewhat secondary to the long list of trigger phrases, making the functional description less prominent. The description is functional and effective but could be more balanced between triggers and capability enumeration.
Suggestions
Restructure to lead with concrete capabilities (e.g., 'Fetches and validates GitHub/Bitbucket PR review comments, categorizes findings by severity, and optionally routes issues into the Arness pipeline for automated fixes.') before listing trigger phrases.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description mentions some concrete actions like 'validate PR review comments', 'categorize findings', and 'connect back into the Arness pipeline for fixes', but the bulk of the description is trigger phrases rather than a clear enumeration of specific capabilities. The 'what it does' is somewhat buried. | 2 / 3 |
| Completeness | The description explicitly answers both 'what' (validate PR review comments, categorize findings, connect to Arness pipeline for fixes) and 'when' (extensive list of trigger phrases plus explicit 'Use when' and 'Do NOT use' guidance with alternative skill references). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would say: 'review PR', 'check PR comments', 'review PR feedback', 'fix PR issues', 'what did the reviewer say', 'review Bitbucket PR', plus numbered PR references like 'review PR 123'. These are highly natural phrases. | 3 / 3 |
| Distinctiveness Conflict Risk | Very distinctive with explicit boundary-setting: 'Do NOT use this for creating PRs (use arn-code-ship) or reviewing implementation quality (use arn-code-review-implementation).' This clearly delineates the skill's niche and reduces conflict risk with related skills. | 3 / 3 |
| Total | 11 / 12 Passed | |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, highly actionable skill with clear multi-step workflows, explicit validation checkpoints, and good error handling. Its main weakness is length — the dual-platform support creates some redundancy that could be better managed through reference files. The referenced bundle files are not provided, making it impossible to fully validate the progressive disclosure structure.
Suggestions
Consider extracting platform-specific details (GitHub vs Bitbucket command differences) into separate reference files to reduce the main SKILL.md length and redundancy.
The Error Handling section at the end largely duplicates guidance already given inline in each step — consider removing it or consolidating to avoid redundancy.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly long but most content is necessary given the dual-platform (GitHub/Bitbucket) support and multi-step workflow. However, there's some redundancy in error handling (repeated inline and again in the Error Handling section), and some steps could be tightened. The explanations don't over-explain concepts Claude knows, but the overall length could be reduced. | 2 / 3 |
| Actionability | The skill provides specific, executable CLI commands for both GitHub (`gh`) and Bitbucket (`bkt`), concrete bash examples, clear categorization tables, specific commit message formats, and detailed step-by-step instructions. The guidance is copy-paste ready and leaves little ambiguity about what to do. | 3 / 3 |
| Workflow Clarity | The workflow is clearly sequenced (Steps 1-6) with explicit validation checkpoints: verifying platform availability, confirming PR identity, checking for stale PRs, validating fixes with tests (up to 3 retry attempts with revert on failure), and asking user confirmation before committing. The feedback loops for fix-verify-retry and the explicit error recovery paths are well-defined. | 3 / 3 |
| Progressive Disclosure | The skill references external files (`pr-report-format.md`, `deferred-issue-template.md`, `testing-patterns.md`) which is good progressive disclosure, but no bundle files were provided to verify these exist. The main SKILL.md itself is quite long (~250 lines) and some sections like the detailed Error Handling list and the full Bitbucket parallel paths could potentially be split into reference files. The references are one-level deep and clearly signaled, which is positive. | 2 / 3 |
| Total | 10 / 12 Passed | |
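The fix-verify-retry loop praised under Workflow Clarity (up to 3 attempts, with a revert on failure) can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function name `fix_with_retries` and its three callbacks are hypothetical, and the sketch assumes the revert happens once, after all attempts have failed.

```python
from typing import Callable


def fix_with_retries(
    apply_fix: Callable[[int], None],   # hypothetical: applies a fix for the given attempt number
    run_tests: Callable[[], bool],      # hypothetical: returns True when the test suite passes
    revert: Callable[[], None],         # hypothetical: restores the pre-fix state
    max_attempts: int = 3,              # the review mentions up to 3 attempts
) -> bool:
    """Apply a fix, verify it with tests, and retry; revert if every attempt fails."""
    for attempt in range(1, max_attempts + 1):
        apply_fix(attempt)
        if run_tests():
            return True  # verified fix: the caller can now ask the user before committing
    # Assumption: revert once, only after the final failed attempt.
    revert()
    return False


# Example run: tests fail on the first attempt and pass on the second.
log = []
outcomes = iter([False, True])
succeeded = fix_with_retries(
    apply_fix=lambda n: log.append(f"fix#{n}"),
    run_tests=lambda: next(outcomes),
    revert=lambda: log.append("revert"),
)
print(succeeded, log)
```

Passing the three steps in as callbacks keeps the loop independent of the platform (`gh` or `bkt`) and of the test runner, which is one way the dual-platform redundancy noted above could be reduced.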
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
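To illustrate what the `frontmatter_unknown_keys` warning is asking for, here is a small sketch of detecting unknown keys and moving them under `metadata`. The allowed-key set below is an assumption made for the example, not the validator's actual list, and the `version` key in the sample frontmatter is hypothetical; the report does not say which key triggered the warning.

```python
# Assumed set of recognized top-level frontmatter keys (illustrative only).
ALLOWED_KEYS = {"name", "description", "license", "allowed-tools", "metadata"}


def unknown_frontmatter_keys(frontmatter: dict) -> list[str]:
    """Return the top-level keys that are not in the assumed allowed set."""
    return sorted(key for key in frontmatter if key not in ALLOWED_KEYS)


def move_unknown_to_metadata(frontmatter: dict) -> dict:
    """Return a copy with unknown top-level keys relocated under 'metadata'."""
    known = {k: v for k, v in frontmatter.items() if k in ALLOWED_KEYS}
    extras = {k: v for k, v in frontmatter.items() if k not in ALLOWED_KEYS}
    if extras:
        # Merge with any existing metadata rather than overwriting it.
        known["metadata"] = {**known.get("metadata", {}), **extras}
    return known


# Hypothetical frontmatter with one unrecognized key.
sample = {"name": "example-skill", "description": "...", "version": "1.0"}
print(unknown_frontmatter_keys(sample))
print(move_unknown_to_metadata(sample))
```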
b9084b6