Resolves unresolved GitHub PR review threads end-to-end: evaluates whether each review comment is correct, applies a targeted fix when valid, replies with rationale when not, commits, and resolves the thread. USE FOR: unresolved review threads, PR review feedback, changes requested PRs, PR review URLs (#pullrequestreview-...), fix the review comments, close the open threads, address PR feedback. DO NOT USE FOR: summarizing feedback without code changes, creating new PRs, or read-only branches.
85
81%
Does it follow best practices?
Impact
93%
1.06xAverage score across 3 eval scenarios
Advisory
Suggest reviewing before use
Quality
Discovery
100%Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly articulates a specific end-to-end workflow for resolving GitHub PR review threads. It includes comprehensive trigger terms via the 'USE FOR' clause, explicitly defines scope boundaries with 'DO NOT USE FOR', and lists concrete actions in the capability description. The description is well-structured, concise, and highly distinguishable from related skills.
| Dimension | Reasoning | Score |
|---|---|---|
Specificity | Lists multiple specific concrete actions: evaluates review comments for correctness, applies targeted fixes, replies with rationale, commits changes, and resolves threads. This is a comprehensive enumeration of the end-to-end workflow. | 3 / 3 |
Completeness | Clearly answers both 'what' (evaluates comments, applies fixes, replies with rationale, commits, resolves threads) and 'when' (explicit 'USE FOR' clause with trigger scenarios, plus a 'DO NOT USE FOR' exclusion list that further clarifies scope). | 3 / 3 |
Trigger Term Quality | Excellent coverage of natural trigger terms: 'unresolved review threads', 'PR review feedback', 'changes requested PRs', 'PR review URLs', 'fix the review comments', 'close the open threads', 'address PR feedback'. These closely match what users would naturally say. | 3 / 3 |
Distinctiveness Conflict Risk | Highly distinctive with a clear niche around GitHub PR review thread resolution. The 'DO NOT USE FOR' clause explicitly excludes adjacent tasks like summarizing feedback or creating new PRs, reducing conflict risk with other skills. | 3 / 3 |
Total | 12 / 12 Passed |
Implementation
62%Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a thorough, highly actionable skill with excellent workflow clarity and concrete executable guidance. Its main weakness is significant verbosity — the document tries to be both a quick-reference overview and a comprehensive manual simultaneously, resulting in a token-heavy file that explains many things Claude can infer. The progressive disclosure is decent with external asset references but the main file itself would benefit from splitting detailed reference material into separate files.
Suggestions
Cut the 'When to use' trigger list to 3-4 representative examples max — Claude can generalize from patterns without 10+ enumerated phrases.
Move the GraphQL cost awareness, token budget guidance, and detailed tooling map table into a separate REFERENCE.md or TOOLING.md file, keeping only a brief summary and link in the main SKILL.md.
Remove the 'When not to use' section or reduce to a single line — these are obvious negations Claude can infer from the positive use cases.
Consolidate the safety & constraints section to only non-obvious rules (e.g., 'never resolve without a reply', 'dry-run first') and drop items like 'never expose secrets' which Claude already knows.
| Dimension | Reasoning | Score |
|---|---|---|
Conciseness | The skill is extremely verbose at ~250+ lines. It over-explains trigger patterns (listing 10+ example phrases Claude doesn't need), includes extensive tooling maps, GraphQL cost guidance, and token budget advice that could be drastically condensed. Much of the content (e.g., 'PDF is a common file format'-equivalent explanations of when to use/not use) assumes Claude needs hand-holding on obvious contextual decisions. | 1 / 3 |
Actionability | The skill provides highly concrete, executable guidance: specific shell commands with real arguments, a clear tooling map with exact MCP tool names and GraphQL file paths, example script invocations with real flags (--owner, --repo, --pr, --review-id, --apply), and precise mutation ordering. The instructions are copy-paste ready and leave little ambiguity about what to execute. | 3 / 3 |
Workflow Clarity | The 9-phase workflow is clearly sequenced (fetch → analyze → fix → validate → push → reply → resolve → verify → request review) with explicit validation checkpoints: re-fetch thread state before retrying, confirm unresolved count drops to zero, re-check CI status, dry-run before apply. Feedback loops are well-defined (fix → validate → retry if errors; re-fetch on uncertain results; escalate to user if ambiguous). | 3 / 3 |
Progressive Disclosure | The skill references external assets (GraphQL files, scripts, checklist template) with clear paths, which is good progressive disclosure. However, the main SKILL.md itself is monolithic — the tooling map, GraphQL cost awareness, token budget guidance, and safety constraints sections could be split into separate reference files. The inline content is too dense for a single overview document. | 2 / 3 |
Total | 9 / 12 Passed |
Validation
90%Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
metadata_field | 'metadata' should map string keys to string values | Warning |
Total | 10 / 11 Passed | |
060e3af
Table of Contents
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.