GitHub operations via `gh` CLI: issues, PRs, CI runs, code review, API queries. Use when: (1) checking PR status or CI, (2) creating/commenting on issues, (3) listing/filtering PRs or issues, (4) viewing run logs. NOT for: complex web UI interactions requiring manual browser flows (use browser tooling when available), bulk operations across many repos (script with gh api), or when gh auth is not configured.
86
82%
Does it follow best practices?
Impact
98%
1.30x
Average score across 3 eval scenarios
Advisory
Suggest reviewing before use
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that covers all dimensions thoroughly. It provides specific concrete actions, natural trigger terms developers would use, explicit 'Use when' and 'NOT for' clauses, and clear boundaries that distinguish it from related skills like browser tooling or scripting. The inclusion of negative boundaries is a particularly strong feature that goes beyond the typical good example.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: issues, PRs, CI runs, code review, API queries, plus detailed use cases like checking PR status, creating/commenting on issues, listing/filtering PRs, and viewing run logs. | 3 / 3 |
| Completeness | Clearly answers both 'what' (GitHub operations via gh CLI: issues, PRs, CI runs, code review, API queries) and 'when' with explicit numbered trigger scenarios. Additionally includes a 'NOT for' section that clarifies boundaries, which is excellent for disambiguation. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'PR', 'issues', 'CI', 'code review', 'gh', 'run logs', 'PR status', 'commenting on issues'. These are all terms developers naturally use when working with GitHub. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive by specifying the `gh` CLI tool, GitHub-specific operations, and explicit exclusions (browser flows, bulk cross-repo operations, unauthenticated scenarios). The 'NOT for' clause further reduces conflict risk with browser or scripting skills. | 3 / 3 |
| Total | 12 / 12 Passed | |
Implementation
64%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid, actionable reference skill with excellent concrete examples covering the major gh CLI operations. Its main weaknesses are the lack of validation/verification steps before destructive operations (merge, close) and some verbosity in the when-to-use/when-not-to-use sections that Claude could largely infer. The command examples are the strongest aspect, being fully executable and well-organized.
Suggestions
Add verification steps before destructive operations (e.g., 'Check PR status with `gh pr checks` before merging; verify CI passes before `gh pr merge`')
Trim the 'When to Use' and 'When NOT to Use' sections significantly—Claude can infer most of these boundaries from the skill description and commands
Consider splitting the Templates and API Queries sections into a separate EXAMPLES.md reference file to keep the main skill leaner
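The first suggestion above, verifying CI before a merge, could be sketched roughly as follows. This is an illustration and not part of the reviewed skill: `merge_if_green` is a hypothetical helper name, and `--squash` is an assumed merge strategy. It relies on `gh pr checks` exiting non-zero when any check has failed or is still pending.

```shell
#!/usr/bin/env bash
# Sketch of a verify-before-acting pattern for merging a pull request.
# merge_if_green is a hypothetical helper, not a gh subcommand or part of the skill.
merge_if_green() {
  local pr="$1"
  if [ -z "$pr" ]; then
    echo "usage: merge_if_green <pr-number>" >&2
    return 2
  fi

  # gh pr checks exits non-zero when any check has failed or is still pending.
  if ! gh pr checks "$pr"; then
    echo "PR #$pr: checks are not all passing; not merging." >&2
    return 1
  fi

  # All checks passed, so proceed. --squash is an assumed merge strategy.
  gh pr merge "$pr" --squash
}
```

After sourcing the function, it would be called with a PR number as its only argument. The same guard shape should carry over to other destructive commands such as `gh issue close`, with an appropriate status check in place of `gh pr checks`.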
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is mostly efficient with concrete commands, but the 'When to Use' and 'When NOT to Use' sections are somewhat verbose for things Claude can infer. The setup section explaining one-time auth is borderline unnecessary. However, the command examples themselves are lean. | 2 / 3 |
| Actionability | Every section provides fully executable, copy-paste ready commands with real flags and options. The examples cover all major use cases (PRs, issues, CI, API queries) with specific syntax including --json and --jq filtering patterns. | 3 / 3 |
| Workflow Clarity | Commands are well-organized by category, but there are no validation checkpoints or error-handling guidance. For potentially destructive operations like merging PRs or closing issues, there's no verify-before-acting pattern. The templates section shows multi-step sequences but lacks validation steps. | 2 / 3 |
| Progressive Disclosure | Content is well-structured with clear headers and logical grouping, but everything is inline in a single file. The templates and API query sections could be referenced as separate files. For a skill of this length (~120 lines of content), some splitting would improve scannability. | 2 / 3 |
| Total | 9 / 12 Passed | |
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| metadata_field | 'metadata' should map string keys to string values | Warning |
| Total | 9 / 11 Passed | |
af8bd5f