GitHub operations via `gh` CLI: issues, PRs, CI runs, code review, API queries. Use when: (1) checking PR status or CI, (2) creating/commenting on issues, (3) listing/filtering PRs or issues, (4) viewing run logs. NOT for: complex web UI interactions requiring manual browser flows (use browser tooling when available), bulk operations across many repos (script with gh api), or when gh auth is not configured.
86
82%
Does it follow best practices?
Impact
98%
1.30x
Average score across 3 eval scenarios
Advisory
Suggest reviewing before use
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that covers all dimensions well. It provides specific concrete actions, natural trigger terms, explicit 'Use when' and 'NOT for' clauses, and clear boundaries that distinguish it from related skills. The inclusion of negative boundaries (NOT for) is a particularly strong feature that helps Claude avoid misselecting this skill.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: issues, PRs, CI runs, code review, API queries, checking PR status, creating/commenting on issues, listing/filtering PRs, viewing run logs. Very comprehensive. | 3 / 3 |
| Completeness | Clearly answers both 'what' (GitHub operations via gh CLI: issues, PRs, CI runs, code review, API queries) and 'when' with explicit numbered trigger scenarios. Also includes a 'NOT for' section that further clarifies boundaries. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'PR status', 'CI', 'issues', 'code review', 'run logs', 'gh CLI', 'PRs', 'commenting on issues'. These are all terms users naturally use when working with GitHub. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive by specifying 'gh CLI' as the tool and GitHub-specific operations. The 'NOT for' section explicitly delineates boundaries against browser tooling and bulk scripting, reducing conflict risk with other skills. | 3 / 3 |
| Total | 12 / 12 Passed | |
Implementation
64%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a solid, practical skill with excellent actionability—every section provides real, executable commands. The main weaknesses are the lack of validation/verification steps for destructive operations (merge, close) and some verbosity in the when-to-use/when-not-to-use sections. The content could be tightened and would benefit from explicit 'check before acting' patterns for operations like PR merges.
Suggestions
Add verification steps before destructive operations, e.g., 'Run `gh pr view` to confirm state before `gh pr merge`' to create feedback loops
Trim the 'When to Use' and 'When NOT to Use' sections to a compact bullet list or remove them entirely since the description/frontmatter already covers this
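The first suggestion, checking state before a destructive operation, could be sketched roughly as follows. This is a minimal illustration and not part of the reviewed skill: the function names `pr_ready_to_merge` and `safe_merge` are hypothetical, and requiring an `APPROVED` review plus a squash merge is an assumed policy that a given repository may not share.

```shell
#!/usr/bin/env bash
# Sketch only: verify a PR's state before merging it.
# Function names are hypothetical; the approval requirement is an assumed policy.

# Pure decision helper: succeeds only when the PR is open, has no merge
# conflicts, and carries an approving review.
pr_ready_to_merge() {
  local state="$1" mergeable="$2" review="$3"
  [ "$state" = "OPEN" ] && [ "$mergeable" = "MERGEABLE" ] && [ "$review" = "APPROVED" ]
}

# Read the PR's current state with `gh pr view`, check it, then merge.
safe_merge() {
  local pr="$1" fields state mergeable review
  fields=$(gh pr view "$pr" --json state,mergeable,reviewDecision \
    --jq '[.state, .mergeable, .reviewDecision] | @tsv') || return 1
  IFS=$'\t' read -r state mergeable review <<<"$fields"
  if ! pr_ready_to_merge "$state" "$mergeable" "$review"; then
    echo "PR $pr not ready: state=$state mergeable=$mergeable review=$review" >&2
    return 1
  fi
  # `gh pr checks` exits non-zero when checks have failed or are still pending.
  gh pr checks "$pr" || return 1
  gh pr merge "$pr" --squash
}
```

Sourcing the file and calling `safe_merge <number>` would then refuse to merge unless every check passes. Keeping the decision in a separate helper makes the policy easy to adjust without touching the `gh` calls.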
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is mostly efficient but includes some unnecessary sections like 'When to Use' and 'When NOT to Use' that are somewhat verbose for what could be conveyed more briefly. The setup section explaining one-time auth is borderline unnecessary. However, the command examples themselves are lean and well-structured. | 2 / 3 |
| Actionability | Every section provides concrete, copy-paste ready commands with real flags and options. The examples cover all major use cases (PRs, issues, CI, API queries) with executable bash commands, JSON output filtering with jq, and practical templates. | 3 / 3 |
| Workflow Clarity | Commands are well-organized by category but there are no validation checkpoints or feedback loops. For potentially destructive operations like merging PRs or closing issues, there's no guidance to verify state first. The templates section shows multi-step patterns but lacks explicit validation steps. | 2 / 3 |
| Progressive Disclosure | The content is well-structured with clear headers and logical grouping, but everything is inline in a single file. The templates and API query sections could be split into separate reference files. For a skill of this length (~100+ lines), some content could benefit from being referenced rather than included directly. | 2 / 3 |
| Total | 9 / 12 Passed | |
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| metadata_field | 'metadata' should map string keys to string values | Warning |
| Total | 9 / 11 Passed | |
09cce3e