Diagnose and fix CI failures on a GitHub PR by analyzing failing checks, reading logs, and applying fixes
61
72%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Advisory
Suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./.claude/skills/fix-ci-tests/SKILL.md
Quality
Discovery
67%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description is specific about its capabilities and occupies a clear niche around CI failure diagnosis on GitHub PRs. Its main weaknesses are the lack of an explicit 'Use when...' clause and the absence of common trigger-term variations that users might naturally use (e.g., 'build broken', 'pipeline failed', 'GitHub Actions').
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when a GitHub PR has failing checks, CI/CD pipeline errors, or when the user mentions build failures or test failures.'
Include additional natural trigger terms like 'pipeline failed', 'build broken', 'tests failing', 'GitHub Actions', 'CI/CD', and 'pull request' to improve keyword coverage.
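The two suggestions above can be checked mechanically. The following is a minimal sketch, assuming a plain case-insensitive substring match is good enough; it is not how the review tool scores descriptions, and `check_description` is a hypothetical helper name introduced here for illustration.

```python
import re

# Trigger terms copied from the suggestion above.
SUGGESTED_TERMS = [
    "pipeline failed",
    "build broken",
    "tests failing",
    "GitHub Actions",
    "CI/CD",
    "pull request",
]


def check_description(description: str) -> dict:
    """Report whether a description has a 'Use when' clause and which suggested terms it lacks.

    Hypothetical helper: a simple substring check, not the reviewer's rubric.
    """
    lowered = description.lower()
    has_use_when = re.search(r"\buse when\b", lowered) is not None
    missing = [term for term in SUGGESTED_TERMS if term.lower() not in lowered]
    return {"has_use_when": has_use_when, "missing_terms": missing}


# The skill's current description, as quoted at the top of this review.
current = (
    "Diagnose and fix CI failures on a GitHub PR by analyzing failing checks, "
    "reading logs, and applying fixes"
)
report = check_description(current)
print(report)
```

Running the same check on a revised description shows whether the 'Use when...' clause and the extra terms were actually added.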
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'diagnose and fix CI failures', 'analyzing failing checks', 'reading logs', and 'applying fixes'. These are clear, actionable capabilities. | 3 / 3 |
| Completeness | Clearly answers 'what does this do' (diagnose and fix CI failures by analyzing checks, reading logs, applying fixes) but lacks an explicit 'Use when...' clause specifying when Claude should select this skill. Per the rubric, a missing 'Use when...' clause caps completeness at 2. | 2 / 3 |
| Trigger Term Quality | Includes good terms like 'CI failures', 'GitHub PR', 'failing checks', and 'logs', but misses common user variations like 'pipeline failed', 'build broken', 'tests failing', 'CI/CD', 'GitHub Actions', or 'pull request'. | 2 / 3 |
| Distinctiveness / Conflict Risk | The combination of 'CI failures', 'GitHub PR', 'failing checks', and 'logs' creates a clear niche that is unlikely to conflict with other skills. This is a well-defined, specific domain. | 3 / 3 |
| Total | | 10 / 12 Passed |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, highly actionable skill with excellent workflow clarity and concrete executable commands for every failure category. Its main weakness is length—the document packs a lot of repo-specific detail (GraphQL mutations, CI job tables, fuzz corpus paths) into a single file that would benefit from progressive disclosure via supporting files. The security callout is valuable but slightly verbose.
Suggestions
Extract the GraphQL comment-resolution queries and the CI job mapping table into separate reference files (e.g., RESOLVE_COMMENTS.md, CI_JOBS.md) to reduce the main skill's token footprint.
Tighten the security warning block—a 2-line callout would suffice since Claude understands prompt injection risks; the current version is ~8 lines with redundant phrasing.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly long (~250 lines) and includes some content that could be tightened: for example, the detailed CI job table is repo-specific and useful, but the extensive GraphQL examples for resolving comments and the repeated security warnings add bulk. However, most content is genuinely instructive and not explaining things Claude already knows. | 2 / 3 |
| Actionability | The skill provides fully executable bash and Go commands throughout, with specific flags, environment variables, and concrete examples for every failure category. Commands are copy-paste ready with clear placeholders. | 3 / 3 |
| Workflow Clarity | The 10-step workflow is clearly sequenced with explicit validation checkpoints (step 4: reproduce locally, step 7: verify all fixes with specific commands, feedback loop 'if new failures appear, repeat from step 4'). Destructive operations are guarded and the commit step explicitly warns against `git add -A`. | 3 / 3 |
| Progressive Disclosure | The skill is a monolithic document with all content inline. While it references the 'fix-tests' skill for bash comparison failures, the extensive GraphQL queries, the full CI job mapping table, and the detailed fuzz fix workflow could be split into separate reference files. For a skill of this complexity, the single-file approach makes it harder to navigate. | 2 / 3 |
| Total | | 10 / 12 Passed |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
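To find which keys trigger a warning like `frontmatter_unknown_keys`, a quick local check along these lines may help. It is a rough sketch under stated assumptions: the set of recognised keys below is illustrative (the review does not list the validator's actual set), the sample file contents are hypothetical, and the parser only looks at top-level `key:` lines rather than parsing YAML in full.

```python
import re

# Assumption: the validator's actual list of recognised keys is not shown in
# this review; this set is illustrative only.
ASSUMED_KNOWN_KEYS = {"name", "description", "license", "allowed-tools", "metadata"}


def top_level_frontmatter_keys(skill_md: str) -> list[str]:
    """Return top-level keys from a leading '---' delimited frontmatter block."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    keys = []
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Top-level keys start at column 0; indented lines belong to nested values.
        match = re.match(r"^([A-Za-z0-9_-]+)\s*:", line)
        if match:
            keys.append(match.group(1))
    return keys


def unknown_keys(skill_md: str) -> list[str]:
    """List top-level frontmatter keys that are not in the assumed known set."""
    return [k for k in top_level_frontmatter_keys(skill_md) if k not in ASSUMED_KNOWN_KEYS]


# Hypothetical file contents; the real SKILL.md is not reproduced in this review.
sample = """---
name: fix-ci-tests
description: Diagnose and fix CI failures on a GitHub PR
owner: platform-team
metadata:
  version: "1"
---
# Body
"""
print(unknown_keys(sample))
```

In this hypothetical sample, moving `owner` under `metadata` would follow the validator's advice to remove unknown keys or move them to metadata.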
00bdc03