fix-ci-tests

Diagnose and fix CI failures on a GitHub PR by analyzing failing checks, reading logs, and applying fixes

Quality

—

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Advisory

Suggest reviewing before use

Quality

Content

92%

Reviews the quality of instructions and guidance provided to agents. A good implementation is clear, handles edge cases, and produces reliable results.

A highly actionable, well-sequenced workflow with executable commands, strong validation checkpoints, and a feedback loop. Its only weakness is progressive disclosure: the long single-file body keeps detailed sub-procedures inline rather than offloading them to one-level-deep reference files.

Suggestions

Move the detailed fuzz-corpus and GraphQL review-comment-resolution procedures into reference files (e.g. references/fuzz-failures.md, references/review-comments.md) linked from the main workflow to reduce inline bulk.

Consolidate the repeated external-data/security guidance into the single opening callout to avoid restating it in step 2.

Dimension	Reasoning	Score
Conciseness	Action-dense throughout with executable commands and no padding explaining concepts Claude already knows; the only slight redundancy is the security/external-data note repeated in step 2, but it is a deliberate safety emphasis that earns its place.	3 / 3
Actionability	Provides fully executable, copy-paste-ready commands — gh pr checks, gh run view --log, go test -race invocations, git commit/push, and complete GraphQL queries — with specific flags and concrete examples.	3 / 3
Workflow Clarity	A 10-step sequenced workflow with explicit validation checkpoints (reproduce locally before fixing, verify all fixes) and a clear feedback loop ('If new failures appear, repeat from step 4') for the destructive commit/push operations.	3 / 3
Progressive Disclosure	Well-organized into clear numbered sections with no deep reference nesting, but it is a single monolithic file with no external references; detailed sub-workflows (fuzz corpus handling, GraphQL comment resolution) that could be split into reference files are inline instead.	2 / 3
	Total	11 / 12 Passed

Description

82%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

A concise, specific description that names concrete actions and natural trigger terms within a clear niche. Its main gap is the absence of an explicit 'Use when...' clause, which leaves the invocation trigger implied rather than stated.

Suggestions

Add an explicit trigger clause, e.g. 'Use when CI checks on a GitHub PR are failing or when the user asks to fix broken CI/red checks.'

Include common user phrasings like 'failing tests', 'broken CI', or 'red checks' alongside 'CI failures' for broader trigger coverage.
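
The two suggestions above can be checked mechanically. The following is a minimal sketch, not the reviewer's actual scoring logic: the function name `describe_trigger_gaps` and the phrase list are hypothetical, with the phrases taken from the suggestion text. It reports whether a description contains an explicit "Use when" clause and which of the suggested user phrasings are absent.

```python
import re

# Hypothetical phrase list, copied from the suggestions above.
SUGGESTED_PHRASES = ("failing tests", "broken CI", "red checks")


def describe_trigger_gaps(description: str) -> dict:
    """Report whether a description has a 'Use when' clause and which
    suggested phrasings it is missing. Illustrative only."""
    lowered = description.lower()
    # Case-insensitive search for an explicit trigger clause.
    has_use_when = re.search(r"\buse when\b", lowered) is not None
    # Keep the original casing of each phrase in the report.
    missing = [p for p in SUGGESTED_PHRASES if p.lower() not in lowered]
    return {"has_use_when": has_use_when, "missing_phrases": missing}


# The description as it appears at the top of this review.
current = (
    "Diagnose and fix CI failures on a GitHub PR by analyzing failing "
    "checks, reading logs, and applying fixes"
)
print(describe_trigger_gaps(current))
```

Run against the current description, the sketch flags the missing trigger clause; appending the example clause from the first suggestion would clear that flag and cover two of the three phrasings.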

Dimension	Reasoning	Score
Specificity	Lists multiple concrete actions — 'analyzing failing checks, reading logs, and applying fixes' — each a distinct, specific operation rather than vague language.	3 / 3
Completeness	Clearly answers 'what' (diagnose and fix CI failures) but lacks an explicit 'Use when...' trigger clause, so 'when' is only implied; per the guidelines this caps completeness at 2.	2 / 3
Trigger Term Quality	Uses natural terms users actually say — 'CI failures', 'GitHub PR', 'failing checks', 'logs' — giving good coverage of how the need is expressed.	3 / 3
Distinctiveness / Conflict Risk	Scoped to 'CI failures on a GitHub PR', a clear niche with distinct triggers unlikely to conflict with other skills.	3 / 3
	Total	11 / 12 Passed

Validation

93%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 15 / 16 Passed

Validation for skill structure

Criteria	Description	Result
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning
	Total	15 / 16 Passed
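
A check like `frontmatter_unknown_keys` could work roughly as sketched below. This is an illustration under stated assumptions, not the validator's implementation: the `KNOWN_KEYS` set is a placeholder (the real list of recognized keys is not shown in this review), and the `owner` key in the sample is made up, since the review does not name the key that triggered the warning.

```python
# Assumed set of recognized top-level keys; the validator's real list is not
# shown in this review, so treat this as a placeholder.
KNOWN_KEYS = {"name", "description", "license", "metadata"}


def unknown_frontmatter_keys(skill_text: str) -> list[str]:
    """Return top-level frontmatter keys that are not in KNOWN_KEYS."""
    lines = skill_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []  # no frontmatter block at the top of the file
    unknown = []
    for line in lines[1:]:
        if line.strip() == "---":
            break  # closing delimiter: end of frontmatter
        # Top-level keys start in column 0; skip nested, blank, list,
        # and comment lines.
        if not line or line[0] in " \t#-" or ":" not in line:
            continue
        key = line.split(":", 1)[0].strip()
        if key not in KNOWN_KEYS:
            unknown.append(key)
    return unknown


# Sample file; `owner` is a hypothetical unrecognized key.
example = """---
name: fix-ci-tests
description: Diagnose and fix CI failures on a GitHub PR
owner: example-team
metadata:
  version: "1"
---
# body
"""
print(unknown_frontmatter_keys(example))
```

Following the warning's own advice, moving such a key under `metadata` would make it nested and therefore no longer reported by this sketch.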

Repository: DataDog/rshell
Commit: a0a1140

Reviewed: about 16 hours ago
