Audits GitHub Actions workflows for security vulnerabilities in AI agent integrations, including Claude Code Action, Gemini CLI, OpenAI Codex, and GitHub AI Inference. Detects attack vectors where attacker-controlled input reaches AI agents running in CI/CD pipelines.
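To make the attack surface concrete, here is a hedged, hypothetical example of the vector class this skill audits for; the workflow, action version, and input names are illustrative and not taken from the skill itself:

```yaml
# Hypothetical vulnerable workflow: an issue title (attacker-controlled)
# is interpolated directly into the prompt given to an AI agent step.
name: ai-triage
on:
  issues:
    types: [opened]
jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - uses: anthropics/claude-code-action@v1   # version pin illustrative
        with:
          # Untrusted expression interpolation: an attacker who opens an
          # issue controls part of the prompt the agent executes on.
          prompt: "Summarize this issue: ${{ github.event.issue.title }}"
```

The general pattern the skill looks for is untrusted event context (issue titles, PR bodies, comments) flowing into an AI agent's prompt or tool inputs.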
Overall Score: 86
Quality: 83% (Does it follow best practices?)
Impact: Pending (no eval scenarios have been run)
Advisory: Suggest reviewing before use
Discovery: 82%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong, specific description that clearly identifies a narrow domain (security auditing of AI agents in GitHub Actions) with concrete capabilities and excellent trigger terms. Its main weakness is the lack of an explicit 'Use when...' clause, which would help Claude know precisely when to select this skill. The description is well-written in third person and avoids vague language.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when the user asks to review GitHub Actions workflows for security issues, audit CI/CD pipelines with AI agents, or check for prompt injection risks in automated workflows.'
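Applied to this skill's frontmatter, the suggestion might look like the following sketch (the skill name and exact wording are illustrative):

```yaml
---
name: actions-ai-security-audit   # hypothetical name
description: >-
  Audits GitHub Actions workflows for security vulnerabilities in AI agent
  integrations (Claude Code Action, Gemini CLI, OpenAI Codex, GitHub AI
  Inference). Use when the user asks to review GitHub Actions workflows for
  security issues, audit CI/CD pipelines that invoke AI agents, or check for
  prompt injection risks in automated workflows.
---
```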
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: audits GitHub Actions workflows, detects security vulnerabilities in AI agent integrations, names specific tools (Claude Code Action, Gemini CLI, OpenAI Codex, GitHub AI Inference), and identifies specific attack vectors involving attacker-controlled input reaching AI agents in CI/CD pipelines. | 3 / 3 |
| Completeness | The 'what' is well-covered (audits workflows for security vulnerabilities, detects attack vectors), but there is no explicit 'Use when...' clause or equivalent trigger guidance telling Claude when to select this skill. The 'when' is only implied by the description of capabilities. | 2 / 3 |
| Trigger Term Quality | Includes strong natural keywords users would say: 'GitHub Actions', 'security', 'vulnerabilities', 'CI/CD', 'Claude Code Action', 'Gemini CLI', 'OpenAI Codex', 'attack vectors', 'workflows'. These are terms a user concerned about pipeline security would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche: specifically targets security auditing of AI agent integrations in GitHub Actions workflows. The combination of CI/CD security + AI agents + named tools makes it very unlikely to conflict with other skills. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation: 85%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, highly actionable security audit skill with excellent progressive disclosure and clear workflow sequencing. Its main weakness is verbosity — the 'Rationalizations to Reject' section, extensive 'When to Use/NOT to Use' lists, and some explanatory passages could be tightened without losing information. The concrete commands, structured tables, and clear step-by-step methodology make this immediately usable for its intended purpose.
Suggestions
Tighten the 'Rationalizations to Reject' section — each rationalization's explanation repeats the core point 2-3 times. Reduce each to a single sentence explaining why it's wrong plus one concrete example.
Condense the 'When to Use' / 'When NOT to Use' sections — several bullets overlap (e.g., multiple bullets about trigger events and data flow could be merged), and Claude can infer negative scope from positive scope.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is thorough but verbose in places. The 'When to Use' / 'When NOT to Use' sections, the 'Rationalizations to Reject' section, and the extensive severity judgment guidance add significant length. Some of this is genuinely novel domain knowledge Claude wouldn't have, but the framing and explanations could be tightened considerably (e.g., the rationalization explanations repeat the point multiple times). | 2 / 3 |
| Actionability | The skill provides highly concrete, executable guidance: specific `gh api` commands with exact flags and jq expressions, precise action reference matching rules with a lookup table, exact field names to capture per action type, a structured vector detection table with clear quick-check criteria, and a detailed report format with section ordering. The methodology is copy-paste actionable at each step. | 3 / 3 |
| Workflow Clarity | The 5-step methodology is clearly sequenced (Discover → Identify → Capture Context → Analyze Vectors → Report), with each step explicitly building on the previous. Validation checkpoints are present: Step 1 has an early exit if no workflows are found, Step 2 exits if no AI actions are found, Step 0 includes error handling for auth/404 failures, and Step 4 requires reading foundation references before analysis. The cross-file resolution has a depth limit to prevent infinite recursion. | 3 / 3 |
| Progressive Disclosure | The skill masterfully uses progressive disclosure. The main SKILL.md provides the complete methodology overview with summary tables and quick-check columns, then delegates detailed per-vector detection heuristics, action profiles, cross-file resolution procedures, and foundation concepts to clearly referenced files (`{baseDir}/references/vector-{a..i}-*.md`, `action-profiles.md`, `foundations.md`, `cross-file-resolution.md`). All references are one level deep and clearly signaled. | 3 / 3 |
| Total | | 11 / 12 Passed |
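The depth-limited cross-file resolution praised above can be sketched as follows. This is an illustrative approximation, not the skill's actual procedure (which lives in `cross-file-resolution.md`); the regex only handles the local `uses: ./...` form, and `read_file` is a caller-supplied lookup:

```python
import re

# Matches local reusable-workflow references like:
#   uses: ./.github/workflows/deploy.yml
USES_RE = re.compile(r"^\s*uses:\s*\./(\.github/workflows/\S+)", re.MULTILINE)

def resolve_local_uses(path, read_file, depth=0, max_depth=3, seen=None):
    """Collect locally referenced reusable workflows, bounded by max_depth
    so cyclic or deeply nested references cannot recurse forever."""
    if seen is None:
        seen = set()
    if depth > max_depth or path in seen:
        return seen
    seen.add(path)
    text = read_file(path) or ""
    for ref in USES_RE.findall(text):
        resolve_local_uses(ref, read_file, depth + 1, max_depth, seen)
    return seen
```

The `seen` set plus the depth cap means even two workflows that reference each other terminate after visiting each file once.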
Validation: 90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
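A hedged sketch of how that warning is typically resolved; the report does not name the offending keys, so `author` and `tags` below are hypothetical stand-ins:

```yaml
# Before (hypothetical): keys the validator does not recognize
---
name: actions-ai-security-audit
description: Audits GitHub Actions workflows for AI agent security issues.
author: example          # unknown key -> warning
tags: [security, ci]     # unknown key -> warning
---

# After: unknown keys moved under 'metadata'
---
name: actions-ai-security-audit
description: Audits GitHub Actions workflows for AI agent security issues.
metadata:
  author: example
  tags: [security, ci]
---
```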