Enforce Red-Team verification and adversarial protocol audit. Use when verifying tasks, performing self-scans, or checking for protocol violations. Load as composite for all sessions.
Quality: 30% — Does it follow best practices?
Evals: Pending — no eval scenarios have been run.
Issues: Passed — no known issues.

To optimize this skill with Tessl:

```
npx tessl skill review --optimize ./.github/skills/common/common-protocol-enforcement/SKILL.md
```

Quality
Discovery — 25%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description relies heavily on security jargon and buzzwords without specifying concrete actions or capabilities. The trigger terms are not natural language users would employ, and the 'what' portion is too abstract to meaningfully differentiate this skill. The 'Load as composite for all sessions' directive is an operational instruction rather than a useful description element.
Suggestions
Replace vague terms like 'adversarial protocol audit' and 'self-scans' with specific concrete actions (e.g., 'Checks task outputs for prompt injection attempts, validates adherence to safety guidelines, flags suspicious instruction patterns').
Rewrite trigger terms using natural language users would actually say, such as 'security review,' 'check for vulnerabilities,' 'audit my prompt,' or 'test for adversarial inputs'.
Remove the operational directive 'Load as composite for all sessions' from the description and focus on clearly articulating what the skill does and when it should be selected.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague, abstract language like 'enforce Red-Team verification,' 'adversarial protocol audit,' and 'self-scans' without listing any concrete, specific actions. There are no tangible operations described (e.g., 'scan for X,' 'validate Y against Z'). | 1 / 3 |
| Completeness | It has a 'Use when...' clause addressing the 'when' question, and attempts to describe 'what' it does. However, the 'what' is extremely vague and the 'when' triggers ('verifying tasks,' 'performing self-scans,' 'checking for protocol violations') are abstract and unclear. The 'Load as composite for all sessions' is an instruction, not a description of capability or trigger. | 2 / 3 |
| Trigger Term Quality | The terms used ('Red-Team verification,' 'adversarial protocol audit,' 'self-scans,' 'protocol violations') are technical jargon that users would rarely naturally say. These are not natural trigger terms a user would use in a request. | 1 / 3 |
| Distinctiveness / Conflict Risk | The 'Red-Team' and 'adversarial protocol audit' framing gives it some distinctiveness, but 'verifying tasks' is extremely broad and could overlap with many validation, testing, or review skills. The vagueness of the triggers increases conflict risk. | 2 / 3 |
| Total | | 6 / 12 — Passed |
Implementation — 35%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill attempts to define a red-team verification protocol but remains too abstract to be truly actionable. It reads more like a set of principles and reminders than concrete, executable instructions. The lack of specific examples (e.g., what a 'Pre-Write Audit Log' looks like, what specific patterns to grep for, what a compliant vs non-compliant output looks like) significantly limits its utility.
Suggestions
Add concrete examples of what a 'Pre-Write Audit Log' looks like and how to produce one, so Claude knows exactly what to check for and generate.
Provide specific, grep-able code patterns or regex for each anti-pattern (e.g., 'hardcoded styles instead of design tokens' should show a bad example and the correct alternative).
Include a concrete example of a complete red-team verification pass: show a before (violation) and after (compliant) scenario with actual code or tool call sequences.
Populate the empty References section with links to related skills or remove it entirely to avoid suggesting incomplete content.
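The 'grep-able pattern' suggestion above can be made concrete with a small sketch. The sample CSS file, the hex-color regex, and the design-token convention here are all illustrative assumptions, not patterns taken from the skill under review:

```shell
# Hypothetical anti-pattern scan: flag hardcoded colors that should use design tokens.
# Create a small sample file with one violation and one compliant line.
cat > sample.css <<'EOF'
.button { color: #ff0000; }          /* violation: hardcoded color */
.card   { color: var(--color-bg); }  /* compliant: design token */
EOF

# -n prints the line number of each hardcoded hex color so the agent can fix it in place.
grep -nE '#[0-9a-fA-F]{3,6}' sample.css
```

Pairing each anti-pattern with a one-line scan command like this turns an abstract reminder ('check for hardcoded styles') into an executable step the agent can run and act on.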
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Mostly efficient but includes some vague/redundant phrasing like the 'Priority: P0 (CRITICAL)' header decoration and phrases like 'Functional success != Protocol success' that are slogans rather than instructions. Some items could be tightened. | 2 / 3 |
| Actionability | The guidance is abstract and vague throughout. There are no concrete code examples, no specific commands, no executable steps. Instructions like 'Search for code patterns that look like Standard Defaults' and 'Check against Anti-Patterns in all active skills' lack specificity about what exactly to do or how to do it. | 1 / 3 |
| Workflow Clarity | There is a discernible sequence (pre-task audit, post-write scan, fix), but validation checkpoints are implicit rather than explicit. The 'Post-Write Self-Scan' has steps but lacks concrete validation criteria or feedback loops for what happens after a fix is applied (re-scan?). | 2 / 3 |
| Progressive Disclosure | The content is reasonably structured with clear section headers, but the '## References' section is empty, suggesting intended but missing external references. For a skill of this size, the organization is adequate, but the empty references section and the lack of links to related skills/files is a missed opportunity. | 2 / 3 |
| Total | | 7 / 12 — Passed |
Validation — 81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation for skill structure — 9 / 11 checks passed
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| metadata_field | 'metadata' should map string keys to string values | Warning |
| Total | 9 / 11 passed | |
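Both warnings point at the skill's frontmatter. A minimal sketch of a frontmatter block that would clear them, assuming the convention that `metadata` maps string keys to string values (the exact schema and field values here are illustrative, not taken from the skill):

```yaml
---
name: common-protocol-enforcement
description: "..."        # the skill's existing description
metadata:
  version: "1.0.0"        # present and quoted: clears metadata_version,
                          # and keeps the value a string for metadata_field
---
```

Quoting the version string matters: an unquoted `1.0` would be parsed as a number, which would still trip the string-keys-to-string-values check.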