Review real software repositories for likely security issues using the local OWASP Top 10:2025 category set, official OWASP-mapped CWE lists, and canonical MITRE CWE records. Use when auditing source code, configuration, IaC, pipelines, dependencies, auth flows, crypto, logging, or error handling, and when the goal is evidence-based findings mapped to both CWE and OWASP 2025 with confidence levels and coverage gaps.
72
88%: Does it follow best practices?
Impact: no score (no eval scenarios have been run)
Risky: do not use without reviewing
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly defines its purpose, methodology, and trigger conditions. It provides comprehensive coverage of natural trigger terms across multiple security audit domains, explicitly states both what the skill does and when to use it, and occupies a distinct niche that would be easily distinguishable from other skills.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description lists multiple specific concrete actions: reviewing repositories for security issues, using OWASP Top 10:2025 categories, OWASP-mapped CWE lists, and MITRE CWE records. It also specifies concrete outputs like evidence-based findings mapped to CWE and OWASP 2025 with confidence levels and coverage gaps. | 3 / 3 |
| Completeness | Clearly answers both 'what' (review repositories for security issues using OWASP Top 10:2025, CWE lists, and MITRE CWE records) and 'when' (explicit 'Use when' clause covering auditing source code, configuration, IaC, pipelines, dependencies, auth flows, crypto, logging, error handling, and when the goal is evidence-based findings). | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'security issues', 'auditing source code', 'OWASP', 'CWE', 'auth flows', 'crypto', 'logging', 'error handling', 'dependencies', 'IaC', 'pipelines', 'configuration'. These are terms a user performing a security audit would naturally use. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive with a clear niche: OWASP Top 10:2025-based security auditing with CWE mapping. The specificity of the methodology (OWASP 2025, CWE mapping, confidence levels, coverage gaps) and the enumerated audit targets make it very unlikely to conflict with generic code review or other security-adjacent skills. | 3 / 3 |
| Total | 12 / 12 Passed | |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-crafted, highly actionable security review skill with a clear multi-step workflow, precise decision criteria, and a complete output contract. Its main weakness is moderate verbosity—several sections overlap (precision controls, output rules, non-negotiable behavior) and the category review focus largely duplicates what the referenced detection_playbook.md should provide. The invocation scope and guardrails are excellent for safety and bounding.
Suggestions
Consolidate the overlapping 'Precision Controls', 'Output Rules', and 'Non-Negotiable Runtime Behavior' sections into a single 'Guardrails' section to reduce redundancy and save ~30 lines.
Consider moving the Category Review Focus section entirely to detection_playbook.md (which it already references) and replacing it with a single line pointing there, since the detailed per-category checklist is duplicative.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is thorough and mostly earns its length given the complexity of a full OWASP security review workflow, but there is notable redundancy: precision controls, output rules, and non-negotiable runtime behavior overlap significantly, and some sections restate what Claude would naturally do (e.g., 'do not make unsupported exploitability claims'). The category review focus section largely restates what the detection_playbook.md reference should cover. | 2 / 3 |
| Actionability | The skill provides highly concrete, executable guidance: specific file paths to consult, exact status labels and their definitions, a complete output template with field-level structure, precise severity/confidence rubrics, and detailed guardrails with specific examples (e.g., 'generated code, fixtures, mocks... are not production findings unless wired into shipped behavior'). Every step tells Claude exactly what to do. | 3 / 3 |
| Workflow Clarity | The six-step workflow is clearly sequenced with logical dependencies (detect context → build inventory → generate candidates → map to OWASP → enforce precision → prioritize). Validation is embedded throughout via precision controls (Step 5) that act as explicit checkpoints before a finding can be emitted, and the status rules in Step 3 provide a clear decision framework for evidence quality. | 3 / 3 |
| Progressive Disclosure | The skill references external files well (knowledge/*.json, detection_playbook.md, schemas/finding.schema.json, builder.md) with clear navigation signals, but the SKILL.md itself is quite long and the Category Review Focus section duplicates what detection_playbook.md should contain. Some content (like the full output template and detailed guardrails) could potentially be split into referenced files for better organization, though no bundle files were provided to verify the referenced structure exists. | 2 / 3 |
| Total | 10 / 12 Passed | |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_version | 'metadata.version' is missing | Warning |
| Total | 10 / 11 Passed | |