The skill description under review:

> When the user needs a security assessment — threat modeling, vulnerability review, auth flow audit, dependency scanning, or says "is this secure", "review for vulnerabilities", "threat model", "security audit", "pen test prep".
Score: 83
Quality: 80% (Does it follow best practices?)
Impact: Pending (no eval scenarios have been run)
Risk: Risky (do not use without reviewing)

Optimize this skill with Tessl:

`npx tessl skill review --optimize ./skills/security-review/SKILL.md`

## Quality
### Discovery: 82%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description excels at trigger term coverage and distinctiveness, providing excellent natural-language phrases users would actually say. However, it leans heavily toward describing 'when' to use the skill while underspecifying 'what' the skill concretely does — it lists assessment types but doesn't describe the outputs or actions performed (e.g., 'generates threat models', 'produces vulnerability reports'). The structure is inverted from the ideal pattern of 'what it does' followed by 'use when'.
Suggestions:

- Add explicit 'what' statements describing concrete outputs, e.g., 'Performs security assessments including threat models, vulnerability reports, and remediation recommendations.'
- Restructure to follow the 'what then when' pattern: lead with capabilities/actions, then follow with 'Use when...' trigger guidance.
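One way to apply both suggestions, sketched as a hypothetical rewritten frontmatter description (the field names and YAML layout are assumptions about how this SKILL.md is structured, not the skill's actual frontmatter):

```markdown
---
name: security-review
description: >
  Performs security assessments: generates threat models, vulnerability
  reports, auth flow audits, dependency scan results, and remediation
  recommendations. Use when the user asks "is this secure", "review for
  vulnerabilities", "threat model", "security audit", or "pen test prep".
---
```

This leads with the concrete outputs (the 'what') and closes with the trigger phrases (the 'when'), preserving the trigger term coverage the review scored highly.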
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: threat modeling, vulnerability review, auth flow audit, dependency scanning. These are distinct, well-defined security assessment activities. | 3 / 3 |
| Completeness | The 'when' is very well covered with explicit trigger phrases, but the 'what' (what the skill actually does or produces) is only implied through the list of assessment types. It doesn't clearly state what actions the skill performs or what output it generates; it describes when to use it more than what it does. | 2 / 3 |
| Trigger Term Quality | Excellent coverage of natural trigger terms users would say: 'is this secure', 'review for vulnerabilities', 'threat model', 'security audit', 'pen test prep'. These are highly natural phrases a user would actually type. | 3 / 3 |
| Distinctiveness / Conflict Risk | Security assessment is a clear niche with distinct trigger terms like 'threat model', 'security audit', 'pen test prep', and 'vulnerability review'. These are unlikely to conflict with general code review or other development skills. | 3 / 3 |
| **Total** | | **11 / 12 Passed** |
### Implementation: 77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a comprehensive and highly actionable security review skill with a well-defined five-phase workflow, specific tooling commands, and a clear output template with a concrete example. Its main weakness is length — several reference sections (OWASP Top 10, STRIDE details, CVSS guide) contain information Claude already knows and could either be trimmed significantly or moved to separate reference files. The workflow sequencing and validation steps are strong.
Suggestions:

- Move the OWASP Top 10 Checks, STRIDE details, and CVSS Scoring Guide to a separate REFERENCE.md file and link to it, keeping SKILL.md focused on the workflow and output format.
- Trim explanatory text from framework sections. Claude already knows what STRIDE categories and OWASP items mean; focus only on the specific checks and thresholds unique to this skill's context.
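A minimal sketch of how the restructured SKILL.md could link out to the extracted reference material. The section names and phase wording here are illustrative, reconstructed from the workflow constraints described below, not the skill's actual headings:

```markdown
# Security Review

## Workflow
1. Confirm authorization before any active testing
2. Run automated scanners (semgrep, npm audit / pip-audit / trivy / govulncheck)
3. Manually review auth flows and high-risk code paths
4. Validate automated findings to eliminate false positives
5. Report findings using the output template, with remediation priorities

For the OWASP Top 10 checklist, STRIDE category details, and the CVSS
scoring guide, see [REFERENCE.md](REFERENCE.md).
```

Keeping only the skill-specific checks and thresholds inline, and pushing general framework material into the linked file, addresses both the Conciseness and Progressive Disclosure findings at once.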
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is generally well-structured but includes some content Claude already knows (OWASP Top 10 descriptions, basic CVSS definitions, what STRIDE categories mean). The checklists and framework sections could be more concise, as Claude is already familiar with these security concepts. However, the specific tool commands and thresholds add genuine value. | 2 / 3 |
| Actionability | Provides specific, executable commands (semgrep, npm audit, pip-audit, trivy, govulncheck), concrete code examples in the output snippet, specific configuration values (argon2id cost >= 10, 15-min access tokens, 5 attempts/15 min rate limiting), and a complete output template with exact formatting. The example output demonstrates precisely what a good finding looks like. | 3 / 3 |
| Workflow Clarity | The five-phase workflow is clearly sequenced with explicit ordering constraints (automated scanning before manual review, authorization before active testing). Phase 4 includes validation of automated findings to eliminate false positives, creating a feedback loop. The mandatory constraints section reinforces sequencing requirements. Remediation priority provides clear timelines. | 3 / 3 |
| Progressive Disclosure | The skill references related skills (code-review, architecture-design, soc2-prep) and startup-context, which is good. However, the OWASP Top 10, STRIDE details, Auth Flow Checklist, and CVSS scoring guide are all inline, making this a lengthy document. These reference sections could be split into separate files with links from the main skill, keeping SKILL.md as a leaner overview. | 2 / 3 |
| **Total** | | **10 / 12 Passed** |
### Validation: 90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| **Total** | 10 / 11 Passed | |
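A hedged example of the fix for the `frontmatter_unknown_keys` warning: moving an unrecognized top-level key under `metadata`. The key name `owner` is invented for illustration; the report does not show which key actually triggered the warning:

```markdown
---
name: security-review
description: Performs security assessments...
metadata:
  owner: platform-security  # previously a top-level key the spec doesn't recognize
---
```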
Revision: 4ad31b4