Guides controlled exploitation of validated vulnerabilities to measure real-world impact. Use when the user requests proof-of-concept validation, privilege escalation testing, or attack path confirmation in an authorized environment.
88
83%
Does it follow best practices?
Impact
96%
1.50x
Average score across 3 eval scenarios
Passed
No known issues
Quality
Discovery
89%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a well-crafted description with a clear 'Use when' clause, strong domain-specific trigger terms, and a distinct niche in offensive security testing. The main weakness is that the 'what' portion could be slightly more specific about the concrete actions performed (e.g., payload crafting, lateral movement simulation, credential harvesting) rather than the somewhat abstract 'guides controlled exploitation to measure real-world impact.'
Suggestions
Add 2-3 more concrete actions to the 'what' clause, such as 'crafts payloads, simulates lateral movement, tests credential harvesting' to increase specificity.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (exploitation/vulnerability testing) and some actions (proof-of-concept validation, privilege escalation testing, attack path confirmation), but the primary clause 'guides controlled exploitation of validated vulnerabilities to measure real-world impact' is somewhat abstract rather than listing multiple concrete discrete actions. | 2 / 3 |
| Completeness | Clearly answers both 'what' (guides controlled exploitation of validated vulnerabilities to measure real-world impact) and 'when' (explicit 'Use when' clause specifying proof-of-concept validation, privilege escalation testing, or attack path confirmation in an authorized environment). | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms that a user in this domain would actually say: 'proof-of-concept validation', 'privilege escalation testing', 'attack path confirmation', 'exploitation', 'validated vulnerabilities', and 'authorized environment'. These cover the key phrases a penetration tester would use. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive niche focused specifically on exploitation and post-validation attack confirmation, clearly distinguishable from vulnerability scanning, reporting, or general security skills. The terms 'exploitation', 'privilege escalation', and 'attack path confirmation' carve out a clear niche. | 3 / 3 |
| Total | | 11 / 12 Passed |
Implementation
77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, concise skill for guiding controlled exploitation during penetration testing. Its strengths are clear workflow sequencing with safety boundaries and a useful output template. Its main weakness is the lack of concrete, executable examples (specific tool commands or sample exploit scenarios) that would make the guidance more immediately actionable.
Suggestions
Add 1-2 concrete examples with specific tool commands (e.g., a Metasploit module usage, a curl command for a web vuln) to demonstrate what 'minimal payloads' and 'controlled validation' look like in practice.
Add cross-references to related skills or documents (e.g., scanning/enumeration phase, post-exploitation, rules of engagement template) to improve progressive disclosure and workflow continuity.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is lean and efficient. It avoids explaining what penetration testing is or how exploits work, concepts Claude already knows. Every section serves a clear purpose with no padding. | 3 / 3 |
| Actionability | The workflow provides clear procedural guidance and the output template is concrete and copy-paste ready. However, there are no executable code examples, specific tool commands (e.g., Metasploit, sqlmap), or concrete exploit examples; the guidance remains at the instructional level without specific technical commands. | 2 / 3 |
| Workflow Clarity | The 5-step workflow is clearly sequenced with explicit safety checkpoints (confirm preconditions and rollback plan, define PoC boundaries before execution, assess blast radius after exploitation). The quality checks section serves as a validation checklist, and the workflow includes containment recommendations as a feedback mechanism. | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear sections and an output template, but it's entirely self-contained with no references to related materials (e.g., scanning phase skill, post-exploitation skill, rules of engagement templates). For a skill that references 'prioritized, in-scope findings from scanning' and 'handoff,' links to adjacent skills would improve navigation. | 2 / 3 |
| Total | | 10 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure
No warnings or errors.