Validate shell builtins against GTFOBins attack patterns to ensure exploits are blocked by the sandbox
58
67%
Does it follow best practices?
Impact
—
No eval scenarios have been run
Advisory
Suggest reviewing before use
Optimize this skill with Tessl
npx tessl skill review --optimize ./.claude/skills/gtfobins-validate/SKILL.md
Quality
Discovery
57%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description targets a clear and distinctive niche (GTFOBins attack pattern validation for shell builtins in a sandbox), which makes it unlikely to conflict with other skills. However, it lacks an explicit 'Use when...' clause and could benefit from listing more concrete actions and natural trigger terms that users might employ when needing this capability.
Suggestions
Add an explicit 'Use when...' clause, e.g., 'Use when checking shell commands for GTFOBins exploits, auditing sandbox security, or validating that builtins cannot be used for privilege escalation.'
Include additional natural trigger terms users might say, such as 'privilege escalation', 'security audit', 'command injection', 'LOLBAS', or 'shell escape'.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description names a specific domain (shell builtins, GTFOBins attack patterns, sandbox) and a core action (validate/ensure exploits are blocked), but it doesn't list multiple concrete actions; it's essentially one action described with context. | 2 / 3 |
| Completeness | The 'what' is reasonably clear (validate shell builtins against GTFOBins patterns), but there is no explicit 'Use when...' clause or equivalent trigger guidance, which caps this at 2 per the rubric guidelines. | 2 / 3 |
| Trigger Term Quality | Includes relevant technical keywords like 'shell builtins', 'GTFOBins', 'sandbox', and 'exploit', which are terms a security-minded user might use. However, it misses common variations like 'privilege escalation', 'security audit', 'LOLBAS', or 'command injection' that users might naturally say. | 2 / 3 |
| Distinctiveness Conflict Risk | This is a very specific niche: GTFOBins validation of shell builtins against sandbox restrictions. It is highly unlikely to conflict with other skills due to the narrow and specialized domain. | 3 / 3 |
| Total | 9 / 12 Passed | |
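The two Discovery suggestions above (an explicit 'Use when...' clause and extra trigger terms) could be checked mechanically before re-running the review. The following is a minimal Go sketch of such a check, not part of Tessl or of the skill itself; the helper names `hasUseWhenClause` and `missingTriggerTerms` are hypothetical, and simple case-insensitive substring matching is an assumption about what counts as a match.

```go
package main

import (
	"fmt"
	"strings"
)

// hasUseWhenClause reports whether the description contains an explicit
// "Use when" trigger clause (case-insensitive substring match).
// Hypothetical helper for illustration only.
func hasUseWhenClause(description string) bool {
	return strings.Contains(strings.ToLower(description), "use when")
}

// missingTriggerTerms returns the terms from want that do not occur in the
// description, compared case-insensitively. Hypothetical helper.
func missingTriggerTerms(description string, want []string) []string {
	lower := strings.ToLower(description)
	var missing []string
	for _, term := range want {
		if !strings.Contains(lower, strings.ToLower(term)) {
			missing = append(missing, term)
		}
	}
	return missing
}

func main() {
	// The skill description as quoted at the top of this review.
	desc := "Validate shell builtins against GTFOBins attack patterns to ensure exploits are blocked by the sandbox"
	// Trigger terms suggested by the review.
	terms := []string{"privilege escalation", "security audit", "command injection", "LOLBAS", "shell escape"}

	fmt.Println("has 'Use when' clause:", hasUseWhenClause(desc))
	fmt.Println("missing trigger terms:", missingTriggerTerms(desc, terms))
}
```

A check like this only confirms that the wording is present; whether the description actually improves discovery is still decided by the review rubric.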
Implementation
77%Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a strong, highly actionable security validation skill with a clear multi-step workflow, explicit validation checkpoints, and executable test patterns. Its main weakness is moderate verbosity — some information is repeated across sections (classification table vs. known patterns vs. notes), and the document could benefit from splitting reference material into separate files. The security preamble addressing prompt injection is well-justified for the threat model.
Suggestions
Extract the 'Known GTFOBins attack patterns for current builtins' section into a separate reference file (e.g., KNOWN_PATTERNS.md) to reduce the main skill's length and improve progressive disclosure.
Consolidate the Notes section with the classification table in Step 3 to eliminate redundant explanations of why SUID/Sudo/Shell/Write attacks are N/A.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably well-structured but includes some redundant content. The 'Known GTFOBins attack patterns' section partially duplicates what the workflow already describes, and the Notes section restates information from the classification table. The security preamble is justified given the threat model but is somewhat verbose. | 2 / 3 |
| Actionability | The skill provides fully executable Go test patterns, specific file paths, exact bash commands to run tests, concrete flag examples (e.g., `-c-0`, `--files0-from`), and clear naming conventions. The test templates are copy-paste ready with only command-specific substitutions needed. | 3 / 3 |
| Workflow Clarity | The 6-step workflow is clearly sequenced with explicit validation checkpoints: Step 5 runs tests and verifies, Step 6 generates a report, and the 'Critical findings' section defines a stop-and-report feedback loop for exploitable techniques. The classification table in Step 3 provides clear decision criteria for each category. | 3 / 3 |
| Progressive Disclosure | The content is well-organized with clear sections and headers, but it's a monolithic document (~180 lines of substantive content) with no references to external files. The known attack patterns section and detailed test templates could be split into separate reference files. However, no bundle files exist to support such splitting, which limits the score. | 2 / 3 |
| Total | 10 / 12 Passed | |
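The Actionability row refers to Go test patterns built around concrete flags such as `-c-0` and `--files0-from`. The skill's actual templates are not reproduced in this review, so the following is only a rough sketch of what a table-driven check of that kind might look like. Everything in it is an assumption made for illustration: the allowlist contents, the exact-match rule, and the names `allowedFlags`, `flagAllowed`, `attackCase`, and `runCases` are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedFlags is a HYPOTHETICAL allowlist for an imaginary builtin.
// The real sandbox rules are not shown in this review.
var allowedFlags = map[string]bool{"-n": true, "-c": true}

// flagAllowed reports whether a single argument is permitted.
// Arguments without a leading "-" are treated as operands and allowed;
// anything starting with "-" must match the allowlist exactly.
func flagAllowed(arg string) bool {
	if !strings.HasPrefix(arg, "-") {
		return true
	}
	return allowedFlags[arg]
}

// attackCase is one row of the table: an argument vector and whether
// the sandbox is expected to block it.
type attackCase struct {
	name        string
	args        []string
	wantBlocked bool
}

// runCases returns the names of cases whose outcome differs from the
// expectation recorded in the table.
func runCases(cases []attackCase) []string {
	var failures []string
	for _, c := range cases {
		blocked := false
		for _, a := range c.args {
			if !flagAllowed(a) {
				blocked = true
				break
			}
		}
		if blocked != c.wantBlocked {
			failures = append(failures, c.name)
		}
	}
	return failures
}

func main() {
	cases := []attackCase{
		{name: "fused flag -c-0", args: []string{"-c-0"}, wantBlocked: true},
		{name: "long flag --files0-from", args: []string{"--files0-from=/dev/stdin"}, wantBlocked: true},
		{name: "plain -c with operand", args: []string{"-c", "10"}, wantBlocked: false},
	}
	fmt.Println("unexpected outcomes:", runCases(cases))
}
```

In a real test file the same table would normally be driven through the standard `testing` package with `t.Run` per case; the plain function form here just keeps the sketch runnable on its own.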
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
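The `frontmatter_unknown_keys` warning does not say which keys were flagged. As a rough way to find candidates locally, the Go sketch below lists top-level frontmatter keys that fall outside a known set. The known set used here (`name`, `description`, `license`, `allowed-tools`, `metadata`) is an assumption, not the validator's authoritative list, and the line-based key extraction is a simplification rather than a real YAML parser. The helper names and the sample `tags` key are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// knownKeys is an ASSUMED set of recognized frontmatter keys; the
// validator's real list is not given in this review.
var knownKeys = map[string]bool{
	"name":          true,
	"description":   true,
	"license":       true,
	"allowed-tools": true,
	"metadata":      true,
}

// topLevelKeys extracts unindented "key:" names from a frontmatter block.
// It skips blank lines, "---" delimiters, comments, and indented (nested)
// lines. This is a simplification, not a YAML parser.
func topLevelKeys(frontmatter string) []string {
	var keys []string
	for _, line := range strings.Split(frontmatter, "\n") {
		if line == "" || line == "---" ||
			strings.HasPrefix(line, " ") || strings.HasPrefix(line, "\t") ||
			strings.HasPrefix(line, "#") {
			continue
		}
		if i := strings.Index(line, ":"); i > 0 {
			keys = append(keys, strings.TrimSpace(line[:i]))
		}
	}
	return keys
}

// unknownKeys returns, in sorted order, the keys not present in knownKeys.
func unknownKeys(keys []string) []string {
	var unknown []string
	for _, k := range keys {
		if !knownKeys[k] {
			unknown = append(unknown, k)
		}
	}
	sort.Strings(unknown)
	return unknown
}

func main() {
	// Hypothetical frontmatter; "tags" stands in for whatever key the
	// validator actually flagged.
	fm := "---\nname: gtfobins-validate\ndescription: example\ntags: security\n---"
	fmt.Println("unknown keys:", unknownKeys(topLevelKeys(fm)))
}
```

Following the warning's own advice, any key reported this way could either be removed or moved under `metadata`.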
00bdc03