gtfobins-validate

Validate shell builtins against GTFOBins attack patterns to ensure exploits are blocked by the sandbox

Quality

—

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Advisory

Suggest reviewing before use

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The content is highly actionable with a clear, checkpoint-driven workflow and copy-paste Go test patterns. It is held back from top marks by mild verbosity and by being a long monolithic single file with no progressive disclosure of reference material.

Suggestions

Move the "Known GTFOBins attack patterns for current builtins" catalog into a `references/gtfobins-patterns.md` file and link to it, keeping SKILL.md as an overview.

Tighten or move the security banner and classification table to reduce inline length without losing the actionable workflow.

Dimension	Reasoning	Score
Conciseness	The body is mostly efficient with no basic-concept padding, but the security banner, the inline "Known GTFOBins attack patterns" reference catalog, and the classification table add length that could be tightened.	2 / 3
Actionability	Concrete file paths (e.g. `interp/builtins/builtin_<command>_pentest_test.go`), an executable `go test ./interp/... -run TestCmdGTFOBins -timeout 120s -v` command, and copy-paste-ready Go test templates with real assertions make the guidance fully actionable.	3 / 3
Workflow Clarity	Six explicitly sequenced steps include validation checkpoints: Step 5 runs tests and verifies, and the "Critical findings" section defines a stop-and-report loop with re-run after a fix, satisfying the feedback-loop requirement.	3 / 3
Progressive Disclosure	No bundle files exist (references/scripts/assets are absent), so this is a ~170-line single-file monolith; the inline "Known GTFOBins attack patterns" catalog is reference material that could be split into a separate file rather than kept inline.	2 / 3
	Total	10 / 12 Passed
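
The review refers to copy-paste Go test patterns but does not reproduce them. As a rough illustration of the kind of check a GTFOBins-style pentest might assert, the sketch below classifies probe paths as inside or outside a sandbox root. Everything in it is an assumption made for illustration: the `probe` type, `escapesRoot`, `mustBlock`, the `/sandbox` root, and the `cat`/`head` examples are hypothetical and are not taken from rshell. The check is purely lexical, assumes Unix-style paths, and does not follow symlinks.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// probe is a hypothetical GTFOBins-style attempt: a builtin name and the
// path it tries to reach. It is not a type from rshell.
type probe struct {
	builtin string
	target  string
}

// escapesRoot reports whether target, resolved against root, lands outside
// root. The check is lexical only: it does not follow symlinks.
func escapesRoot(root, target string) bool {
	if !filepath.IsAbs(target) {
		target = filepath.Join(root, target)
	}
	rel, err := filepath.Rel(filepath.Clean(root), filepath.Clean(target))
	if err != nil {
		return true // treat unresolvable paths as escapes
	}
	return rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// mustBlock returns the probes that a sandbox rooted at root should reject.
func mustBlock(root string, probes []probe) []probe {
	var out []probe
	for _, p := range probes {
		if escapesRoot(root, p.target) {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Illustrative probes only; the real skill's pattern catalog is not shown here.
	probes := []probe{
		{"cat", "/etc/shadow"},
		{"cat", "notes.txt"},
		{"head", "../../etc/passwd"},
	}
	for _, p := range mustBlock("/sandbox", probes) {
		fmt.Printf("expect block: %s %s\n", p.builtin, p.target)
	}
}
```

In a real suite, each probe would more likely be executed through the shell under test, with the test asserting that the sandbox returns an error; the lexical check here only stands in for that step.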

Description

72%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is specific and distinctive with strong trigger terms, but it omits an explicit "Use when..." trigger clause, capping completeness. Adding when-to-use guidance would raise it to the top level.

Suggestions

Append an explicit trigger clause, e.g. "Use when auditing the shell's builtins for sandbox-escape risks or when adding a new builtin that needs GTFOBins coverage."

Optionally enumerate the concrete validation actions (classify techniques, generate pentest tests, produce a validation report) to lift specificity to 3.

Dimension	Reasoning	Score
Specificity	"Validate shell builtins against GTFOBins attack patterns to ensure exploits are blocked by the sandbox" names the domain and a concrete action, but it is a single composite action rather than a list of multiple distinct concrete actions.	2 / 3
Completeness	It clearly answers what the skill does, but there is no "Use when..." clause or equivalent explicit trigger guidance, which caps completeness at 2.	2 / 3
Trigger Term Quality	Natural, domain-specific terms a user would say are well covered: "Validate", "shell builtins", "GTFOBins", "attack patterns", "exploits", "sandbox".	3 / 3
Distinctiveness / Conflict Risk	The GTFOBins-validation niche is highly specific and unlikely to trigger for or conflict with unrelated skills.	3 / 3
	Total	10 / 12 Passed
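
The Completeness finding turns on whether the description contains an explicit trigger clause. A minimal sketch of such a check follows. It assumes the rule can be approximated by a case-insensitive search for the phrase "use when"; the reviewer's actual rubric is not given in this report, and `hasTriggerClause` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// hasTriggerClause reports whether a skill description contains an explicit
// "Use when" clause. Matching on this single phrase is an assumption; the
// real scoring rule may accept other wordings.
func hasTriggerClause(description string) bool {
	return strings.Contains(strings.ToLower(description), "use when")
}

func main() {
	current := "Validate shell builtins against GTFOBins attack patterns to ensure exploits are blocked by the sandbox"
	// The revised text appends the trigger clause proposed in the suggestions above.
	revised := current + ". Use when auditing the shell's builtins for sandbox-escape risks or when adding a new builtin that needs GTFOBins coverage."

	fmt.Println("current has trigger clause:", hasTriggerClause(current))
	fmt.Println("revised has trigger clause:", hasTriggerClause(revised))
}
```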

Validation

93%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 15 / 16 Passed

Validation for skill structure

Criteria	Description	Result
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning

	Total	15 / 16 Passed
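
The `frontmatter_unknown_keys` warning does not say which keys were flagged. The sketch below shows one way such a check could work, assuming frontmatter is delimited by `---` lines and that only `name`, `description`, and `metadata` are recognized. That allowlist is an assumption for illustration and not the spec's actual key set, and `unknownFrontmatterKeys` is a hypothetical helper that inspects top-level keys only, without a full YAML parser.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// allowedKeys is an assumed allowlist; the real spec's key set is not stated
// in this review.
var allowedKeys = map[string]bool{"name": true, "description": true, "metadata": true}

// unknownFrontmatterKeys returns, in sorted order, the top-level frontmatter
// keys of doc that are not in allowedKeys. It returns nil when doc has no
// leading "---" line.
func unknownFrontmatterKeys(doc string) []string {
	lines := strings.Split(doc, "\n")
	if strings.TrimSpace(lines[0]) != "---" {
		return nil
	}
	var unknown []string
	for _, line := range lines[1:] {
		if strings.TrimSpace(line) == "---" {
			break // end of frontmatter
		}
		// Skip blank lines, comments, list items, and indented (nested) lines.
		if line == "" || strings.ContainsRune(" \t#-", rune(line[0])) {
			continue
		}
		key, _, found := strings.Cut(line, ":")
		if !found {
			continue
		}
		key = strings.TrimSpace(key)
		if !allowedKeys[key] {
			unknown = append(unknown, key)
		}
	}
	sort.Strings(unknown)
	return unknown
}

func main() {
	// "owner" is a made-up key used only to trigger the check.
	doc := "---\nname: gtfobins-validate\ndescription: example\nowner: team\nmetadata:\n  tier: 1\n---\nbody text\n"
	fmt.Println(unknownFrontmatterKeys(doc))
}
```

Following the warning's own advice, a flagged key could either be removed or nested under `metadata`, which this sketch treats as an allowed container.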

Repository: DataDog/rshell
Commit: a0a1140

Reviewed: about 16 hours ago
