docker-sandbox

Create, manage, and execute agent tools (claude, codex) inside Docker sandboxes for isolated code execution. Use when running agent loops, spawning tool subprocesses, or any task requiring process isolation. Triggers on "sandbox", "isolated execution", "docker sandbox", "safe agent execution", or when working on agent loop infrastructure.

Quality: 88%
Does it follow best practices?

Impact: 100% (3.22x)
Average score across 3 eval scenarios

Security: Risky
Do not use without reviewing

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that clearly articulates specific capabilities (creating and managing agent tools in Docker sandboxes), provides explicit 'Use when' guidance with multiple trigger scenarios, and includes natural trigger terms. The description is concise, uses third-person voice correctly, and occupies a distinct niche that minimizes conflict risk with other skills.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: 'Create, manage, and execute agent tools (claude, codex) inside Docker sandboxes for isolated code execution.' This names specific tools, the container technology, and the purpose clearly.	3 / 3
Completeness	Clearly answers both 'what' (create, manage, execute agent tools in Docker sandboxes) and 'when' (explicit 'Use when' clause with multiple trigger scenarios, plus a 'Triggers on' list). Both dimensions are well-covered.	3 / 3
Trigger Term Quality	Includes strong natural trigger terms: 'sandbox', 'isolated execution', 'docker sandbox', 'safe agent execution', 'agent loop infrastructure', 'agent loops', 'tool subprocesses', 'process isolation'. Good coverage of terms a user would naturally use.	3 / 3
Distinctiveness / Conflict Risk	Highly distinctive niche combining Docker sandboxes, agent tools (claude, codex), and process isolation. Unlikely to conflict with general Docker skills or general agent skills due to the specific intersection of sandbox execution for agent loops.	3 / 3
	Total	12 / 12 Passed

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, highly actionable skill with excellent executable examples and clear workflow sequencing for Docker sandbox management. Its main weakness is length—the document packs auth setup, agent loop patterns, TypeScript implementation details, custom templates, and troubleshooting into a single file that would benefit from progressive disclosure via supporting bundle files. The content is mostly concise but includes some project-specific implementation details (utils.ts signatures) that inflate the token cost.

Suggestions

Split auth setup, agent loop integration, and implementation details (utils.ts) into separate referenced files to improve progressive disclosure and reduce the main SKILL.md token footprint.

Consider moving the TypeScript function signatures to a separate IMPLEMENTATION.md file, since they are project-specific implementation details rather than core sandbox usage instructions.

Dimension	Reasoning	Score
Conciseness	The skill is mostly efficient and avoids explaining basic concepts, but includes some sections that could be tightened—e.g., the 'Implementation in utils.ts' section with TypeScript function signatures is somewhat verbose and project-specific, and the timing tables, while useful, add bulk. The token refresh table and template table are concise and well-structured.	2 / 3
Actionability	The skill provides fully executable, copy-paste-ready bash commands throughout—sandbox creation, auth setup, exec commands, network control, template saving, and troubleshooting diagnostics. The TypeScript function signatures give concrete API shapes. Every section has specific, runnable examples.	3 / 3
Workflow Clarity	The agent loop integration section provides a clear multi-step workflow (create → inject auth → exec per story → destroy), the pre-warm pattern is well-sequenced with explicit lifecycle phases, and the fallback to host mode provides an error recovery path. The 'Replacing spawnTool()' section has a clear decision tree (check sandbox exists → exec or fallback). Troubleshooting section provides validation/diagnostic steps.	3 / 3
Progressive Disclosure	The content is well-structured with clear section headers and a logical progression from prerequisites to quick reference to detailed usage. However, the document is quite long (~200 lines of substantive content) and could benefit from splitting the auth setup, agent loop integration, and implementation details into separate referenced files. The ADR link is a good external reference, but no bundle files exist to offload detailed content.	2 / 3
	Total	10 / 12 Passed
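
The lifecycle and decision tree named in the Workflow Clarity row (create → inject auth → exec per story → destroy, and check sandbox exists → exec or fallback) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the skill's actual utils.ts code: `SandboxBackend`, `HostRunner`, `runTool`, `runStories`, and `makeFakeBackend` are hypothetical names, and an in-memory fake stands in for real Docker calls so the control flow can be run without Docker installed.

```typescript
// Hypothetical sketch: none of these names come from the reviewed skill.

/** Minimal operations a sandbox backend is assumed to expose. */
interface SandboxBackend {
  exists(name: string): boolean;
  create(name: string): void;
  injectAuth(name: string): void;
  exec(name: string, command: string[]): string;
  destroy(name: string): void;
}

/** Runs a command directly on the host; used when no sandbox is available. */
type HostRunner = (command: string[]) => string;

interface RunResult {
  mode: "sandbox" | "host";
  output: string;
}

/** Decision tree: exec inside the sandbox if it exists, otherwise fall back to the host. */
function runTool(
  backend: SandboxBackend,
  host: HostRunner,
  name: string,
  command: string[],
): RunResult {
  if (backend.exists(name)) {
    return { mode: "sandbox", output: backend.exec(name, command) };
  }
  return { mode: "host", output: host(command) };
}

/** Lifecycle: create, inject auth, exec once per story, then always destroy. */
function runStories(
  backend: SandboxBackend,
  host: HostRunner,
  name: string,
  stories: string[][],
): RunResult[] {
  backend.create(name);
  try {
    backend.injectAuth(name);
    return stories.map((command) => runTool(backend, host, name, command));
  } finally {
    // Runs even if a story throws, so the sandbox is not leaked.
    backend.destroy(name);
  }
}

/** In-memory stand-in for a real backend; records each call in `log`. */
function makeFakeBackend(log: string[]): SandboxBackend {
  const live = new Set<string>();
  return {
    exists: (name) => live.has(name),
    create: (name) => { live.add(name); log.push(`create ${name}`); },
    injectAuth: (name) => { log.push(`auth ${name}`); },
    exec: (name, command) => {
      log.push(`exec ${name} ${command.join(" ")}`);
      return "ok";
    },
    destroy: (name) => { live.delete(name); log.push(`destroy ${name}`); },
  };
}

// Usage: two stories run inside one sandbox, which is destroyed afterwards.
const log: string[] = [];
const results = runStories(makeFakeBackend(log), () => "host-ok", "loop-1", [
  ["echo", "story-1"],
  ["echo", "story-2"],
]);
console.log(results, log);
```

Placing `destroy` in a `finally` block means cleanup happens even when a story fails partway through. A real backend would replace the fake with subprocess calls to whatever sandbox commands the skill documents.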

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 10 / 11 Passed

Validation for skill structure

Criteria	Description	Result
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning

	Total	10 / 11 Passed

Repository: joelhooks/joelclaw
Commit: 03f0a59

Reviewed: 5 days ago
