docker-sandbox

Create, manage, and execute agent tools (claude, codex) inside Docker sandboxes for isolated code execution. Use when running agent loops, spawning tool subprocesses, or any task requiring process isolation. Triggers on "sandbox", "isolated execution", "docker sandbox", "safe agent execution", or when working on agent loop infrastructure.

Quality

88%

Does it follow best practices?

Impact

100%

3.22x

Average score across 3 eval scenarios

Security

Risky

Do not use without reviewing

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, highly actionable skill with concrete commands and clear workflow sequencing for Docker sandbox management. Its main weakness is length—the auth setup details, TypeScript implementation stubs, and troubleshooting could be split into referenced files to improve progressive disclosure and reduce token cost. The content is well-structured but would benefit from being trimmed or modularized.

Suggestions

Move the 'Implementation in utils.ts' TypeScript section to a separate referenced file (e.g., IMPLEMENTATION.md) since it's implementation detail rather than operational guidance.

Move the detailed 'Auth Setup (One-Time)' section to a separate AUTH.md file and keep only a brief summary with a link in the main skill.

Dimension	Reasoning	Score
Conciseness	The skill is mostly efficient and avoids explaining basic concepts, but includes some sections that could be tightened—the TypeScript interface stubs in 'Implementation in utils.ts' are somewhat verbose for a skill file, and the timing tables, while useful, add bulk. The token refresh table and template table are concise and well-formatted.	2 / 3
Actionability	The skill provides fully executable, copy-paste-ready bash commands for every operation (create, exec, auth setup, network control, troubleshooting). The code examples are concrete with real flags, real environment variable names, and real command patterns rather than pseudocode.	3 / 3
Workflow Clarity	The agent loop integration section provides a clear multi-step workflow (create → inject auth → exec per story → destroy), the pre-warm pattern is well-sequenced with explicit lifecycle phases, and the fallback to host mode provides error recovery. The troubleshooting section serves as a validation/debugging checkpoint for common failure modes.	3 / 3
Progressive Disclosure	The skill references an external ADR and organizes content into logical sections with clear headers, but it's quite long (~200 lines) with the TypeScript implementation details and detailed auth setup that could be split into separate reference files. No bundle files are provided, so all content is inline in a single monolithic document.	2 / 3
	Total	10 / 12 Passed
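
The lifecycle credited under Workflow Clarity (create, inject auth, exec per story, destroy, with a fallback to host mode) could be sketched roughly as follows. This is a minimal TypeScript illustration, not the skill's actual utils.ts: the `SandboxDriver` interface, `trySetup`, `runStories`, and the `runOnHost` callback are hypothetical names introduced here, and the real skill may structure this differently.

```typescript
// Hypothetical driver interface; these names are assumptions, not the skill's API.
interface SandboxDriver {
  create(): Promise<string>; // resolves to a sandbox id
  injectAuth(id: string): Promise<void>;
  exec(id: string, story: string): Promise<number>; // resolves to an exit code
  destroy(id: string): Promise<void>;
}

interface StoryResult {
  story: string;
  exitCode: number;
  mode: "sandbox" | "host";
}

// Creates a sandbox and injects auth. Resolves to null if either step fails,
// destroying a half-created sandbox on a best-effort basis first.
async function trySetup(driver: SandboxDriver): Promise<string | null> {
  let id: string | null = null;
  try {
    id = await driver.create();
    await driver.injectAuth(id);
    return id;
  } catch {
    if (id !== null) {
      await driver.destroy(id).catch(() => undefined);
    }
    return null;
  }
}

// Runs every story in one sandbox and always destroys it afterwards.
// If setup fails, each story runs through the host-mode callback instead.
async function runStories(
  driver: SandboxDriver,
  stories: string[],
  runOnHost: (story: string) => Promise<number>,
): Promise<StoryResult[]> {
  const results: StoryResult[] = [];
  const id = await trySetup(driver);

  if (id === null) {
    for (const story of stories) {
      results.push({ story, exitCode: await runOnHost(story), mode: "host" });
    }
    return results;
  }

  try {
    for (const story of stories) {
      results.push({ story, exitCode: await driver.exec(id, story), mode: "sandbox" });
    }
    return results;
  } finally {
    // Runs even when an exec call rejects, so the sandbox is not leaked.
    await driver.destroy(id);
  }
}
```

Placing `destroy` in a `finally` block means a failing story cannot leave a sandbox running, and keeping setup in its own function confines the host-mode fallback to the one place where it applies.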

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that clearly articulates specific capabilities (creating and managing agent tools in Docker sandboxes), provides explicit trigger guidance with both a 'Use when' clause and a 'Triggers on' list, and occupies a distinct niche. The description is concise, uses third-person voice, and includes natural keywords that users would employ when needing this functionality.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: 'Create, manage, and execute agent tools (claude, codex) inside Docker sandboxes for isolated code execution.' This names specific tools, the container technology, and the purpose clearly.	3 / 3
Completeness	Clearly answers both 'what' (create, manage, execute agent tools in Docker sandboxes) and 'when' (explicit 'Use when' clause with multiple trigger scenarios, plus a 'Triggers on' clause listing specific keywords).	3 / 3
Trigger Term Quality	Includes strong natural trigger terms: 'sandbox', 'isolated execution', 'docker sandbox', 'safe agent execution', 'agent loop infrastructure', 'agent loops', 'tool subprocesses', 'process isolation'. These cover a good range of terms a user would naturally use.	3 / 3
Distinctiveness Conflict Risk	Highly distinctive niche combining Docker sandboxes, agent tools (claude, codex), and process isolation. Unlikely to conflict with general Docker skills or general agent skills due to the specific intersection of sandbox execution for agent loops.	3 / 3
	Total	12 / 12 Passed

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 10 / 11 Passed

Validation for skill structure

Criteria	Description	Result
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning

	Total	10 / 11 Passed

Repository: joelhooks/joelclaw
Commit: 2ca3686

Reviewed: 10 days ago
