docker-sandbox

Create, manage, and execute agent tools (claude, codex) inside Docker sandboxes for isolated code execution. Use when running agent loops, spawning tool subprocesses, or any task requiring process isolation. Triggers on "sandbox", "isolated execution", "docker sandbox", "safe agent execution", or when working on agent loop infrastructure.

92 (3.22x)

Quality: 88% (Does it follow best practices?)

Impact: 100%, 3.22x (average score across 3 eval scenarios)

Security by Snyk: Risky. Do not use without reviewing.

Quality

Content

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, highly actionable skill with concrete commands and clear workflow sequencing for Docker sandbox management. Its main weakness is length—the auth setup details, TypeScript implementation stubs, and troubleshooting could be split into referenced files to improve progressive disclosure and reduce token cost. The content is well-structured but would benefit from being trimmed or modularized.

Suggestions

Move the 'Implementation in utils.ts' TypeScript section to a separate referenced file (e.g., IMPLEMENTATION.md) since it's implementation detail rather than operational guidance.

Move the detailed 'Auth Setup (One-Time)' section to a separate AUTH.md file and keep only a brief summary with a link in the main skill.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is mostly efficient and avoids explaining basic concepts, but includes some sections that could be tightened: the TypeScript interface stubs in 'Implementation in utils.ts' are somewhat verbose for a skill file, and the timing tables, while useful, add bulk. The token refresh table and template table are concise and well-formatted. | 2 / 3 |
| Actionability | The skill provides fully executable, copy-paste-ready bash commands for every operation (create, exec, auth setup, network control, troubleshooting). The code examples are concrete with real flags, real environment variable names, and real command patterns rather than pseudocode. | 3 / 3 |
| Workflow Clarity | The agent loop integration section provides a clear multi-step workflow (create → inject auth → exec per story → destroy), the pre-warm pattern is well-sequenced with explicit lifecycle phases, and the fallback to host mode provides error recovery. The troubleshooting section serves as a validation/debugging checkpoint for common failure modes. | 3 / 3 |
| Progressive Disclosure | The skill references an external ADR and organizes content into logical sections with clear headers, but it's quite long (~200 lines) with the TypeScript implementation details and detailed auth setup that could be split into separate reference files. No bundle files are provided, so all content is inline in a single monolithic document. | 2 / 3 |
| Total | | 10 / 12, Passed |
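
The workflow praised under Workflow Clarity (create → inject auth → exec per story → destroy, with a fallback to host mode) can be sketched roughly as follows. This is a minimal illustration, not the skill's actual `utils.ts`: the `SandboxRunner` interface, `HostExec` type, and `runStories` function are hypothetical names, and triggering the host fallback on a failed `create` is an assumption about how the skill behaves.

```typescript
// Hypothetical interface: these names are illustrative and do not come from the skill.
interface SandboxRunner {
  create(): Promise<string>; // resolves to a sandbox id
  injectAuth(id: string): Promise<void>;
  exec(id: string, story: string): Promise<string>;
  destroy(id: string): Promise<void>;
}

// Hypothetical host-mode executor used when no sandbox can be created.
type HostExec = (story: string) => Promise<string>;

interface RunResult {
  mode: "sandbox" | "host";
  outputs: string[];
}

async function runStories(
  stories: string[],
  sandbox: SandboxRunner,
  hostExec: HostExec,
): Promise<RunResult> {
  // Assumption: a failed create is what triggers the fallback to host mode.
  const id = await sandbox.create().catch(() => null);
  if (id === null) {
    const outputs: string[] = [];
    for (const story of stories) outputs.push(await hostExec(story));
    return { mode: "host", outputs };
  }

  try {
    // Auth is injected once, then every story reuses the same sandbox.
    await sandbox.injectAuth(id);
    const outputs: string[] = [];
    for (const story of stories) outputs.push(await sandbox.exec(id, story));
    return { mode: "sandbox", outputs };
  } finally {
    // Destroy runs even if auth injection or a story fails.
    await sandbox.destroy(id);
  }
}
```

Passing the runner in as an interface keeps the lifecycle ordering testable without Docker; a real implementation would back each method with the commands documented in the skill.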

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that clearly articulates specific capabilities (creating and managing agent tools in Docker sandboxes), provides explicit trigger guidance with both a 'Use when' clause and a 'Triggers on' list, and occupies a distinct niche. The description is concise, uses third-person voice, and includes natural keywords that users would employ when needing this functionality.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: 'Create, manage, and execute agent tools (claude, codex) inside Docker sandboxes for isolated code execution.' This names specific tools, the container technology, and the purpose clearly. | 3 / 3 |
| Completeness | Clearly answers both 'what' (create, manage, execute agent tools in Docker sandboxes) and 'when' (explicit 'Use when' clause with multiple trigger scenarios, plus a 'Triggers on' clause listing specific keywords). | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms: 'sandbox', 'isolated execution', 'docker sandbox', 'safe agent execution', 'agent loop infrastructure', 'agent loops', 'tool subprocesses', 'process isolation'. These cover a good range of terms a user would naturally use. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive niche combining Docker sandboxes, agent tools (claude, codex), and process isolation. Unlikely to conflict with general Docker skills or general agent skills due to the specific intersection of sandbox execution for agent loops. | 3 / 3 |
| Total | | 12 / 12, Passed |

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11, Passed |
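
A check like `frontmatter_unknown_keys` can be approximated as below. This is a sketch under stated assumptions, not the validator's real implementation: the allow-list in `ALLOWED_KEYS` is a guess (the review does not say which keys the spec permits), and `unknownFrontmatterKeys` is a hypothetical helper that only inspects top-level `key:` lines rather than parsing YAML fully.

```typescript
// Assumed allow-list: the actual set of permitted keys is not stated in this review.
const ALLOWED_KEYS = new Set<string>(["name", "description", "metadata"]);

// Returns top-level frontmatter keys that are not in the allow-list.
function unknownFrontmatterKeys(
  doc: string,
  allowed: Set<string> = ALLOWED_KEYS,
): string[] {
  const lines = doc.split(/\r?\n/);
  // Frontmatter must open with `---` on the first line; otherwise nothing to check.
  if (lines[0]?.trim() !== "---") return [];

  const unknown: string[] = [];
  for (const line of lines.slice(1)) {
    if (line.trim() === "---") break; // closing delimiter ends the frontmatter
    // Only unindented `key:` lines count as top-level keys; nested keys are skipped.
    const key = /^([A-Za-z0-9_-]+)\s*:/.exec(line)?.[1];
    if (key !== undefined && !allowed.has(key)) unknown.push(key);
  }
  return unknown;
}
```

Moving a flagged key under `metadata:` (indented) would make it invisible to this check, which mirrors the warning's suggestion to remove unknown keys or move them to metadata.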

Repository: joelhooks/joelclaw (Reviewed)
