
docker-sandbox

Create, manage, and execute agent tools (claude, codex) inside Docker sandboxes for isolated code execution. Use when running agent loops, spawning tool subprocesses, or any task requiring process isolation. Triggers on "sandbox", "isolated execution", "docker sandbox", "safe agent execution", or when working on agent loop infrastructure.

92

3.22x
Quality: 88%

Does it follow best practices?

Impact: 100% (3.22x)

Average score across 3 eval scenarios

Security by Snyk

Risky

Do not use without reviewing


Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong skill description that clearly articulates what the skill does (create/manage/execute agent tools in Docker sandboxes), when to use it (agent loops, subprocess spawning, process isolation), and includes explicit trigger terms. It uses proper third-person voice and is concise without being vague. The description is well-structured and would allow Claude to confidently select this skill from a large pool.

Dimension | Reasoning | Score

Specificity

Lists multiple specific concrete actions: 'Create, manage, and execute agent tools (claude, codex) inside Docker sandboxes for isolated code execution.' This names specific tools, the container technology, and the purpose clearly.

3 / 3

Completeness

Clearly answers both 'what' (create, manage, execute agent tools in Docker sandboxes) and 'when' (explicit 'Use when' clause covering agent loops, spawning tool subprocesses, process isolation tasks, plus explicit trigger terms).

3 / 3

Trigger Term Quality

Includes strong natural trigger terms: 'sandbox', 'isolated execution', 'docker sandbox', 'safe agent execution', 'agent loop infrastructure', 'agent loops', 'tool subprocesses', 'process isolation'. These cover a good range of terms a user would naturally use.

3 / 3

Distinctiveness Conflict Risk

Highly distinctive niche combining Docker sandboxes, agent tools (claude, codex), and isolated execution. The specificity of 'Docker sandbox' + 'agent loop infrastructure' makes it very unlikely to conflict with general Docker or general agent skills.

3 / 3

Total

12 / 12

Passed

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, highly actionable skill with concrete commands and clear workflow sequencing for Docker sandbox management. Its main weakness is length: the implementation details (TypeScript interfaces), timing benchmarks, and detailed auth setup could be split into referenced files to improve progressive disclosure and conciseness. The troubleshooting section is well done, with specific diagnostic commands.

Suggestions

Move the TypeScript implementation section (utils.ts functions, spawnTool replacement) to a separate IMPLEMENTATION.md file and reference it from the main skill.

Move the detailed auth setup procedures to an AUTH_SETUP.md file, keeping only a brief summary and link in the main skill.

Dimension | Reasoning | Score

Conciseness

The skill is fairly comprehensive and mostly efficient, but some sections could be tightened: the TypeScript interface definitions read more like a design doc than actionable skill content, and the timing tables add useful context but aren't strictly necessary. It doesn't over-explain concepts Claude knows, but the overall length (~180 lines) could be trimmed.

2 / 3

Actionability

Excellent actionability throughout: concrete bash commands for every operation (create, exec, auth setup, network control, template saving), specific environment variable names, exact token formats, and copy-paste-ready code blocks. The troubleshooting section provides specific diagnostic commands.

3 / 3

Workflow Clarity

The agent loop integration section provides a clear multi-step workflow (create → inject auth → exec per story → destroy), the pre-warm pattern is well-sequenced, and the fallback to host mode provides error recovery. The auth setup has clear sequential steps with validation (checking auth status after injection).

3 / 3

Progressive Disclosure

The skill has good section organization with clear headers, but it's a long monolithic file. The TypeScript implementation details and the full auth setup procedures could be split into separate reference files. The ADR link is a good external reference, but inline content like sandbox templates, timing tables, and implementation details bloat the main file.

2 / 3

Total

10 / 12

Passed
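
The lifecycle noted under Workflow Clarity (create → inject auth → exec per story → destroy, with a fallback to host mode) can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: `SandboxDriver`, `HostRunner`, and `runStories` are hypothetical names, the real Docker commands from SKILL.md are not reproduced in this review and are therefore hidden behind an interface, and triggering the fallback only when sandbox creation fails is an assumption.

```typescript
// Hypothetical driver interface; the skill's real helpers are not shown in this review.
interface SandboxDriver {
  create(): Promise<string>; // resolves to a sandbox id
  injectAuth(id: string): Promise<void>;
  exec(id: string, story: string): Promise<string>;
  destroy(id: string): Promise<void>;
}

// Hypothetical host-mode runner used when no sandbox is available.
type HostRunner = (story: string) => Promise<string>;

async function runStories(
  driver: SandboxDriver,
  runOnHost: HostRunner,
  stories: string[],
): Promise<string[]> {
  // Assumption: a failed create() is the signal to fall back to host mode.
  const id = await driver.create().catch(() => null);
  if (id === null) {
    const hostResults: string[] = [];
    for (const story of stories) {
      hostResults.push(await runOnHost(story));
    }
    return hostResults;
  }

  try {
    // Auth is injected once, then the same sandbox is reused for every story.
    await driver.injectAuth(id);
    const results: string[] = [];
    for (const story of stories) {
      results.push(await driver.exec(id, story));
    }
    return results;
  } finally {
    // Destroy the sandbox even if auth injection or a story fails.
    await driver.destroy(id);
  }
}
```

Putting `destroy` in a `finally` block is one way to make sure a failed story does not leak a running sandbox; whether the skill does this is not stated in the review.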

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

Criteria | Description | Result

frontmatter_unknown_keys

Unknown frontmatter key(s) found; consider removing or moving to metadata

Warning

Total

10 / 11

Passed
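
To illustrate what a `frontmatter_unknown_keys` check might look like, here is a small TypeScript sketch. The allow-list, the function name `unknownFrontmatterKeys`, and the sample `version` key are all assumptions made for illustration; the review does not say which keys the validator accepts or which key triggered the warning.

```typescript
// Assumed allow-list for illustration only; the validator's real list is not given in this review.
const ALLOWED_KEYS: ReadonlySet<string> = new Set(["name", "description", "metadata"]);

// Returns the frontmatter keys that are not on the allow-list, in their original order.
function unknownFrontmatterKeys(
  frontmatter: Record<string, unknown>,
  allowed: ReadonlySet<string> = ALLOWED_KEYS,
): string[] {
  return Object.keys(frontmatter).filter((key) => !allowed.has(key));
}

// Hypothetical frontmatter: "version" is a made-up example of an unrecognized key.
const unknown = unknownFrontmatterKeys({
  name: "docker-sandbox",
  description: "Run agent tools inside Docker sandboxes",
  version: "1.0.0",
});

if (unknown.length > 0) {
  console.warn(`Unknown frontmatter key(s) found: ${unknown.join(", ")}`);
}
```

Under this sketch, moving such a key beneath `metadata` would clear the warning, which matches the remediation the check suggests.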

Repository
joelhooks/joelclaw
Reviewed
