Create, manage, and execute agent tools (claude, codex) inside Docker sandboxes for isolated code execution. Use when running agent loops, spawning tool subprocesses, or any task requiring process isolation. Triggers on "sandbox", "isolated execution", "docker sandbox", "safe agent execution", or when working on agent loop infrastructure.
Isolated execution of claude, codex, and other agent tools using Docker Desktop's docker sandbox (v0.11.0+). Uses existing Claude Max and ChatGPT Pro subscriptions — no API key billing.
ADR: ADR-0023
- docker sandbox version returns ≥0.11.0
- agent-secrets:
  - claude_setup_token: from claude setup-token (1-year token, Max subscription)
  - codex_auth_json: contents of ~/.codex/auth.json (ChatGPT Pro subscription)

# Create a sandbox
docker sandbox create --name my-sandbox claude /path/to/project
# Run a command in it
docker sandbox exec -e "CLAUDE_CODE_OAUTH_TOKEN=..." -w /path/to/project my-sandbox \
claude -p "implement the feature" --output-format text --dangerously-skip-permissions
# List sandboxes
docker sandbox ls
# Remove
docker sandbox rm my-sandbox

Run interactively on the host (needs browser for OAuth):
claude setup-token

This opens a browser, completes OAuth, and prints a token like sk-ant-oat01-.... Valid for 1 year.
Store it:
secrets add claude_setup_token --value "sk-ant-oat01-..."

Use in sandbox:
TOKEN=$(secrets lease claude_setup_token --ttl 1h --raw)
docker sandbox exec -e "CLAUDE_CODE_OAUTH_TOKEN=$TOKEN" my-sandbox claude auth status
# → loggedIn: true, authMethod: oauth_token

Authenticate codex locally (needs browser):
codex # Select "Sign in with ChatGPT", complete OAuth

The auth file at ~/.codex/auth.json is portable (not host-tied). Store it:
secrets add codex_auth_json --value "$(cat ~/.codex/auth.json)"

Inject into sandbox:
AUTH=$(secrets lease codex_auth_json --ttl 1h --raw)
docker sandbox exec my-sandbox bash -c "mkdir -p ~/.codex && cat > ~/.codex/auth.json << 'EOF'
${AUTH}
EOF"

| Token | Lifetime | Refresh |
|---|---|---|
| claude_setup_token | 1 year | Run claude setup-token again, update secret |
| codex_auth_json | Until subscription change | Re-run codex login if auth fails, update secret |
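
The two auth paths above differ in how the credential is delivered: the claude token is passed as an environment variable on each exec, while the codex auth file is written into the sandbox once. A minimal TypeScript sketch of building the matching docker sandbox exec argument lists might look like this. The helper names (sandboxExecArgs, claudeAuthEnv, codexAuthInjectCommand) are hypothetical and not part of any existing module, and the flag order simply mirrors the commands shown in this document.

```typescript
// Hypothetical helpers (assumptions, not an existing API).
// Nothing here runs Docker; the functions only build argument lists.
type ExecOptions = { env?: Record<string, string>; workDir?: string };

// Build argv for: docker sandbox exec [-e K=V]... [-w dir] <sandbox> <command...>
function sandboxExecArgs(
  sandbox: string,
  command: string[],
  opts: ExecOptions = {}
): string[] {
  const args = ["sandbox", "exec"];
  const env = opts.env ?? {};
  for (const key of Object.keys(env)) {
    args.push("-e", `${key}=${env[key]}`);
  }
  if (opts.workDir) args.push("-w", opts.workDir);
  return [...args, sandbox, ...command];
}

// claude: the leased token is passed as an env var on every exec.
function claudeAuthEnv(token: string): Record<string, string> {
  return { CLAUDE_CODE_OAUTH_TOKEN: token };
}

// codex: the leased auth.json is written to ~/.codex/auth.json in the sandbox.
// The quoted heredoc delimiter keeps the JSON literal inside the sandbox shell.
function codexAuthInjectCommand(authJson: string): string[] {
  const script =
    `mkdir -p ~/.codex && cat > ~/.codex/auth.json << 'EOF'\n${authJson}\nEOF`;
  return ["bash", "-c", script];
}

// Example: argv for checking claude auth status with an elided token.
const statusArgs = sandboxExecArgs("my-sandbox", ["claude", "auth", "status"], {
  env: claudeAuthEnv("sk-ant-oat01-..."),
});
console.log(["docker", ...statusArgs].join(" "));
```

Handing the argument array to a process spawner, rather than joining it into a shell string, avoids having to quote the token or the JSON on the host side.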
Create sandbox(es) at loop start, reuse for all stories, destroy at loop end.
PLANNER (loop start)
├── docker sandbox create --name loop-{loopId}-claude claude {workDir}
├── docker sandbox create --name loop-{loopId}-codex codex {workDir} # if needed
└── inject auth into both
IMPLEMENTOR / TEST-WRITER / REVIEWER (per story)
└── docker sandbox exec -w {workDir} -e CLAUDE_CODE_OAUTH_TOKEN=... loop-{loopId}-{tool} \
{tool command}
# ~90ms overhead, workspace changes visible on host immediately
COMPLETE / CANCEL (loop end)
├── docker sandbox rm loop-{loopId}-claude
└── docker sandbox rm loop-{loopId}-codex

| Operation | Time |
|---|---|
| Create (cached image) | ~14s |
| Exec (warm sandbox) | ~90ms |
| Stop | ~11s |
| Remove | ~150ms |
Net overhead per loop: ~14s create + ~90ms × N stories, which is negligible for loops running 5-10 stories at 5-15 min each.
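
To make "negligible" concrete, the arithmetic can be worked through in a few lines of TypeScript. This is only a sketch using the approximate timings from the table. The function names are hypothetical, and stop and remove times are left out, as they are in the formula above.

```typescript
// Approximate timings taken from the table above.
const CREATE_SECONDS = 14;
const EXEC_SECONDS = 0.09;

// Sandbox overhead for one loop: one create plus one exec per story.
function loopOverheadSeconds(stories: number): number {
  return CREATE_SECONDS + EXEC_SECONDS * stories;
}

// Overhead as a fraction of the total story run time.
function overheadFraction(stories: number, minutesPerStory: number): number {
  return loopOverheadSeconds(stories) / (stories * minutesPerStory * 60);
}

// Highest overhead share in the quoted range: 5 stories at 5 minutes each.
console.log(loopOverheadSeconds(5).toFixed(2));         // 14.45 (seconds)
console.log((overheadFraction(5, 5) * 100).toFixed(2)); // 0.96 (percent)
```

Even in that case the sandbox adds under 1% to the loop's run time, and the share shrinks as stories get longer or more numerous.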
The workspace is bidirectional: it is mounted at the same path on the host and in the sandbox, so changes made on one side are immediately visible on the other.
| Template | Tools Included |
|---|---|
| claude | claude 2.1.42, git, node 20, npm |
| codex | codex 0.101.0, git, node 20, npm |
Neither includes bun. If bun is needed, use host-mode fallback or install it post-create.
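
One way to act on the table above is to check a story's required tools against the template before deciding where to run it. The sketch below is an assumption about how such a check could look. TEMPLATE_TOOLS, missingTools, and chooseMode are hypothetical names, and only the host-mode fallback is modeled, not the option of installing the tool post-create.

```typescript
type Template = "claude" | "codex";
type Mode = "sandbox" | "host";

// Tool lists copied from the template table above (versions omitted).
const TEMPLATE_TOOLS: Record<Template, string[]> = {
  claude: ["claude", "git", "node", "npm"],
  codex: ["codex", "git", "node", "npm"],
};

// Tools the story needs that the template does not ship.
function missingTools(template: Template, required: string[]): string[] {
  return required.filter((tool) => TEMPLATE_TOOLS[template].indexOf(tool) === -1);
}

// Hypothetical decision helper: fall back to host mode when a tool is missing.
function chooseMode(template: Template, required: string[]): Mode {
  return missingTools(template, required).length === 0 ? "sandbox" : "host";
}

console.log(chooseMode("claude", ["git", "npm"])); // sandbox
console.log(chooseMode("codex", ["bun"]));         // host
```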
Pass via docker sandbox exec -e:
docker sandbox exec \
-e "CLAUDE_CODE_OAUTH_TOKEN=$TOKEN" \
-e "NODE_ENV=development" \
-w /path/to/project \
my-sandbox \
claude -p "prompt" --output-format text --dangerously-skip-permissions

Sandboxes have network access by default. Restrict with proxy rules:
# Allow only API endpoints
docker sandbox network proxy my-sandbox --policy deny
docker sandbox network proxy my-sandbox --allow-host api.anthropic.com
docker sandbox network proxy my-sandbox --allow-host api.openai.com

If Docker is unavailable:
# Check availability
docker info >/dev/null 2>&1 || echo "Docker not available"
# Force host mode
export AGENT_LOOP_HOST=1

If you install additional tools in a sandbox, save it as a template:
# Install tools
docker sandbox exec my-sandbox bash -c 'npm i -g @anthropic-ai/claude-code @openai/codex'
# Save as template
docker sandbox save my-sandbox my-agent-template:v1
# Use the template for future sandboxes
docker sandbox create --name fast-sandbox -t my-agent-template:v1 claude /path/to/project

// Create sandbox for a loop
async function createLoopSandbox(
  loopId: string,
  tool: "claude" | "codex",
  workDir: string
): Promise<string> // returns sandbox name

// Execute command in existing sandbox
async function execInSandbox(
  sandboxName: string,
  command: string[],
  opts: { env?: Record<string, string>; workDir?: string; timeout?: number }
): Promise<{ exitCode: number; output: string }>

// Destroy loop sandbox(es)
async function destroyLoopSandbox(loopId: string): Promise<void>

Current spawnTool() in implement.ts checks AGENT_LOOP_HOST and isDockerAvailable(). Update it to:
1. Check whether loop-{loopId}-{tool} exists (created by planner)
2. If it exists, call execInSandbox() with auth env vars
3. Otherwise, fall back to spawnToolHost() (current host-mode behavior)

Auth not injected. Check:
docker sandbox exec my-sandbox bash -c 'claude auth status'
docker sandbox exec my-sandbox bash -c 'cat ~/.codex/auth.json | head -3'

First pull downloads ~500MB image. Subsequent creates use cached image (~14s). Use docker sandbox save to create a pre-configured template.
Only the workspace path is mounted. Files outside the workspace directory are not shared.
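
A quick pre-flight check can catch prompts or commands that reference files the sandbox cannot see. The following is a minimal sketch, assuming absolute POSIX paths and no symlink resolution. normalizePosix and isInsideWorkspace are hypothetical helpers, not part of the loop code.

```typescript
// Normalize an absolute POSIX path into segments,
// collapsing ".", "..", and repeated slashes.
function normalizePosix(path: string): string[] {
  const parts: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  }
  return parts;
}

// True when filePath is the workspace itself or lies beneath it.
// Compares whole segments, so "/a/project" does not contain "/a/projectile".
function isInsideWorkspace(workDir: string, filePath: string): boolean {
  const root = normalizePosix(workDir);
  const target = normalizePosix(filePath);
  return (
    root.length <= target.length &&
    root.every((part, i) => part === target[i])
  );
}

console.log(isInsideWorkspace("/path/to/project", "/path/to/project/src/a.ts")); // true
console.log(isInsideWorkspace("/path/to/project", "/path/to/other/a.ts"));       // false
```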
Docker Desktop must be running. Check version: docker sandbox version. Requires Docker Desktop 4.40+ with sandbox extension.
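
The version prerequisite can also be checked programmatically. The exact output format of docker sandbox version is not shown in this document, so the sketch below assumes only that the output contains a dotted x.y.z version somewhere. parseVersion and meetsMinimum are hypothetical helpers.

```typescript
// Extract the first x.y.z version found in the text, or null if none.
// Assumption: the CLI output contains a dotted three-part version.
function parseVersion(text: string): number[] | null {
  const match = /(\d+)\.(\d+)\.(\d+)/.exec(text);
  return match
    ? [Number(match[1]), Number(match[2]), Number(match[3])]
    : null;
}

// Compare numerically, part by part, so that 0.9.5 is older than 0.11.0.
function meetsMinimum(text: string, minimum: string): boolean {
  const actual = parseVersion(text);
  const required = parseVersion(minimum);
  if (!actual || !required) return false;
  for (let i = 0; i < 3; i++) {
    if (actual[i] !== required[i]) return actual[i] > required[i];
  }
  return true; // equal versions satisfy the minimum
}

console.log(meetsMinimum("v0.11.0", "0.11.0")); // true
console.log(meetsMinimum("0.9.5", "0.11.0"));   // false
```

Comparing the parts as numbers rather than strings matters here, because a plain string comparison would rank 0.9.5 above 0.11.0.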