sandbox-bridge

Use when you need to exercise a real, running Sandbox deployment via HTTP — for example to validate SDK changes against a live container, reproduce a user-reported issue, or experiment with the API (including FUSE bucket mounts) without spinning up `wrangler dev`. Documents the Sandbox bridge worker reachable via `SANDBOX_WORKER_URL` + `SANDBOX_API_KEY` when the host injects them.

80

Quality: 75% (Does it follow best practices?)

Impact: Pending (No eval scenarios have been run)

Security by Snyk: Advisory (Suggest reviewing before use)

Optimize this skill with Tessl

npx tessl skill review --optimize ./.agents/skills/sandbox-bridge/SKILL.md

Quality

Discovery

85%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong, well-crafted description that clearly defines its niche around live Sandbox deployment testing via HTTP. It excels at completeness with an explicit 'Use when' clause and multiple concrete use cases. The main weakness is that trigger terms lean heavily technical, which is appropriate for the domain but could miss some natural user phrasings.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: validate SDK changes against a live container, reproduce user-reported issues, experiment with the API including FUSE bucket mounts. Also mentions specific technical details like the bridge worker, SANDBOX_WORKER_URL, and SANDBOX_API_KEY. | 3 / 3 |
| Completeness | Clearly answers both 'what' (exercise a running Sandbox deployment via HTTP, documents the bridge worker) and 'when' (explicit 'Use when' clause listing scenarios: validate SDK changes, reproduce issues, experiment with API without wrangler dev, when host injects SANDBOX_WORKER_URL + SANDBOX_API_KEY). | 3 / 3 |
| Trigger Term Quality | Includes some relevant keywords like 'Sandbox', 'HTTP', 'API', 'FUSE bucket mounts', 'wrangler dev', 'SANDBOX_WORKER_URL', 'SANDBOX_API_KEY'. However, these are fairly technical/specific and may miss common variations a user might naturally say (e.g., 'test sandbox', 'call sandbox API', 'live sandbox testing'). | 2 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive with very specific triggers around Sandbox deployments, bridge workers, SANDBOX_WORKER_URL, SANDBOX_API_KEY, and FUSE bucket mounts. Unlikely to conflict with other skills due to the narrow, well-defined niche. | 3 / 3 |
| Total | Passed | 11 / 12 |

Implementation

64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, highly actionable skill with excellent executable examples covering the full sandbox lifecycle. Its main weaknesses are moderate verbosity (some sections could be tightened) and the lack of explicit validation/error-recovery steps in the multi-step workflow. The content would benefit from splitting detailed reference material (sessions, SSE events, error codes) into separate files.

Suggestions

Add explicit validation checkpoints to the workflow — e.g., check the HTTP status code after sandbox creation before proceeding to exec, and verify exit_code from SSE stream before continuing.

Consider extracting the Sessions section, SSE event details, and Error Codes into separate reference files to reduce the main skill's token footprint and improve progressive disclosure.
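The first suggestion (validation checkpoints) could look roughly like the sketch below. It is illustrative only: the helper names `require_2xx` and `sse_exit_code` are hypothetical, the `data: {"exit_code": N}` payload shape is an assumption about the exec stream, and the endpoint path and auth header in the commented usage are placeholders that should be checked against the skill and `/v1/openapi.json`.

```shell
# Hypothetical helpers for validation checkpoints; not part of the skill itself.

# Succeed only when the given HTTP status code is 2xx.
require_2xx() {
  status="$1"
  step="$2"
  case "$status" in
    2[0-9][0-9]) return 0 ;;
    *) echo "step '$step' failed with HTTP $status" >&2; return 1 ;;
  esac
}

# Read an SSE stream on stdin and print the first exit_code found on a data: line.
# ASSUMPTION: the exit event carries JSON such as  data: {"exit_code":0}
sse_exit_code() {
  sed -n 's/^data:.*"exit_code"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage sketch (placeholders in <...>; header format is an assumption):
#   status=$(curl -sS -o create.json -w '%{http_code}' -X POST \
#     -H "Authorization: Bearer $SANDBOX_API_KEY" "$SANDBOX_WORKER_URL/<create-path>")
#   require_2xx "$status" create || exit 1
#   code=$(curl -sS -N ... "$SANDBOX_WORKER_URL/<exec-path>" | sse_exit_code)
#   [ "$code" = "0" ] || { echo "command exited with ${code:-unknown}" >&2; exit 1; }
```

Keeping the checks in small functions means each workflow step can stop early with a clear message instead of running exec against a sandbox that was never created.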

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is mostly efficient with good use of tables and code examples, but includes some unnecessary explanation (e.g., the 'When to Use This vs wrangler dev' section has some verbose justification, and the session explanation could be tighter). The intro paragraph explaining what a bridge is adds modest overhead. | 2 / 3 |
| Actionability | Every operation has fully executable, copy-paste-ready curl commands with proper headers, variable substitution, and jq parsing. The SSE event table, error codes, and the awk helper for decoding base64 stdout are all concrete and immediately usable. | 3 / 3 |
| Workflow Clarity | The create → exec → read/write → destroy flow is clearly sequenced and numbered. However, there are no explicit validation checkpoints or error-recovery feedback loops: the destroy step says 'always clean up' but there's no guidance on verifying sandbox state or handling failures mid-workflow, which matters for a destructive DELETE operation. | 2 / 3 |
| Progressive Disclosure | The content is well-structured with clear sections and tables, and appropriately defers to `/v1/openapi.json` and source paths for deeper details. However, the document is quite long (~200+ lines of inline content) and some sections like Sessions and the full SSE event documentation could be split into separate reference files. The reference to the 'examples skill' for wrangler dev is a good cross-reference but no bundle files exist to support it. | 2 / 3 |
| Total | Passed | 9 / 12 |
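The Workflow Clarity finding about handling failures mid-workflow could be addressed with a wrapper that always runs the destroy step. This is a minimal sketch: `run_with_cleanup` is a hypothetical helper, and the DELETE path in the commented usage is a placeholder, not the bridge's documented route.

```shell
# Hypothetical wrapper: run a workflow command, then always run cleanup,
# and return the workflow's own exit status.
run_with_cleanup() {
  cleanup="$1"
  shift
  rc=0
  "$@" || rc=$?
  # Cleanup runs whether or not the workflow succeeded.
  "$cleanup" || echo "cleanup failed; the sandbox may still exist" >&2
  return "$rc"
}

# Usage sketch (placeholder path and assumed header format):
#   destroy_sandbox() {
#     curl -sS -X DELETE -H "Authorization: Bearer $SANDBOX_API_KEY" \
#       "$SANDBOX_WORKER_URL/<destroy-path>"
#   }
#   run_with_cleanup destroy_sandbox my_workflow
```

Returning the workflow's status rather than the cleanup's keeps the original failure visible to the caller, while the warning on stderr flags a sandbox that may need manual deletion.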

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: cloudflare/sandbox-sdk (Reviewed)
