
agent-sandbox

Agent skill for sandbox - invoke with $agent-sandbox

Install with Tessl CLI

npx tessl i github:ruvnet/claude-flow --skill agent-sandbox

55

4.65x

Does it follow best practices?

Evaluation: 93%

4.65x

Agent success when using this skill

Validation for skill structure


Evaluation results

90%

68%

Automated Data Analysis Pipeline

Python sandbox lifecycle management

| Criteria | Without context | With context |
| --- | --- | --- |
| Python template | 0% | 100% |
| Timeout set | 0% | 100% |
| Correct create function | 0% | 100% |
| File upload used | 0% | 100% |
| Correct execute function | 0% | 100% |
| capture_output enabled | 0% | 100% |
| Env vars for secrets | 50% | 100% |
| Sandbox cleanup | 0% | 100% |
| Status check | 0% | 0% |
| Error handling | 100% | 77% |
| Deployment order | 100% | 100% |

Without context: $0.4100 · 1m 54s · 21 turns · 27 in / 6,400 out tokens

With context: $0.3891 · 1m 28s · 22 turns · 27 in / 4,913 out tokens
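
Several criteria in this scenario (timeout set, file upload, capture_output, env vars for secrets, sandbox cleanup) describe one lifecycle: create, upload, execute, delete. The sketch below illustrates that shape only. `FakeSandboxClient` and its `create`, `upload`, `execute`, and `delete` methods are hypothetical in-memory stand-ins invented for this example; they are not the skill's actual API, and the template name, timeout value, and file path are assumptions as well.

```python
from contextlib import contextmanager


class FakeSandboxClient:
    """Hypothetical in-memory stand-in for a sandbox service (not a real API)."""

    def __init__(self):
        self.active = {}
        self._next_id = 0

    def create(self, template, timeout, env_vars=None):
        self._next_id += 1
        sandbox_id = f"sbx-{self._next_id}"
        self.active[sandbox_id] = {
            "template": template,
            "timeout": timeout,
            "env_vars": dict(env_vars or {}),
            "files": {},
        }
        return sandbox_id

    def upload(self, sandbox_id, path, content):
        self.active[sandbox_id]["files"][path] = content

    def execute(self, sandbox_id, code, capture_output=True):
        if sandbox_id not in self.active:
            raise KeyError(sandbox_id)
        if "raise" in code:
            # Simulate a failing run so the cleanup path can be exercised.
            raise RuntimeError("simulated execution failure")
        stdout = f"ran {len(code)} chars" if capture_output else None
        return {"exit_code": 0, "stdout": stdout}

    def delete(self, sandbox_id):
        self.active.pop(sandbox_id, None)


@contextmanager
def sandbox(client, template, timeout, env_vars=None):
    """Create a sandbox and delete it on exit, even if the body raises."""
    sandbox_id = client.create(template, timeout, env_vars)
    try:
        yield sandbox_id
    finally:
        client.delete(sandbox_id)


def run_analysis(client, csv_text, api_key):
    # The secret travels as an environment variable, not inside the code string.
    with sandbox(client, "python", 300, {"API_KEY": api_key}) as sandbox_id:
        client.upload(sandbox_id, "/data/input.csv", csv_text)
        return client.execute(sandbox_id, "print('analysing')", capture_output=True)
```

Wrapping creation in a context manager is one way to make cleanup unconditional; a plain `try`/`finally` around the same calls would serve equally well.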

100%

81%

Node.js Microservice Test Harness

Node.js sandbox with package installation and status monitoring

| Criteria | Without context | With context |
| --- | --- | --- |
| Node template | 0% | 100% |
| install_packages used | 0% | 100% |
| Correct create function | 0% | 100% |
| Correct execute function | 0% | 100% |
| capture_output enabled | 0% | 100% |
| Status monitoring | 0% | 100% |
| Timeout specified | 0% | 100% |
| Sandbox cleanup always runs | 0% | 100% |
| Error handling present | 100% | 100% |
| Execution logging | 100% | 100% |
| Language parameter | 0% | 100% |

Without context: $0.3561 · 1m 31s · 17 turns · 24 in / 5,668 out tokens

With context: $0.4118 · 1m 37s · 21 turns · 269 in / 5,914 out tokens
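
The per-run summary lines (cost, wall time, turns, tokens) are easier to compare once parsed into numbers. The following is a minimal sketch that assumes every summary line follows exactly the format shown above; `RunStats` and `parse_run` are names made up for this example.

```python
import re
from dataclasses import dataclass

# Matches e.g. "$0.3561 · 1m 31s · 17 turns · 24 in / 5,668 out tokens".
RUN_PATTERN = re.compile(
    r"\$(?P<cost>[\d.]+)\s*·\s*(?P<minutes>\d+)m\s+(?P<seconds>\d+)s\s*·\s*"
    r"(?P<turns>\d+) turns\s*·\s*"
    r"(?P<tokens_in>[\d,]+) in / (?P<tokens_out>[\d,]+) out tokens"
)


@dataclass
class RunStats:
    cost_usd: float
    seconds: int
    turns: int
    tokens_in: int
    tokens_out: int


def parse_run(line):
    """Parse one 'Without context: ...' or 'With context: ...' summary line."""
    match = RUN_PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognised run summary: {line!r}")
    return RunStats(
        cost_usd=float(match["cost"]),
        seconds=int(match["minutes"]) * 60 + int(match["seconds"]),
        turns=int(match["turns"]),
        tokens_in=int(match["tokens_in"].replace(",", "")),
        tokens_out=int(match["tokens_out"].replace(",", "")),
    )


without = parse_run("Without context: $0.3561 · 1m 31s · 17 turns · 24 in / 5,668 out tokens")
with_ctx = parse_run("With context: $0.4118 · 1m 37s · 21 turns · 269 in / 5,914 out tokens")

# Differences are "with context" minus "without context".
cost_delta = with_ctx.cost_usd - without.cost_usd
time_delta = with_ctx.seconds - without.seconds
```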

91%

72%

Cross-Platform Application Smoke Test

Multi-environment sandbox orchestration and cleanup

| Criteria | Without context | With context |
| --- | --- | --- |
| React template used | 0% | 100% |
| Python template used | 0% | 100% |
| Multiple sandbox_create calls | 0% | 100% |
| Correct create function | 0% | 100% |
| Status checked per environment | 0% | 0% |
| Execute in each environment | 0% | 100% |
| capture_output on executions | 0% | 100% |
| All sandboxes cleaned up | 0% | 100% |
| Cleanup despite failures | 70% | 100% |
| Timeout on each sandbox | 0% | 100% |
| Execution logging | 100% | 100% |
| Parallel or sequential orchestration | 100% | 100% |

Without context: $0.5080 · 2m 26s · 20 turns · 25 in / 10,147 out tokens

With context: $0.4957 · 2m 7s · 22 turns · 377 in / 7,704 out tokens
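
The "All sandboxes cleaned up" and "Cleanup despite failures" criteria amount to one pattern: track every sandbox that was created, and delete each one in a `finally` block so that a failure in one environment cannot leak the others. The sketch below shows a sequential version. The `FakeSandboxClient` class and its methods are hypothetical stand-ins for illustration, not the real tool interface, and the timeout default is an arbitrary assumption.

```python
class FakeSandboxClient:
    """Hypothetical in-memory sandbox service; execution fails for chosen templates."""

    def __init__(self, failing_templates=()):
        self.failing_templates = set(failing_templates)
        self.active = {}
        self._counter = 0

    def create(self, template, timeout):
        self._counter += 1
        sandbox_id = f"sbx-{self._counter}"
        self.active[sandbox_id] = template
        return sandbox_id

    def execute(self, sandbox_id, code, capture_output=True):
        template = self.active[sandbox_id]
        if template in self.failing_templates:
            raise RuntimeError(f"smoke test failed in {template}")
        return {"exit_code": 0, "stdout": "ok" if capture_output else None}

    def delete(self, sandbox_id):
        del self.active[sandbox_id]


def run_smoke_tests(client, templates, timeout=120):
    """Run one smoke test per template and always delete every created sandbox."""
    created = {}
    results = {}
    cleanup_errors = []
    try:
        for template in templates:
            created[template] = client.create(template, timeout)
        for template, sandbox_id in created.items():
            try:
                results[template] = client.execute(
                    sandbox_id, "run smoke test", capture_output=True
                )
            except RuntimeError as exc:
                # Record the failure and keep testing the other environments.
                results[template] = {"exit_code": 1, "error": str(exc)}
    finally:
        for template, sandbox_id in created.items():
            try:
                client.delete(sandbox_id)
            except Exception as exc:
                # One failed deletion must not block the remaining deletions.
                cleanup_errors.append((template, str(exc)))
    return results, cleanup_errors
```

Collecting cleanup errors instead of raising on the first one is a design choice: it lets the caller see every sandbox that may still need manual removal.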

Evaluated
Agent: Claude Code
Model: Unknown
