agent-sandbox

Agent skill for sandbox - invoke with $agent-sandbox

Quality: 47% (Does it follow best practices?)

Impact: 93%, 4.65x (Average score across 3 eval scenarios)

Security: High

Do not use without reviewing

Fix and improve this skill with Tessl

tessl review fix ./.agents/skills/agent-sandbox/SKILL.md

Evaluation results

Automated Data Analysis Pipeline (90%, 68%)

Python sandbox lifecycle management

Criteria (Baseline, With context):

Python template: 100%
Timeout set: 100%
Correct create function: 100%
File upload used: 100%
Correct execute function: 100%
capture_output enabled: 100%
Env vars for secrets: 50%, 100%
Sandbox cleanup: 100%
Status check: (no score shown)
Error handling: 100%, 77%
Deployment order: 100%

Cross-Platform Application Smoke Test (91%, 72%)

Multi-environment sandbox orchestration and cleanup

Criteria (Baseline, With context):

React template used: 100%
Python template used: 100%
Multiple sandbox_create calls: 100%
Correct create function: 100%
Status checked per environment: (no score shown)
Execute in each environment: 100%
capture_output on executions: 100%
All sandboxes cleaned up: 100%
Cleanup despite failures: 70%, 100%
Timeout on each sandbox: 100%
Execution logging: 100%
Parallel or sequential orchestration: 100%
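
The "Cleanup despite failures" and "All sandboxes cleaned up" criteria above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `FakeSandboxClient` with `create`, `execute`, and `delete` methods; these names are invented for illustration and are not the skill's actual API. The point is only the `try`/`finally` shape, which deletes each sandbox whether or not its job raises.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSandboxClient:
    """Hypothetical in-memory stand-in for a sandbox service (not a real API)."""
    active: set = field(default_factory=set)
    _next_id: int = 0

    def create(self, template: str, timeout: int) -> str:
        # Register a new sandbox and return its identifier.
        self._next_id += 1
        sandbox_id = f"{template}-{self._next_id}"
        self.active.add(sandbox_id)
        return sandbox_id

    def execute(self, sandbox_id: str, code: str) -> str:
        # Simulate a failing job when the code contains the word "fail".
        if "fail" in code:
            raise RuntimeError(f"execution failed in {sandbox_id}")
        return f"ok:{sandbox_id}"

    def delete(self, sandbox_id: str) -> None:
        self.active.discard(sandbox_id)


def run_smoke_tests(client, jobs, timeout=60):
    """Run one job per template; delete each sandbox even if its job raises."""
    results = {}
    for template, code in jobs.items():
        sandbox_id = client.create(template, timeout=timeout)
        try:
            results[template] = client.execute(sandbox_id, code)
        except RuntimeError as exc:
            # Record the failure instead of aborting the remaining environments.
            results[template] = f"error: {exc}"
        finally:
            # Cleanup runs on both the success and the failure path.
            client.delete(sandbox_id)
    return results


client = FakeSandboxClient()
results = run_smoke_tests(client, {"react": "npm test", "python": "fail on purpose"})
```

After the run, `client.active` is empty even though the second job failed, which is the behavior the cleanup criteria look for.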

Node.js Microservice Test Harness (81%)

Node.js sandbox with package installation and status monitoring

Criteria (Baseline, With context):

Node template: 100%
install_packages used: 100%
Correct create function: 100%
Correct execute function: 100%
capture_output enabled: 100%
Status monitoring: 100%
Timeout specified: 100%
Sandbox cleanup always runs: 100%
Error handling present: 100%
Execution logging: 100%
Language parameter: 100%

Repository: ruvnet/ruflo
Path: .agents/skills/agent-sandbox/SKILL.md
Commit: 26c35b5

Evaluated: 5 months ago
Agent: Claude Code
Model: Claude Sonnet 4.6

