
hosted-agents

This skill should be used when the user asks to "build background agent", "create hosted coding agent", "set up sandboxed execution", "implement multiplayer agent", or mentions background agents, sandboxed VMs, agent infrastructure, Modal sandboxes, self-spawning agents, or remote coding environments.


Quality: 28% (Does it follow best practices?)

Impact: Pending (No eval scenarios have been run)

Security by Snyk: Advisory (Suggest reviewing before use)


Quality

Discovery: 37%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This description is essentially a trigger-term list with no explanation of what the skill actually does. While the trigger terms are strong and specific, the complete absence of capability descriptions makes it impossible for Claude to understand the skill's purpose or differentiate it from other agent-related skills. The description needs a clear 'what it does' section listing concrete actions and outputs.

Suggestions

Add a concrete capability statement before the trigger clause, e.g., 'Scaffolds background coding agents with Modal sandbox infrastructure, configures VM environments, sets up agent spawning logic, and implements multiplayer agent coordination.'

Restructure to follow the pattern: '[What it does]. Use when [triggers].' — currently the entire description is only the 'Use when' clause with no 'what' component.

Include specific outputs or artifacts the skill produces (e.g., 'generates Dockerfiles, Modal deployment configs, agent orchestration code') to help Claude distinguish this from general infrastructure or agent skills.
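Applied together, these suggestions might yield frontmatter along the following lines (the wording is illustrative, not taken from the skill):

```yaml
---
name: hosted-agents
description: >
  Scaffolds background coding agents on Modal sandbox infrastructure:
  generates Dockerfiles and Modal deployment configs, configures VM
  environments, and implements agent spawning and multiplayer
  coordination. Use when the user asks to "build background agent",
  "create hosted coding agent", "set up sandboxed execution", or
  mentions Modal sandboxes, self-spawning agents, or remote coding
  environments.
---
```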

Dimension / Reasoning / Score

Specificity

The description contains no concrete actions or capabilities — it only lists trigger phrases. There is no explanation of what the skill actually does (e.g., 'creates infrastructure for...', 'configures Modal sandboxes...', 'deploys agents to...'). It is entirely vague on the 'what'.

1 / 3

Completeness

The description answers 'when' extensively but completely fails to answer 'what does this do'. There is no explanation of the skill's capabilities, outputs, or concrete actions. The rubric states missing 'what' OR 'when' should score 1.

1 / 3

Trigger Term Quality

The description includes a rich set of natural trigger terms users would say: 'build background agent', 'create hosted coding agent', 'set up sandboxed execution', 'Modal sandboxes', 'self-spawning agents', 'remote coding environments'. These are specific and varied enough to cover common user phrasings.

3 / 3

Distinctiveness / Conflict Risk

The trigger terms like 'Modal sandboxes', 'background agents', and 'sandboxed VMs' are fairly niche, but without describing what the skill actually does, it could overlap with other agent-related or infrastructure skills. The specificity of some terms (e.g., 'Modal sandboxes') helps, but the lack of capability description introduces ambiguity.

2 / 3

Total: 7 / 12 (Passed)

Implementation: 20%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill reads more like an architectural whitepaper or design document than an actionable skill file. It provides extensive strategic reasoning and conceptual guidance but lacks the concrete, executable examples that would make it useful for Claude to actually build hosted agent infrastructure. The extreme verbosity and absence of code examples are its most significant weaknesses.

Suggestions

Add concrete, executable code examples for key operations: a Modal sandbox definition, a Dockerfile for image building, git configuration commands, WebSocket streaming setup, and session state management with SQLite.
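As one illustration of the kind of concrete example the review is asking for, a minimal session-state store backed by SQLite might look like this (the schema and function names are assumptions for illustration, not taken from the skill):

```python
import json
import sqlite3

# Hypothetical schema: one row per agent session, with a JSON state blob
# holding status, working branch, last event, etc.
def init_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS sessions (
               id TEXT PRIMARY KEY,
               sandbox_id TEXT,
               state TEXT NOT NULL
           )"""
    )
    return conn

def save_session(conn, session_id, sandbox_id, state):
    # Upsert the session row; state is serialized to JSON.
    conn.execute(
        "INSERT OR REPLACE INTO sessions VALUES (?, ?, ?)",
        (session_id, sandbox_id, json.dumps(state)),
    )
    conn.commit()

def load_session(conn, session_id):
    row = conn.execute(
        "SELECT sandbox_id, state FROM sessions WHERE id = ?",
        (session_id,),
    ).fetchone()
    if row is None:
        return None
    return {"sandbox_id": row[0], **json.loads(row[1])}
```

Usage: `save_session(conn, "s1", "sb-42", {"status": "running"})` followed by `load_session(conn, "s1")` round-trips the session record. Even a small snippet like this gives Claude something copy-paste ready to adapt, which the skill currently lacks.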

Cut the 'because...' justifications throughout: Claude doesn't need rationale for architectural decisions, just the decisions themselves. This alone could reduce the content by 30-40%.

Move detailed subsections (Client Implementations, Multiplayer Support, Authentication) into separate reference files and keep SKILL.md as a concise overview with links.

Add a concrete end-to-end workflow with validation steps: e.g., 'Build image → Verify image health → Start sandbox → Validate sandbox ready → Run agent → Extract results → Verify PR created'.
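That suggested workflow could be sketched as an orchestration skeleton in which every step is followed by an explicit validation checkpoint (all step functions here are hypothetical placeholders, not part of the skill):

```python
# Hypothetical pipeline runner: each step produces a result, and its
# validator must pass before the next step is allowed to run.
def run_pipeline(steps):
    results = {}
    for name, step, validate in steps:
        results[name] = step(results)
        if not validate(results[name]):
            raise RuntimeError(f"validation failed after step: {name}")
    return results

# Placeholder steps standing in for: build image -> start sandbox ->
# run agent -> extract results -> verify PR created.
steps = [
    ("build_image",   lambda r: {"image": "agent-env:latest"},
                      lambda out: bool(out.get("image"))),
    ("start_sandbox", lambda r: {"sandbox_id": "sb-1", "ready": True},
                      lambda out: out.get("ready") is True),
    ("run_agent",     lambda r: {"exit_code": 0, "branch": "agent/fix"},
                      lambda out: out.get("exit_code") == 0),
    ("open_pr",       lambda r: {"pr_url": "https://example.test/pr/1"},
                      lambda out: out.get("pr_url", "").startswith("https://")),
]
```

The point of the sketch is the shape, not the stubs: pairing every step with a validator is what turns the skill's conceptual lifecycle descriptions into a workflow an agent can actually follow and self-check.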

Dimension / Reasoning / Score

Conciseness

The skill is extremely verbose at over 300 lines, with extensive explanatory rationale ('because...') for nearly every point. Much of this explains architectural reasoning Claude already understands. Sections like 'Why Multiplayer Matters' and 'Adoption Strategy' are strategic advice, not actionable skill content. The repeated 'because' justifications add significant token bloat.

1 / 3

Actionability

Despite the length, there is almost no executable code, no concrete commands, no specific API calls, and no copy-paste ready examples. Everything is described at a conceptual/architectural level (e.g., 'Pre-build environment images', 'Take filesystem snapshots') without showing how to actually implement any of it. No Dockerfiles, no Modal sandbox code, no actual git commands, no API endpoint definitions.

1 / 3

Workflow Clarity

The Sandbox-to-API Flow section provides a clear 4-step sequence, and the Guidelines section lists ordered priorities. However, most multi-step processes (image building, sandbox lifecycle, session teardown) lack explicit validation checkpoints or feedback loops. The Gotchas section partially compensates by identifying failure modes but doesn't integrate them into workflows.

2 / 3

Progressive Disclosure

The References section provides well-signaled links to external resources and related skills with 'Read when' guidance, which is good. However, the main body is a monolithic wall of text that could benefit from splitting detailed topics (sandbox infrastructure, client implementations, multiplayer) into separate reference files. The inline content is far too long for a SKILL.md overview.

2 / 3

Total: 6 / 12 (Passed)

Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: muratcankoylan/Agent-Skills-for-Context-Engineering (Reviewed)
