session-execution

Use when working on or reviewing session execution, command handling, shell state, FIFO-based streaming, or stdout/stderr separation. Relevant for session.ts, command handlers, exec/execStream, or anything involving shell process management. (project)

Quality

72%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Optimize this skill with Tessl

npx tessl skill review --optimize ./.agents/skills/session-execution/SKILL.md

Quality

Discovery

72%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description excels at specifying when to use the skill and includes highly distinctive, domain-specific trigger terms that minimize conflict risk. However, it lacks a clear statement of what the skill actually does—it describes the topic area but not the concrete actions or guidance it provides. Adding explicit capability statements (e.g., 'Guides implementation of...', 'Reviews and debugs...') would significantly improve it.

Suggestions

Add concrete action verbs describing what the skill does, e.g., 'Guides implementation and debugging of session execution, command handling, and shell state management.'

Clarify the 'what' portion by listing specific capabilities such as 'Reviews FIFO-based streaming logic, debugs stdout/stderr separation issues, and validates command handler implementations.'

Dimension	Reasoning	Score
Specificity	The description names the domain (session execution, command handling, shell state) and mentions some specific concepts (FIFO-based streaming, stdout/stderr separation, exec/execStream), but doesn't list concrete actions the skill performs—it focuses on topics rather than what it does with them.	2 / 3
Completeness	The 'when' clause is explicit and well-defined ('Use when working on or reviewing session execution...'), but the 'what does this do' part is weak—it describes the domain/topics but never states what actions or guidance the skill actually provides.	2 / 3
Trigger Term Quality	Includes strong natural trigger terms that a developer would use: 'session.ts', 'command handlers', 'exec/execStream', 'FIFO-based streaming', 'stdout/stderr separation', 'shell process management'. These are specific technical terms a user working in this domain would naturally mention.	3 / 3
Distinctiveness / Conflict Risk	The description targets a very specific niche (FIFO-based streaming, stdout/stderr separation, session.ts, shell process management), making it highly unlikely to conflict with other skills. The combination of these terms creates a clear, distinct identity.	3 / 3
	Total	10 / 12 Passed

Implementation

72%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured skill that efficiently communicates a complex system's architecture and review guidelines. Its greatest strength is the concise, expert-level treatment of race condition analysis with clear distinction between false positives and real concerns. The main weakness is the lack of concrete executable examples for testing scenarios mentioned in the 'When Developing' section.

Suggestions

Add concrete test commands or code snippets for the 'When Developing' section, e.g., specific commands to test silent commands (cd, variable assignment) and large output scenarios.

Include a brief validation workflow for development changes, e.g., 'After modifying session.ts: 1. Run unit tests with X, 2. Test silent command Y, 3. Verify FIFO cleanup with Z.'
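
As a rough illustration of the two suggestions above, the checks might look something like the following plain-shell sketch. It deliberately does not use the SDK's session API, which this review does not show. `run_separated` is a hypothetical helper, and the specific commands, the 200000-line output size, and the temporary paths are assumptions chosen only for illustration.

```shell
# Hypothetical helper (not part of the SDK): run a command string in a fresh
# shell, writing stdout to file $2 and stderr to file $3. Returns the
# command's exit status.
run_separated() {
  sh -c "$1" >"$2" 2>"$3"
}

tmp=$(mktemp -d)

# 1. Silent commands (cd, variable assignment) should finish with status 0
#    and produce no bytes on either stream.
run_separated 'cd / && FOO=bar' "$tmp/out" "$tmp/err"
silent_status=$?
silent_bytes=$(cat "$tmp/out" "$tmp/err" | wc -c | tr -d ' ')

# 2. Shell state set by silent commands should be visible to later commands
#    in the same shell.
state=$(sh -c 'cd / && FOO=bar; printf "%s:%s" "$PWD" "$FOO"')

# 3. Large output should arrive complete (assumed size: 200000 lines).
run_separated 'seq 1 200000' "$tmp/out" "$tmp/err"
large_lines=$(wc -l <"$tmp/out" | tr -d ' ')

# 4. stdout and stderr should stay separated, and a non-zero exit code
#    should be reported unchanged.
run_separated 'echo to-stdout; echo to-stderr >&2; exit 3' "$tmp/out" "$tmp/err"
mixed_status=$?
mixed_out=$(cat "$tmp/out")
mixed_err=$(cat "$tmp/err")

# 5. A FIFO used for streaming should be removed once the command is done.
fifo="$tmp/stream.fifo"
mkfifo "$fifo"
( echo hello >"$fifo" ) &      # writer blocks until the reader opens the FIFO
read -r fifo_line <"$fifo"
wait
rm -f "$fifo"
if [ -e "$fifo" ]; then fifo_gone=no; else fifo_gone=yes; fi

printf 'silent: status=%s bytes=%s state=%s\n' "$silent_status" "$silent_bytes" "$state"
printf 'large: lines=%s\n' "$large_lines"
printf 'mixed: status=%s out=%s err=%s\n' "$mixed_status" "$mixed_out" "$mixed_err"
printf 'fifo: line=%s removed=%s\n' "$fifo_line" "$fifo_gone"

rm -rf "$tmp"
```

A validation workflow in the skill could then list equivalent checks as numbered steps to run after modifying session.ts, with each step run through the project's real session API rather than a bare `sh -c`.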

Dimension	Reasoning	Score
Conciseness	The content is lean and efficient. It assumes Claude understands shell concepts, FIFOs, mutexes, and file operations without explaining them. Every section delivers domain-specific knowledge that Claude wouldn't already know about this particular system's architecture.	3 / 3
Actionability	The skill provides concrete guidance on what to check and what to watch for (e.g., atomic exit code handling, FIFO cleanup, labelers.done), and helpfully distinguishes false positives from real concerns. However, it lacks executable code examples or specific commands for testing the scenarios mentioned (e.g., testing silent commands or large output).	2 / 3
Workflow Clarity	The 'When Reviewing' section provides a clear 3-step process for race condition analysis, and the correctness checks are well-structured. However, the 'When Developing' section is a list of things to understand and test without explicit steps or validation checkpoints, and there's no feedback loop for catching issues during development.	2 / 3
Progressive Disclosure	The skill serves as a clear overview that points to `docs/SESSION_EXECUTION.md` for detailed architecture and `docs/CONCURRENCY.md` for the concurrency model — one level deep, well-signaled references. Key files are listed for navigation. The content is appropriately scoped for an overview without being monolithic.	3 / 3
	Total	10 / 12 Passed

Validation

100%


Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: cloudflare/sandbox-sdk
Commit: 3b58a22

Reviewed: 5 days ago
