code-mode

Add a "code mode" tool to an existing MCP server so LLMs can write small processing scripts that run against large API responses in a sandboxed runtime — only the script's compact output enters the LLM context window. Use this skill whenever someone wants to add code mode, context reduction, script execution, sandbox execution, or LLM-generated-code processing to an MCP server. Also trigger when users mention reducing token usage, shrinking API responses, running user-provided code safely, or adding a code execution tool to their MCP server — in any language (TypeScript, Python, Go, Rust, etc.).

Quality

85%

Does it follow best practices?

Impact

95%

2.20x

Average score across 3 eval scenarios

Security

Advisory

Suggest reviewing before use

Quality

Content

70%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured interactive planning skill with excellent workflow clarity and progressive disclosure. Its main weaknesses are moderate verbosity in the introductory sections (explaining concepts Claude already knows) and a lack of concrete, executable code examples — the skill describes what to build but never shows a single implementation snippet, which is a notable gap for a coding-focused skill.

Suggestions

Add at least one concrete, executable code example showing a minimal sandbox executor implementation (e.g., a quickjs-emscripten executor in TypeScript or a goja executor in Go) so Claude has a reference pattern to adapt.

Trim the 'What is Code Mode?' section significantly — Claude already understands context windows, token costs, and the concept of processing data server-side. A 1-2 sentence summary would suffice.

Add a minimal code example for the MCP tool handler showing the DATA injection pattern and the reduction measurement output format, rather than only describing it in prose.
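To make the third suggestion concrete, here is a minimal sketch of what such a handler could look like. It is not taken from the reviewed skill. The names `Executor`, `CodeModeResult`, `runCodeMode`, `formatReduction`, and the `[code-mode]` output label are all assumptions made for illustration, and the stand-in executor uses `new Function`, which is not a sandbox and would be replaced by a real isolated runtime (for example quickjs-emscripten or goja, as the first suggestion mentions).

```typescript
// Hypothetical executor contract: runs `script` with the API response bound to DATA.
// A real server would back this with an isolated runtime, not the stand-in below.
type Executor = (script: string, data: unknown) => unknown;

// Hypothetical result shape returned by the tool handler.
interface CodeModeResult {
  output: string;
  rawChars: number;
  outputChars: number;
  reductionPercent: number;
}

// Stand-in executor for illustration only. `new Function` is NOT a sandbox:
// the script can still reach globals. Swap in a real sandbox before use.
const unsafeDemoExecutor: Executor = (script, data) =>
  new Function("DATA", script)(data);

// JSON.stringify returns undefined for values such as functions; normalise to "null".
function toJson(value: unknown): string {
  const text: string | undefined = JSON.stringify(value);
  return text === undefined ? "null" : text;
}

// Injects the raw API response as DATA, runs the script, and measures how much
// smaller the script's output is than the raw response.
function runCodeMode(
  apiResponse: unknown,
  script: string,
  execute: Executor
): CodeModeResult {
  const raw = toJson(apiResponse);
  const result = execute(script, apiResponse);
  const output = typeof result === "string" ? result : toJson(result);
  // Rounded to one decimal place.
  const reductionPercent =
    Math.round((1 - output.length / raw.length) * 1000) / 10;
  return {
    output,
    rawChars: raw.length,
    outputChars: output.length,
    reductionPercent,
  };
}

// One possible (assumed) output format: the script result followed by a size summary.
function formatReduction(r: CodeModeResult): string {
  return `${r.output}\n[code-mode] ${r.rawChars} -> ${r.outputChars} chars (${r.reductionPercent}% reduction)`;
}
```

Passing the executor in as a parameter keeps the handler independent of the sandbox choice, so the same DATA injection and measurement logic could sit in front of whichever runtime the skill's selection tables recommend. Only the string returned by `formatReduction` would enter the LLM context; the raw response stays on the server.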

Dimension	Reasoning	Score
Conciseness	The skill includes some unnecessary explanation (e.g., the 'What is Code Mode?' section explains concepts Claude already understands like context windows and token consumption). The sandbox tables are useful but the introductory paragraphs could be significantly tightened. However, the structured tables and step-by-step format are reasonably efficient.	2 / 3
Actionability	The skill provides a clear interactive planning framework and good structural guidance (what the tool accepts, what the executor does, implementation order), but lacks any executable code examples. There are no concrete code snippets for the sandbox executor, tool handler, or wiring — only descriptions of what they should do. For a skill about implementing code, the absence of even one reference implementation is a significant gap.	2 / 3
Workflow Clarity	The 5-step workflow is clearly sequenced with explicit confirmation checkpoints ('confirm before moving on', 'present the plan to the user and confirm before implementing'). The implementation order within Step 4 is well-defined, and Step 5 provides a verification/benchmark phase. The interactive nature with user confirmation at each step serves as a validation mechanism.	3 / 3
Progressive Disclosure	The skill is well-structured as an overview with clear references to deeper content: sandbox-options.md for detailed tradeoffs and benchmark-pattern.md for the benchmark template. References are one level deep, clearly signaled, and the main document provides enough context (quick selection tables) to be useful without requiring the referenced files.	3 / 3
	Total	10 / 12 Passed

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly articulates a specific, niche capability (adding code mode to MCP servers for context reduction via sandboxed script execution). It provides comprehensive trigger terms covering multiple natural phrasings and explicitly addresses both what the skill does and when it should be used. The description is well-structured, uses third person voice, and is distinctive enough to avoid conflicts with other skills.

Dimension	Reasoning	Score
Specificity	The description lists multiple concrete actions: adding a 'code mode' tool, writing processing scripts against large API responses, running them in a sandboxed runtime, and returning compact output to the LLM context window. These are specific, actionable capabilities.	3 / 3
Completeness	Clearly answers both 'what' (add a code mode tool to an MCP server for sandboxed script execution against API responses) and 'when' (explicit 'Use this skill whenever...' and 'Also trigger when...' clauses with detailed trigger conditions).	3 / 3
Trigger Term Quality	Excellent coverage of natural trigger terms: 'code mode', 'context reduction', 'script execution', 'sandbox execution', 'reducing token usage', 'shrinking API responses', 'running user-provided code safely', 'code execution tool', 'MCP server', and specific languages (TypeScript, Python, Go, Rust). These cover many natural ways a user might phrase their request.	3 / 3
Distinctiveness / Conflict Risk	Highly distinctive: the combination of MCP server augmentation, sandboxed code execution, and context window reduction is a very specific niche. The description clearly scopes to adding a code execution tool to MCP servers specifically, making it unlikely to conflict with general coding or general MCP skills.	3 / 3
	Total	12 / 12 Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: chenhunghan/code-mode-skill
Commit: 3a2240a

Reviewed: 3 months ago
