Tools are how AI agents interact with the world. A well-designed tool is the difference between an agent that works and one that hallucinates, fails silently, or costs 10x more tokens than necessary. This skill covers tool design from schema to error handling.
32%: Does it follow best practices?
Impact: no score; no eval scenarios have been run.
Passed: no known issues.
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/agent-tool-builder/SKILL.md
Quality
Discovery
22%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description reads more like an introductory paragraph for a blog post than a functional skill description. It lacks concrete actions, has no explicit 'Use when...' trigger clause, and relies on vague motivational language ('the difference between an agent that works and one that hallucinates') rather than specifying what the skill actually does or when it should be selected.
Suggestions
Replace the motivational framing with concrete actions, e.g., 'Designs tool schemas, writes tool descriptions, implements error handling and validation for AI agent tool-use interfaces.'
Add an explicit 'Use when...' clause with natural trigger terms, e.g., 'Use when the user asks about designing tools for agents, writing function-calling schemas, MCP tool definitions, or improving tool descriptions.'
Remove the editorial/marketing language ('the difference between an agent that works and one that hallucinates, fails silently, or costs 10x more tokens') which adds no selection value.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague, abstract language like 'interact with the world' and 'tool design from schema to error handling.' It does not list concrete actions the skill performs; it reads more like a marketing pitch than a capability description. | 1 / 3 |
| Completeness | The 'what' is vaguely implied (covers tool design) but not concretely stated, and there is no 'when' clause or explicit trigger guidance at all. The missing 'Use when...' clause caps this at 2 per the rubric, and the weak 'what' brings it to 1. | 1 / 3 |
| Trigger Term Quality | It includes some relevant terms like 'tool design', 'schema', 'error handling', 'AI agents', and 'tokens', but these are somewhat technical and miss common user phrasings like 'function calling', 'API tools', 'tool_use', or 'MCP tools'. | 2 / 3 |
| Distinctiveness Conflict Risk | The mention of 'tool design', 'schema', and 'error handling' for AI agents provides some specificity, but the broad framing ('how AI agents interact with the world') could overlap with general agent-building or API design skills. | 2 / 3 |
| Total | | 6 / 12 Passed |
Implementation
42%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill excels at actionability with rich, executable code examples across multiple patterns and languages, but suffers significantly from verbosity and poor progressive disclosure. It reads as a comprehensive reference document rather than a lean skill file—repeating the weather tool example across multiple sections, explaining concepts Claude already knows, and including metadata-like sections (Capabilities, Scope, When to Use, Limitations) that belong in frontmatter. The lack of any bundle files means all content is monolithically packed into one oversized file.
Suggestions
Split the content into a concise SKILL.md overview (~50-80 lines covering principles and quick-start) with separate bundle files for each pattern (SCHEMA_DESIGN.md, ERROR_HANDLING.md, MCP_GUIDE.md, TOOL_RUNNER.md, PARALLEL_EXECUTION.md).
Remove sections that belong in YAML frontmatter: 'Capabilities', 'Scope', 'When to Use', 'Limitations', and 'Related Skills' are metadata, not instructional content.
Eliminate redundant explanations Claude already knows (what JSON Schema is, what MCP stands for, basic error categories like 'API unavailable' and 'Permission denied') and consolidate the weather tool example which appears in at least 4 different sections.
Add a clear end-to-end workflow section: 'Building a Tool' with numbered steps (1. Define schema → 2. Write descriptions → 3. Implement with error handling → 4. Validate against checklist → 5. Test with LLM) to tie the patterns together.
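The numbered workflow suggested above could be tied together roughly as follows. This is a minimal Python sketch under stated assumptions, not the skill's actual content: the `get_weather` tool, its fields, the `validate_args` and `checklist` helpers, and the in-memory `FAKE_WEATHER_C` table are all hypothetical, and `ToolResult` only borrows the name mentioned in this review. Step 5 (testing with an LLM) is left out because it needs a live model.

```python
from dataclasses import dataclass
from typing import Optional

# Steps 1-2: define the schema and write the descriptions.
# The tool name, fields, and wording are illustrative assumptions.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": (
        "Return current conditions for one city. "
        "Use when the user asks about present weather, not forecasts."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'Paris'."},
            "units": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature unit; defaults to celsius.",
            },
        },
        "required": ["city"],
    },
}


@dataclass
class ToolResult:
    """Structured result so failures are reported rather than silent."""
    ok: bool
    data: Optional[dict] = None
    error: Optional[str] = None


def validate_args(schema: dict, args: dict) -> list:
    """Hand-rolled check covering only required keys, string types, and enums."""
    problems = []
    props = schema["properties"]
    for key in schema.get("required", []):
        if key not in args:
            problems.append(f"missing required field '{key}'")
    for key, value in args.items():
        spec = props.get(key)
        if spec is None:
            problems.append(f"unknown field '{key}'")
        elif spec["type"] == "string" and not isinstance(value, str):
            problems.append(f"field '{key}' must be a string")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"field '{key}' must be one of {spec['enum']}")
    return problems


# Stand-in for a real weather API so the sketch runs offline.
FAKE_WEATHER_C = {"paris": 18.0, "oslo": 7.5}


def get_weather(args: dict) -> ToolResult:
    """Step 3: validate before executing, and return errors as data."""
    problems = validate_args(WEATHER_TOOL["input_schema"], args)
    if problems:
        return ToolResult(ok=False, error="; ".join(problems))
    temp_c = FAKE_WEATHER_C.get(args["city"].strip().lower())
    if temp_c is None:
        return ToolResult(ok=False, error=f"no data for city '{args['city']}'")
    if args.get("units", "celsius") == "fahrenheit":
        return ToolResult(
            ok=True,
            data={"temperature": temp_c * 9 / 5 + 32, "units": "fahrenheit"},
        )
    return ToolResult(ok=True, data={"temperature": temp_c, "units": "celsius"})


def checklist(tool: dict) -> list:
    """Step 4: flag a missing tool description or undescribed parameters."""
    issues = []
    if not tool.get("description"):
        issues.append("tool has no description")
    for name, spec in tool["input_schema"]["properties"].items():
        if not spec.get("description"):
            issues.append(f"parameter '{name}' has no description")
    return issues
```

In this sketch, `get_weather({})` returns a `ToolResult` with `ok=False` and the error `missing required field 'city'`, so the calling agent gets a message it can act on instead of an exception or a silent failure. A real tool would likely replace `validate_args` with a full JSON Schema validator.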
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at ~400+ lines. It explains concepts Claude already knows (what JSON Schema is, what MCP is, basic error handling categories), includes unnecessary sections like 'Capabilities', 'Scope', 'When to Use' keyword lists, and 'Limitations' boilerplate that belong in frontmatter not body content. The 'Tooling' section listing frameworks with one-line notes adds little actionable value. Many code examples are repetitive (weather tool appears 4+ times). | 1 / 3 |
| Actionability | The skill provides extensive, concrete, executable code examples across multiple languages (Python, TypeScript) and patterns. The JSON Schema examples are copy-paste ready, the MCP server implementation is complete, the error handling pattern with ToolResult dataclass is fully executable, and the parallel tool execution pattern includes both correct and incorrect approaches. | 3 / 3 |
| Workflow Clarity | Individual patterns are well-explained but there's no clear overall workflow for building a tool from start to finish. The validation checks section lists what to verify but doesn't integrate into a sequential build-validate-test workflow. There are no explicit feedback loops or checkpoints for the tool creation process itself, though the error handling pattern does show a validate-before-execute approach within tool implementations. | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of text with no bundle files and no references to external documents. All content (schema design, examples, error handling, MCP, tool runners, parallel execution, validation checks, delegation triggers) is crammed into a single file. The 'Delegation Triggers' and 'Related Skills' sections hint at a broader ecosystem but the core content desperately needs splitting into separate reference files (e.g., MCP_GUIDE.md, ERROR_HANDLING.md, SCHEMA_PATTERNS.md). | 1 / 3 |
| Total | | 7 / 12 Passed |
Validation
81%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation: 9 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (715 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 Passed | |
8854d4e