Tools are how AI agents interact with the world. A well-designed tool is the difference between an agent that works and one that hallucinates, fails silently, or costs 10x more tokens than necessary. This skill covers tool design from schema to error handling.
Score: 32%
Eval scenarios: Pending (no eval scenarios have been run)
Known issues: Passed (no known issues)
Optimize this skill with Tessl:

`npx tessl skill review --optimize ./skills/antigravity-agent-tool-builder/SKILL.md`

Quality
Discovery
22%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This description reads like a course syllabus introduction or marketing copy rather than a functional skill description for selection purposes. It lacks concrete actions, has no 'Use when...' trigger clause, and uses first-principle philosophical framing ('A well-designed tool is the difference between...') instead of actionable specifics. The description would benefit significantly from listing concrete capabilities and explicit trigger conditions.
Suggestions

- Replace the philosophical opening with concrete actions, e.g., 'Designs tool schemas, writes parameter descriptions, implements error handling, and optimizes tool interfaces for AI agent function calling.'
- Add an explicit 'Use when...' clause, e.g., 'Use when designing tools for AI agents, creating function calling schemas, writing MCP tool definitions, or optimizing tool descriptions for LLM consumption.'
- Remove the marketing-style language ('the difference between an agent that works and one that hallucinates') and replace it with natural trigger terms users would actually say, such as 'function calling', 'tool use', 'MCP tools', and 'tool parameters'.
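Taken together, the suggestions above might produce frontmatter along these lines. This is an illustrative sketch, not the skill's actual frontmatter; the field names follow common SKILL.md conventions and the wording simply combines the suggested phrasings:

```yaml
# Hypothetical rewrite of the skill's frontmatter description.
name: antigravity-agent-tool-builder
description: >
  Designs tool schemas, writes parameter descriptions, implements error
  handling, and optimizes tool interfaces for AI agent function calling.
  Use when designing tools for AI agents, creating function calling
  schemas, writing MCP tool definitions, or optimizing tool descriptions
  for LLM consumption.
```

Note how the rewrite leads with concrete verbs and ends with an explicit 'Use when...' clause, which directly addresses the Specificity and Completeness scores below.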
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | The description uses vague, abstract language like 'covers tool design from schema to error handling' without listing concrete actions. It reads more like a marketing pitch than a capability description: no specific actions like 'create tool schemas', 'validate parameters', or 'generate error handlers' are mentioned. | 1 / 3 |
| Completeness | There is no explicit 'Use when...' clause or equivalent trigger guidance. The 'what' is only vaguely implied ('covers tool design from schema to error handling'), and the 'when' is entirely missing. Per the rubric, a missing 'Use when...' clause caps completeness at 2, and the weak 'what' brings it down to 1. | 1 / 3 |
| Trigger Term Quality | It includes some relevant terms like 'tool design', 'schema', 'error handling', 'AI agents', and 'tokens', but misses many natural user phrases like 'function calling', 'API tools', 'tool use', 'MCP', or 'tool definitions'. The first sentence is more philosophical than keyword-rich. | 2 / 3 |
| Distinctiveness / Conflict Risk | The mention of 'tool design' and 'schema' provides some specificity, but the broad framing around 'AI agents' and general concepts like 'error handling' could easily overlap with agent-building skills, API design skills, or general coding best-practices skills. | 2 / 3 |
| Total | 6 / 12 (Passed) | |
Implementation
42%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill excels at actionability, with concrete, executable code examples across multiple languages and frameworks, covering tool schema design, error handling, MCP, and parallel execution thoroughly. However, it is severely bloated: it repeats the weather tool example across nearly every pattern, includes metadata sections that don't belong in the body, and explains concepts Claude already understands. The lack of any file decomposition for a 400+ line skill significantly hurts progressive disclosure and token efficiency.
Suggestions

- Extract detailed pattern implementations (MCP server, tool runner, parallel execution) into separate referenced files, keeping only a concise overview and the most critical patterns (schema design and error handling) in SKILL.md.
- Remove or relocate metadata sections (Capabilities, Scope, When to Use, Limitations, Collaboration, Delegation Triggers) to YAML frontmatter or a separate metadata file; these consume significant tokens without adding instructional value.
- Consolidate the weather tool example, which appears in at least 4 different patterns: use one canonical example and vary only the aspect being demonstrated.
- Add an explicit end-to-end workflow section (e.g., 'Design schema → Write descriptions → Implement with error handling → Validate schema → Test with LLM') with validation checkpoints between steps.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is extremely verbose at 400+ lines. It explains concepts Claude already knows (what JSON Schema is, what MCP is, basic error-handling patterns, what enums are). Sections like 'Capabilities', 'Scope', 'When to Use', 'Limitations', and 'Collaboration' are metadata that bloat the content. The 'MCP Benefits' bullet list and error-category taxonomy add little value, and many code examples are repetitive (the weather tool appears 4+ times). | 1 / 3 |
| Actionability | The skill provides fully executable code examples in both Python and TypeScript across multiple patterns (schema design, error handling, MCP servers, tool runners, parallel execution). Examples are concrete, with realistic data, specific JSON schemas, and copy-paste-ready implementations, including the ToolResult dataclass pattern and a complete MCP server setup. | 3 / 3 |
| Workflow Clarity | Individual patterns are well explained with clear when-to-use guidance, and the parallel tool execution section correctly highlights the critical 'all results in one message' requirement. However, there is no overarching workflow for building a tool from start to finish (design → implement → validate → test), the validation-checks section is a flat list not integrated into a workflow, and there are no explicit feedback loops for iterating on tool design. | 2 / 3 |
| Progressive Disclosure | This is a monolithic wall of text with no references to external files despite being well over 400 lines. The detailed MCP server implementation, tool-runner examples, and parallel-execution patterns could each be separate reference files. There is no bundle structure to support progressive disclosure, and the content would benefit enormously from splitting into an overview plus detailed pattern files. | 1 / 3 |
| Total | 7 / 12 (Passed) | |
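The 'all results in one message' requirement praised above can be sketched roughly as follows. The message shapes and field names here follow a common chat-completions convention and are assumptions for illustration, not the skill's actual runner:

```python
import asyncio

async def run_tool(call_id: str, name: str, args: dict) -> dict:
    # Stub executor; a real runner would dispatch to the named tool.
    await asyncio.sleep(0)
    return {"role": "tool", "tool_call_id": call_id, "content": f"{name} ok"}

async def run_parallel(tool_calls: list[dict]) -> list[dict]:
    # Execute independent tool calls concurrently, then return ALL results
    # together so they go back to the model in a single message turn,
    # rather than one partial result at a time.
    results = await asyncio.gather(
        *(run_tool(c["id"], c["name"], c["args"]) for c in tool_calls)
    )
    return list(results)

calls = [
    {"id": "1", "name": "get_weather", "args": {"city": "Lisbon"}},
    {"id": "2", "name": "get_time", "args": {"city": "Lisbon"}},
]
results = asyncio.run(run_parallel(calls))
```

The design point is that `asyncio.gather` both parallelizes the calls and forces a single collection point, which is what keeps all tool results in one response.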
Validation
81%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 9 / 11 passed

Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| skill_md_line_count | SKILL.md is long (715 lines); consider splitting into references/ and linking | Warning |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 9 / 11 passed | |