Systematic improvement of existing agents through performance analysis, prompt engineering, and continuous iteration.
Install with Tessl CLI
npx tessl i github:sickn33/antigravity-awesome-skills --skill agent-orchestration-improve-agent59
Quality
43%
Does it follow best practices?
Impact
81%
1.24x
Average score across 3 eval scenarios
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/agent-orchestration-improve-agent/SKILL.md

Discovery
32%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description identifies a clear domain (agent improvement) and lists relevant activities, but relies on abstract language and lacks any explicit trigger guidance. Without a 'Use when...' clause, Claude cannot reliably determine when to select this skill over others, and the terminology is more technical than what users would naturally say.
Suggestions
Add an explicit 'Use when...' clause with natural trigger terms like 'agent not working', 'improve my agent', 'agent performance issues', 'fix agent behavior', or 'optimize agent prompts'.
Replace abstract terms with concrete actions: instead of 'performance analysis', specify 'analyze agent logs and error patterns'; instead of 'continuous iteration', specify 'test prompt variations and measure improvements'.
Include file types or artifacts this skill works with (e.g., 'agent configuration files', 'system prompts', 'evaluation datasets') to improve distinctiveness.
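Taken together, these suggestions might produce a description like the one below. This is a hypothetical rewrite for illustration only: the frontmatter field names and wording are assumptions, not taken from the actual SKILL.md.

```yaml
# Hypothetical SKILL.md frontmatter -- field names and wording assumed
name: agent-orchestration-improve-agent
description: >
  Analyze agent logs and error patterns, rewrite system prompts, and test
  prompt variations to measure improvements. Works with agent configuration
  files, system prompts, and evaluation datasets. Use when an agent is not
  working, produces poor output, or you want to improve, debug, or optimize
  agent prompts and behavior.
```

Note how the rewrite combines all three suggestions: concrete actions, artifact types, and a 'Use when...' clause built from natural trigger terms.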
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (agent improvement) and some actions (performance analysis, prompt engineering, continuous iteration), but these are somewhat abstract rather than concrete specific actions like 'analyze error logs' or 'rewrite system prompts'. | 2 / 3 |
| Completeness | Describes what it does (systematic improvement of agents) but completely lacks a 'Use when...' clause or any explicit trigger guidance for when Claude should select this skill. | 1 / 3 |
| Trigger Term Quality | Includes some relevant terms like 'agents', 'prompt engineering', and 'performance analysis', but missing common variations users might say like 'fix my agent', 'agent not working', 'improve prompts', 'debug agent', or 'optimize agent'. | 2 / 3 |
| Distinctiveness / Conflict Risk | 'Prompt engineering' could overlap with general prompt-writing skills, and 'performance analysis' is generic enough to conflict with other analysis-focused skills. The 'agents' focus provides some distinction but isn't strongly bounded. | 2 / 3 |
| Total | | 7 / 12 Passed |
Implementation
55%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill provides a comprehensive framework for agent optimization with excellent workflow structure and clear phases, but suffers from verbosity and lack of progressive disclosure. The content explains many concepts Claude already understands and would benefit from being split into multiple focused documents with concrete, executable examples rather than pseudocode references.
Suggestions
Split content into separate files (e.g., TESTING.md, DEPLOYMENT.md, PROMPT-ENGINEERING.md) and reference them from a concise overview in SKILL.md
Replace pseudocode tool references like 'Use: context-manager' with actual executable commands or clarify these are conceptual placeholders
Remove explanatory content about well-known concepts (A/B testing basics, semantic versioning) and focus on agent-specific guidance
Add concrete code examples for at least one optimization technique (e.g., actual prompt template showing chain-of-thought implementation)
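To make the last suggestion concrete, a chain-of-thought prompt template could look something like the sketch below. This is a minimal illustration, not content from the skill itself; the triage scenario and output fields are invented for the example.

```markdown
<!-- Hypothetical prompt template illustrating chain-of-thought -->
You are a support-ticket triage agent.

Before answering, reason step by step:
1. Restate the user's problem in one sentence.
2. List the facts you know and the facts you are missing.
3. Decide the ticket category and priority, citing the facts from step 2.

Then output only:
Category: <category>
Priority: <low | medium | high>
Reason: <one sentence referencing your step-by-step analysis>
```

Including even one template like this would move the Actionability score beyond pseudocode-like references.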
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill contains useful information but is verbose in places, explaining concepts Claude likely knows (e.g., what A/B testing is, basic versioning semantics). The extensive lists and explanatory text could be significantly condensed while preserving actionability. | 2 / 3 |
| Actionability | Provides structured guidance with some concrete elements like metric templates and rollback triggers, but lacks executable code. The 'Use: context-manager' and 'Use: prompt-engineer' commands are pseudocode-like references without actual implementation details or real tool invocations. | 2 / 3 |
| Workflow Clarity | Clear four-phase workflow with explicit sequencing, validation checkpoints (A/B testing, staged rollout), and feedback loops (rollback procedures with specific triggers). The progression from analysis to deployment is well-structured with clear decision points. | 3 / 3 |
| Progressive Disclosure | Monolithic wall of text with no references to external files. All content is inline despite being extensive enough to warrant splitting into separate reference documents (e.g., testing protocols, prompt engineering techniques, deployment procedures). | 1 / 3 |
| Total | | 8 / 12 Passed |
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | | 10 / 11 Passed |
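A typical fix for this warning, sketched here with assumed key names since the report does not show the actual offending keys, is to move non-standard top-level fields under a metadata block, as the validator's own message suggests:

```yaml
# Hypothetical example -- 'author' and 'tags' stand in for the unknown keys.
# Before: unknown top-level keys trigger the warning.
# After: the same fields tucked under 'metadata':
name: agent-orchestration-improve-agent
description: ...
metadata:
  author: sickn33
  tags: [agents, prompt-engineering]
```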
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.