agent-orchestration-improve-agent

Systematic improvement of existing agents through performance analysis, prompt engineering, and continuous iteration.

1.54x

Quality

47%

Does it follow best practices?

Impact

82%

1.54x

Average score across 3 eval scenarios

Security

Passed

No known issues

Fix and improve this skill with Tessl

tessl review fix ./docs/v19.7/configuration/agent/skills_external/antigravity-awesome-skills-main/skills/agent-orchestration-improve-agent/SKILL.md

Evaluation results

85%

18%

Agent Performance Review: Support Bot Analysis

Performance baseline analysis and test suite design

Criteria

Baseline

With context

Task Success Rate field

100%

Average Corrections per Task field

100%

Tool Call Efficiency field

User Satisfaction Score field

100%

Response Latency field

100%

Token Efficiency Ratio field

33%

100%

Instruction misunderstanding category

80%

100%

Output format errors category

20%

100%

Context loss category

100%

Tool misuse category

100%

Constraint violations category

100%

Edge case handling category

Correction patterns analysis

85%

100%

Positive feedback patterns

80%

40%

Six test categories

33%

100%

Hallucination metric

40%

80%

30-day analysis period

100%

94%

55%

Deploying an Improved Expense Reporting Agent

Agent versioning, staged rollout, and rollback planning

Criteria

Baseline

With context

Version format compliance

100%

MINOR for prompt improvements

40%

100%

Git-based prompt storage

100%

Alpha stage at 5%

100%

Beta stage at 20%

100%

Canary progression to 50% then 100%

100%

87%

7-day monitoring window

100%

Rollback trigger: success rate drop >10%

100%

Rollback trigger: critical errors >5%

16%

Rollback trigger: cost increase >20%

100%

Rollback process: 5 steps

100%

Success: 15% improvement threshold

100%

Success: 25% corrections reduction

100%

Success: cost within 5%

100%
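
The rollout stages and rollback triggers listed in the criteria above can be sketched as a small check. This is a minimal illustration, not the skill's actual implementation: the `AgentMetrics` class, the function names, and the stage labels for 50% and 100% are hypothetical. It also assumes the success-rate and cost thresholds are relative changes against the previous version, and that the critical-error threshold is an absolute rate. The criteria labels alone do not settle either point.

```python
from dataclasses import dataclass

# Traffic percentages taken from the criteria (5%, 20%, 50%, 100%).
# The labels "canary" and "full" for the last two stages are illustrative.
ROLLOUT_STAGES = [("alpha", 5), ("beta", 20), ("canary", 50), ("full", 100)]


@dataclass
class AgentMetrics:
    """Hypothetical metrics snapshot for one agent version."""
    success_rate: float         # fraction of tasks completed, 0.0 to 1.0
    critical_error_rate: float  # fraction of tasks with a critical error
    cost_per_task: float        # any consistent cost unit


def rollback_reasons(baseline: AgentMetrics, candidate: AgentMetrics) -> list[str]:
    """Return the rollback triggers the candidate version trips, if any."""
    reasons = []

    # Assumption: "success rate drop >10%" is a relative drop vs. baseline.
    success_drop = (baseline.success_rate - candidate.success_rate) / baseline.success_rate
    if success_drop > 0.10:
        reasons.append("success rate dropped more than 10%")

    # Assumption: "critical errors >5%" is an absolute error rate.
    if candidate.critical_error_rate > 0.05:
        reasons.append("critical errors above 5%")

    # Assumption: "cost increase >20%" is a relative increase vs. baseline.
    cost_increase = (candidate.cost_per_task - baseline.cost_per_task) / baseline.cost_per_task
    if cost_increase > 0.20:
        reasons.append("cost increased more than 20%")

    return reasons


def next_stage(current_percent: int):
    """Return the next (name, percent) stage, or None once at full traffic."""
    for name, percent in ROLLOUT_STAGES:
        if percent > current_percent:
            return name, percent
    return None


baseline = AgentMetrics(success_rate=0.80, critical_error_rate=0.01, cost_per_task=1.00)
candidate = AgentMetrics(success_rate=0.70, critical_error_rate=0.06, cost_per_task=1.25)
print(rollback_reasons(baseline, candidate))
print(next_stage(5))
```

With these sample numbers all three triggers fire, so the candidate would be rolled back rather than promoted from alpha to beta. In practice the check would run throughout the 7-day monitoring window named in the criteria.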

67%

13%

Improving a Document Summarization Agent

Prompt engineering with chain-of-thought, few-shot examples, and role definition

Criteria

Baseline

With context

Chain-of-thought step phrase

22%

Self-verification checkpoint phrase

66%

Good example Input field

60%

80%

Good example Reasoning field

12%

Good example 'Why this works' field

85%

71%

Bad example 'Why this fails' field

62%

75%

Bad example 'Correct approach' field

85%

100%

Role: core purpose

80%

100%

Role: constraints section

20%

100%

Role: success criteria

60%

Constitutional principle: factual accuracy

85%

Constitutional principle: format validation

83%

Constitutional principle: consistency

66%

83%

Critique-and-revise loop

50%

Simple-to-complex ordering

20%

60%
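
The headline figures (82% impact, 1.54x) can be cross-checked against the three scenario scores above. The page does not label the pair of percentages shown beside each scenario title, so the sketch below rests on an assumption: the first is the score with the skill in context and the second is the gain over baseline in percentage points. Under that reading the arithmetic reproduces both headline numbers.

```python
# Per-scenario figures as printed beside each scenario title.
# Assumption: first value = score with context, second = gain over baseline.
scores_with_context = [85, 94, 67]
gains_over_baseline = [18, 55, 13]

# Baseline scores implied by that assumption.
baseline_scores = [s - g for s, g in zip(scores_with_context, gains_over_baseline)]

avg_with_context = sum(scores_with_context) / len(scores_with_context)
avg_baseline = sum(baseline_scores) / len(baseline_scores)
multiplier = avg_with_context / avg_baseline

print(f"average with context: {avg_with_context:.0f}%")
print(f"average baseline:     {avg_baseline:.1f}%")
print(f"multiplier:           {multiplier:.2f}x")
```

If the second value were instead the baseline score itself, the ratio would come out far higher than 1.54x, which is why the gain-in-points reading is used here.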

Repository: duclm1x1/Dive-Ai
Commit: 20ba150

Evaluated: 4 months ago
Agent: Claude Code
Model: Claude Sonnet 4.6
