
# agent-capability-analyzer

Runs the description-drift experiment — spawns all Claude Code agents simultaneously to collect self-reported capabilities, then compares them against static frontmatter descriptions to reveal how reliable orchestrator routing based on descriptions actually is. Use when measuring description drift across the agent fleet, re-running the capability collection experiment, analyzing a specific agent's self-reported capabilities, or auditing whether frontmatter descriptions accurately reflect agent behavior.
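The core comparison the experiment performs can be sketched as set arithmetic: capabilities the agent self-reports but the frontmatter omits are "undocumented," and capabilities the frontmatter promises but the agent never reports are "underclaimed." The function name, inputs, and drift thresholds below are illustrative assumptions, not the skill's actual implementation; only the verdict labels (`ALIGNED`, `SIGNIFICANT_DRIFT`) come from the eval results.

```python
def classify_drift(described: set[str], reported: set[str]) -> dict:
    """Compare frontmatter-described vs. self-reported capabilities.

    Hypothetical sketch: thresholds and return shape are assumptions.
    """
    undocumented = reported - described   # agent does this, description omits it
    underclaimed = described - reported   # description claims it, agent didn't report it
    drifted = len(undocumented) + len(underclaimed)
    verdict = (
        "ALIGNED" if drifted == 0
        else "MINOR_DRIFT" if drifted == 1
        else "SIGNIFICANT_DRIFT"
    )
    return {"undocumented": undocumented, "underclaimed": underclaimed, "verdict": verdict}


# Example loosely mirroring the python-test-writer case from the eval table
result = classify_drift(
    described={"write unit tests", "fix failing tests"},
    reported={"write unit tests", "fix failing tests",
              "architecture review", "pytest TDD"},
)
print(result["verdict"])  # SIGNIFICANT_DRIFT
```

A comparison like this is what lets the experiment flag rows such as "Undocumented: architecture review" and "Underclaimed: TDD or pytest specificity" per agent.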

Overall score: 90

| Metric | Result | Notes |
|---|---|---|
| Quality | 88% | Does it follow best practices? |
| Impact | Pending | No eval scenarios have been run |
| Security (by Snyk) | Passed | No known issues |


## Evaluation results

### Agent Routing Audit (59% · 22%)

Description drift gap analysis

| Criteria | Without context | With context |
|---|---|---|
| Three named categories | 0% | 100% |
| Agent identifier present | 100% | 100% |
| Description field shown | 100% | 100% |
| Capabilities field shown | 50% | 50% |
| Undocumented: architecture review | 66% | 100% |
| Underclaimed: TDD or pytest specificity | 70% | 100% |
| python-test-writer verdict: SIGNIFICANT_DRIFT | 0% | 0% |
| data-pipeline-builder verdict: ALIGNED | 0% | 0% |
| No scope violations flagged | 62% | 0% |
| Verdict line format | 0% | 100% |

Without context: $0.1612 · 1m 3s · 8 turns · 13 in / 3,226 out tokens

With context: $0.2521 · 1m 14s · 11 turns · 15 in / 3,452 out tokens

### Targeted Agent Capability Spot-Check (100% · 60%)

Single-agent mode orchestration

| Criteria | Without context | With context |
|---|---|---|
| Template read before spawn | 100% | 100% |
| No-commands instruction appended | 0% | 100% |
| AGENT_ID_HERE substituted | 100% | 100% |
| subagent_type used for spawn | 0% | 100% |
| agent-map.json checked first | 0% | 100% |
| No update script in single mode | 100% | 100% |
| Gap analysis: undocumented Avro/example | 100% | 100% |
| Gap analysis verdict | 0% | 100% |
| Verdict line format | 0% | 100% |

Without context: $0.5510 · 2m 53s · 24 turns · 29 in / 9,485 out tokens

With context: $1.1557 · 4m 39s · 39 turns · 107 in / 16,137 out tokens

### Fleet Description-Drift Experiment Setup (98% · 93%)

Multi-agent experiment setup and orchestration

| Criteria | Without context | With context |
|---|---|---|
| level@^10.0.0 in package.json | 0% | 100% |
| pnpm install for setup | 0% | 100% |
| populate-agent-descriptions first | 0% | 100% |
| Simultaneous agent spawning | 0% | 86% |
| Bash-less agent handled by orchestrator | 0% | 100% |
| Wait before dump | 62% | 100% |
| Dump to canonical location | 0% | 100% |
| Correct dump command | 0% | 100% |
| AGENT_ID_HERE substituted per agent | 0% | 100% |

Without context: $1.2005 · 3m 33s · 33 turns · 40 in / 11,419 out tokens

With context: $1.7499 · 6m 15s · 29 turns · 3,254 in / 22,052 out tokens

Repository: Jamie-BitFlight/claude_skills
Evaluated with agent: Claude Code
Model: Claude Sonnet 4.6
