
agent-capability-analyzer

Runs the description-drift experiment — spawns all Claude Code agents simultaneously to collect self-reported capabilities, then compares them against static frontmatter descriptions to reveal how reliable orchestrator routing based on descriptions actually is. Use when measuring description drift across the agent fleet, re-running the capability collection experiment, analyzing a specific agent's self-reported capabilities, or auditing whether frontmatter descriptions accurately reflect agent behavior.
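The comparison at the heart of the experiment can be sketched as a small Node script. Everything below is illustrative only: the `driftScore` heuristic, function names, and sample inputs are hypothetical and are not taken from the skill's actual scripts.

```javascript
// Hypothetical sketch: a naive "description drift" check comparing an
// agent's static frontmatter description against capabilities the agent
// reported about itself. The scoring heuristic is illustrative only.

function tokenize(text) {
  // Lowercase and keep word-like tokens (two or more letters, hyphens allowed).
  return new Set(text.toLowerCase().match(/[a-z][a-z-]+/g) ?? []);
}

// Fraction of self-reported capability terms that never appear in the
// frontmatter description; higher means more drift.
function driftScore(frontmatterDescription, selfReportedCapabilities) {
  const described = tokenize(frontmatterDescription);
  const reported = tokenize(selfReportedCapabilities.join(" "));
  if (reported.size === 0) return 0;
  let missing = 0;
  for (const term of reported) {
    if (!described.has(term)) missing++;
  }
  return missing / reported.size;
}

const score = driftScore(
  "Formats JSON and YAML configuration files",
  ["format json files", "lint yaml schemas", "rewrite toml configs"]
);
console.log(score.toFixed(2)); // prints 0.67
```

A real run would aggregate such scores across the fleet; this sketch only shows the per-agent shape of the comparison.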

Overall score: 90

Quality: 88% (Does it follow best practices?)

Impact: Pending (No eval scenarios have been run)

Security, by Snyk: Passed (No known issues)


Quality

Discovery: 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is a strong, well-crafted description that clearly communicates a highly specific experimental workflow. It excels at specificity by naming concrete actions, provides an explicit 'Use when...' clause with multiple trigger scenarios, and occupies a very distinct niche that minimizes conflict risk. The domain-specific terminology is appropriate given the specialized nature of the skill.

Specificity: 3 / 3
Lists multiple specific, concrete actions: spawning Claude Code agents simultaneously, collecting self-reported capabilities, comparing them against static frontmatter descriptions, and revealing routing reliability. These are detailed, concrete operations.

Completeness: 3 / 3
Clearly answers both 'what' (spawns agents, collects self-reported capabilities, compares against frontmatter descriptions) and 'when', with an explicit 'Use when...' clause covering four distinct trigger scenarios.

Trigger Term Quality: 3 / 3
Includes strong natural trigger terms: 'description drift', 'capability collection experiment', 'self-reported capabilities', 'frontmatter descriptions', 'orchestrator routing', 'agent fleet', 'auditing'. These are terms a user in this domain would naturally use.

Distinctiveness / Conflict Risk: 3 / 3
Highly distinctive: 'description-drift experiment' is a very specific niche concept. The combination of agent spawning, self-reported capabilities, and frontmatter comparison creates a unique fingerprint unlikely to conflict with other skills.

Total: 12 / 12 (Passed)

Implementation: 77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, highly actionable skill with clear multi-step workflows and good decision logic for routing between single-agent and multi-agent modes. Its main weakness is moderate verbosity — the opening context paragraph and the detailed gap analysis explanations (cause speculation for each category) add tokens without proportional value. The mermaid diagram is a nice touch for workflow clarity but contributes to overall length.

Suggestions

Remove the opening explanation of what description drift is and why orchestrators care — Claude can infer this from the skill's operational content.

Move the detailed Gap Analysis category explanations and output format to a separate reference file (e.g., GAP-ANALYSIS.md) and link to it, keeping only a brief summary inline.
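The second suggestion might look something like the following restructure of the skill body. The file name comes from the suggestion above, but the section wording is purely illustrative, not taken from the repository:

```markdown
## Gap Analysis

Each agent's self-reported capabilities are sorted into gap categories
(for example: missing, extra, reworded). See
[GAP-ANALYSIS.md](GAP-ANALYSIS.md) for the full category definitions
and output format.
```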

Conciseness: 2 / 3
The skill is reasonably efficient but includes some explanatory content that could be trimmed; for example, the opening paragraph explaining what description drift is and why it matters is context Claude can infer. The gap analysis category explanations are somewhat verbose, with cause speculation. However, most content is operationally relevant.

Actionability: 3 / 3
Provides fully executable bash commands, specific script paths, concrete JSON field references, and exact command-line invocations for every step. The template usage instructions, the node one-liner for lookups, and the update-agent-map.mjs commands are all copy-paste ready.

Workflow Clarity: 3 / 3
The full experiment workflow is clearly sequenced, with numbered steps, explicit wait-for-completion checkpoints, and branching logic for agents with and without Bash access. The mermaid flowchart provides excellent visual sequencing, and the single-agent vs. multi-agent decision tree is explicit, with clear routing criteria.

Progressive Disclosure: 2 / 3
The skill references external files (template, scripts) appropriately, but the content itself is quite long, with the gap analysis section inlined rather than referenced. The mermaid diagram, while helpful, adds significant length. The resources section at the end is well organized, but the body would benefit from moving the detailed gap analysis categories and output format to a separate reference file.

Total: 10 / 12 (Passed)

Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 passed

Validation for skill structure: no warnings or errors.

Repository: Jamie-BitFlight/claude_skills (Reviewed)
