
agent-capability-analyzer

Runs the description-drift experiment — spawns all Claude Code agents simultaneously to collect self-reported capabilities, then compares them against static frontmatter descriptions to reveal how reliable orchestrator routing based on descriptions actually is. Use when measuring description drift across the agent fleet, re-running the capability collection experiment, analyzing a specific agent's self-reported capabilities, or auditing whether frontmatter descriptions accurately reflect agent behavior.
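The comparison at the heart of this experiment can be illustrated with a small sketch. This is not the skill's actual implementation, which is not shown on this page. The function names `drift_report` and `drift_score`, the three category labels, and the choice of Jaccard distance as a drift measure are all assumptions made for illustration. Capabilities are modeled as plain sets of strings.

```python
# Hypothetical sketch: compare capabilities claimed in a frontmatter
# description against capabilities an agent reports about itself.
# All names and categories here are illustrative assumptions.


def drift_report(described: set[str], self_reported: set[str]) -> dict[str, list[str]]:
    """Group capabilities by where they appear."""
    return {
        # Present in both the description and the self-report.
        "confirmed": sorted(described & self_reported),
        # Claimed by the description but never self-reported.
        "described_only": sorted(described - self_reported),
        # Self-reported but missing from the description.
        "self_reported_only": sorted(self_reported - described),
    }


def drift_score(described: set[str], self_reported: set[str]) -> float:
    """Jaccard distance: 0.0 for identical sets, 1.0 for no overlap."""
    union = described | self_reported
    if not union:
        return 0.0  # Two empty sets are treated as having no drift.
    return 1.0 - len(described & self_reported) / len(union)


if __name__ == "__main__":
    described = {"lint python", "run tests"}
    self_reported = {"run tests", "write docs"}
    print(drift_report(described, self_reported))
    print(round(drift_score(described, self_reported), 3))
```

A set-based comparison like this only works once both sides have been normalized to comparable capability labels. Free-text descriptions would need a matching step first, which this sketch leaves out.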

72

Quality: 88% (Does it follow best practices?)

Impact: No eval scenarios have been run

Security by Snyk: Passed, no known issues

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly articulates a specific, niche capability with concrete actions and explicit trigger conditions. It uses third person voice correctly, provides a comprehensive 'Use when...' clause with multiple natural trigger scenarios, and occupies a highly distinctive domain that minimizes conflict risk with other skills.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific concrete actions: spawns Claude Code agents, collects self-reported capabilities, compares against static frontmatter descriptions, reveals routing reliability. These are detailed, concrete operations. | 3 / 3 |
| Completeness | Clearly answers both what (spawns agents, collects self-reported capabilities, compares against frontmatter descriptions) and when (explicit 'Use when...' clause with four specific trigger scenarios: measuring description drift, re-running experiments, analyzing specific agent capabilities, auditing frontmatter accuracy). | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms: 'description drift', 'capability collection experiment', 'self-reported capabilities', 'frontmatter descriptions', 'orchestrator routing', 'agent fleet', 'auditing'. These are terms a user in this domain would naturally use. | 3 / 3 |
| Distinctiveness Conflict Risk | Highly distinctive niche: 'description-drift experiment' is a very specific concept unlikely to overlap with other skills. The combination of agent spawning, capability collection, and frontmatter comparison creates a unique fingerprint. | 3 / 3 |
| Total | | 12 / 12 |

Passed

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a well-structured, actionable skill with clear workflows and concrete commands. Its main weakness is moderate verbosity — the invocation mode section duplicates information between the flowchart and prose, and the gap analysis categories include explanatory context that could be trimmed. The skill effectively handles branching logic (single vs multi-agent, bash vs no-bash) and provides a clear output format for analysis results.

Suggestions

Remove the prose re-explanation of single-agent and multi-agent modes below the Mermaid flowchart — the flowchart already captures the logic clearly, and the duplication adds ~20 lines of redundant content.

Trim the gap analysis category descriptions to just the category name and flag label, removing the 'possible causes' explanations (e.g., 'description is aspirational, the agent's training doesn't surface that behavior...') which Claude can infer.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is mostly efficient but includes some unnecessary explanation. The 'Overview' section explains what description drift is and why it matters, concepts Claude can infer. The invocation mode section duplicates information between the flowchart and the prose explanations below it. The gap analysis categories include explanatory 'possible causes' text that adds bulk without actionable value. | 2 / 3 |
| Actionability | The skill provides fully executable bash commands, specific script paths, concrete JSON output formats, and a clear gap analysis output template. Commands are copy-paste ready with clear variable substitution points (AGENT_ID_HERE, AGENT_NAME). The workflow steps are specific and concrete. | 3 / 3 |
| Workflow Clarity | The full experiment workflow is clearly sequenced with numbered steps, explicit handling of edge cases (agents without Bash access), and a clear decision flowchart for invocation mode selection. The workflow includes validation checkpoints (waiting for all Tasks to complete before dumping, checking for agent-map.json existence) and handles branching paths explicitly. | 3 / 3 |
| Progressive Disclosure | The content is well-structured with clear sections (Overview, Invocation Mode, Setup, Full Workflow, Single-Agent, Gap Analysis, Resources), but the gap analysis section is quite lengthy and could be split into a separate reference file. The Mermaid flowchart followed by prose re-explanation of the same logic inflates the main file. References to external files (template, scripts) are clearly signaled in the Resources section. | 2 / 3 |
| Total | | 10 / 12 |

Passed

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository: Jamie-BitFlight/claude_skills (Reviewed)
