agent-capability-analyzer

Runs the description-drift experiment — spawns all Claude Code agents simultaneously to collect self-reported capabilities, then compares them against static frontmatter descriptions to reveal how reliable orchestrator routing based on descriptions actually is. Use when measuring description drift across the agent fleet, re-running the capability collection experiment, analyzing a specific agent's self-reported capabilities, or auditing whether frontmatter descriptions accurately reflect agent behavior.
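The drift measurement described above (comparing a static frontmatter description against an agent's self-reported capabilities) could be sketched as follows. This is an illustrative approximation, not the skill's actual implementation; it uses token-set overlap (Jaccard similarity) as a crude drift signal, and the example strings are invented:

```python
# Hypothetical sketch: score drift between a skill's static frontmatter
# description and the capability text the agent reports about itself.
import re

def tokens(text: str) -> set[str]:
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def drift_score(frontmatter: str, self_reported: str) -> float:
    """1.0 = no shared vocabulary (maximum drift); 0.0 = identical vocabulary."""
    a, b = tokens(frontmatter), tokens(self_reported)
    if not a and not b:
        return 0.0
    overlap = len(a & b) / len(a | b)  # Jaccard similarity
    return 1.0 - overlap

declared = "Formats JSON files and validates schemas"      # invented example
reported = "I can format JSON files and validate schemas"  # invented example
print(drift_score(declared, reported))
```

A real implementation would likely use something stronger than token overlap (e.g. embedding similarity), but the shape of the comparison is the same: one score per agent, frontmatter text on one side and collected self-report on the other.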

Overall score: 90

- Quality: 88% (Does it follow best practices?)
- Impact: Pending (no eval scenarios have been run)
- Security (by Snyk): Passed (no known issues)


Quality

Discovery: 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that clearly articulates a specialized experimental capability. It provides concrete actions, explicit trigger conditions via a well-structured 'Use when...' clause, and uses distinctive terminology that minimizes conflict risk with other skills. The description effectively communicates both the mechanism (spawning agents, collecting capabilities, comparing against frontmatter) and the purpose (measuring routing reliability).

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Lists multiple specific, concrete actions: 'spawns all Claude Code agents simultaneously', 'collect self-reported capabilities', 'compares them against static frontmatter descriptions', 'reveal how reliable orchestrator routing based on descriptions actually is'. | 3 / 3 |
| Completeness | Clearly answers both what (runs the description-drift experiment, spawns agents, collects capabilities, compares against frontmatter) AND when, with an explicit 'Use when...' clause covering four specific trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Includes natural keywords users would say: 'description drift', 'capability collection experiment', 'self-reported capabilities', 'frontmatter descriptions', 'agent fleet', 'orchestrator routing', 'auditing'. Good coverage of domain-specific terms. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive, with a unique niche: 'description-drift experiment', 'agent fleet', 'frontmatter descriptions', 'orchestrator routing'. Very unlikely to conflict with other skills due to specialized terminology and a specific use case. | 3 / 3 |
| Total | | 12 / 12 |

Passed

Implementation: 77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill provides strong actionable guidance with clear workflows and executable commands for running the description-drift experiment. The main weakness is moderate verbosity—the flowchart and prose explanations are redundant, and the gap analysis categories over-explain concepts Claude can infer. Structure is good but could benefit from moving detailed output formats to separate reference files.

Suggestions:

- Remove the mermaid flowchart or the prose explanation of invocation modes; having both is redundant.
- Condense the gap analysis category definitions; Claude understands comparison tasks without extensive rationale for each category.
- Move the Gap Analysis Output Format template to a separate reference file (e.g., ./resources/gap-analysis-template.md).
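The gap analysis mentioned in these suggestions amounts to classifying each capability by where it appears: in the frontmatter, in the self-report, or in both. A minimal sketch under invented category names (the reviewed skill's actual categories are not shown on this page):

```python
# Hypothetical sketch of a gap analysis step: bucket each capability by
# whether it appears in the declared frontmatter, the agent's self-report,
# or both. Category names here are illustrative stand-ins.
from dataclasses import dataclass

@dataclass
class Gap:
    capability: str
    category: str  # "match", "undeclared" (self-report only), "unverified" (frontmatter only)

def classify(declared: set[str], reported: set[str]) -> list[Gap]:
    gaps = []
    for cap in sorted(declared | reported):
        if cap in declared and cap in reported:
            gaps.append(Gap(cap, "match"))
        elif cap in reported:
            gaps.append(Gap(cap, "undeclared"))
        else:
            gaps.append(Gap(cap, "unverified"))
    return gaps

for g in classify({"format-json", "lint"}, {"format-json", "run-tests"}):
    print(g.capability, g.category)
# → format-json match / lint unverified / run-tests undeclared
```

Keeping the output template for this step in a separate reference file, as the suggestion proposes, shortens the SKILL.md without losing the structure.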

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | Mostly efficient, but includes some redundancy: the invocation-mode flowchart and the subsequent prose explanations cover the same ground. The gap analysis section could be tightened; Claude understands comparison tasks without extensive category definitions. | 2 / 3 |
| Actionability | Provides fully executable bash commands, specific script paths, concrete JSON output formats, and copy-paste-ready code snippets. The workflow steps are explicit, with exact commands to run. | 3 / 3 |
| Workflow Clarity | Clear numbered sequences with explicit checkpoints (wait for all Tasks, then dump, then analyze). The flowchart provides visual decision logic. Multi-step processes have clear ordering, and the single- vs multi-agent decision tree is well defined. | 3 / 3 |
| Progressive Disclosure | Content is reasonably organized with clear sections, but the document is somewhat monolithic. The mermaid flowchart duplicates prose explanations. References to external files are present, but the gap analysis output format could live in a separate template file. | 2 / 3 |
| Total | | 10 / 12 |

Passed
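The spawn-wait-dump-analyze ordering credited under Workflow Clarity can be sketched as below. This is a stand-in illustration, not the skill's code: the agent names and the `query_capabilities` function are invented, with threads standing in for spawned Task agents:

```python
# Hypothetical sketch of the checkpointed workflow: spawn all agent queries
# concurrently, wait until every one has reported, then dump the collected
# reports as JSON before analysis begins.
import json
from concurrent.futures import ThreadPoolExecutor

AGENTS = ["code-reviewer", "test-runner", "doc-writer"]  # illustrative fleet

def query_capabilities(agent: str) -> dict:
    # Stand-in for asking a spawned agent to self-report its capabilities.
    return {"agent": agent, "capabilities": [f"{agent}-task"]}

with ThreadPoolExecutor() as pool:
    # Checkpoint: map() blocks until every agent has reported.
    reports = list(pool.map(query_capabilities, AGENTS))

dump = json.dumps(reports, indent=2)  # dump everything before analyzing
print(dump)
```

The key property the review praises is the explicit barrier: analysis only starts once the full fleet has reported, so partial results never skew the drift comparison.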

Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 passed

Validation for skill structure: no warnings or errors.

Repository: Jamie-BitFlight/claude_skills (Reviewed)
