Runs the description-drift experiment — spawns all Claude Code agents simultaneously to collect self-reported capabilities, then compares them against static frontmatter descriptions to reveal how reliable orchestrator routing based on descriptions actually is. Use when measuring description drift across the agent fleet, re-running the capability collection experiment, analyzing a specific agent's self-reported capabilities, or auditing whether frontmatter descriptions accurately reflect agent behavior.
Overall score: 90

Quality: 88% — Does it follow best practices?
Impact: Pending — No eval scenarios have been run
Validation: Passed — No known issues

Quality

Discovery — 100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly articulates a specialized experimental capability. It provides concrete actions, explicit trigger conditions via a well-structured 'Use when...' clause, and uses distinctive terminology that minimizes conflict risk with other skills. The description effectively communicates both the mechanism (spawning agents, collecting capabilities, comparing against frontmatter) and the purpose (measuring routing reliability).
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: 'spawns all Claude Code agents simultaneously', 'collect self-reported capabilities', 'compares them against static frontmatter descriptions', 'reveal how reliable orchestrator routing based on descriptions actually is'. | 3 / 3 |
| Completeness | Clearly answers both what (runs description-drift experiment, spawns agents, collects capabilities, compares against frontmatter) and when, with an explicit 'Use when...' clause covering four specific trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Includes natural keywords users would say: 'description drift', 'capability collection experiment', 'self-reported capabilities', 'frontmatter descriptions', 'agent fleet', 'orchestrator routing', 'auditing'. Good coverage of domain-specific terms. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche: 'description-drift experiment', 'agent fleet', 'frontmatter descriptions', 'orchestrator routing'. Very unlikely to conflict with other skills given the specialized terminology and specific use case. | 3 / 3 |
| Total | | 12 / 12 — Passed |
Implementation — 77%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This skill provides strong actionable guidance, with clear workflows and executable commands for running the description-drift experiment. The main weakness is moderate verbosity: the flowchart and prose explanations are redundant, and the gap analysis categories over-explain concepts Claude can infer. Structure is good but could benefit from moving detailed output formats to separate reference files.

Suggestions

- Remove either the mermaid flowchart or the prose explanation of invocation modes; having both is redundant.
- Condense the gap analysis category definitions; Claude understands comparison tasks without extensive rationale for each category.
- Move the Gap Analysis Output Format template to a separate reference file (e.g., ./resources/gap-analysis-template.md).
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is mostly efficient but includes some redundancy: the invocation-mode flowchart and the subsequent prose explanations cover the same ground. The gap analysis section could be tightened; Claude understands comparison tasks without extensive category definitions. | 2 / 3 |
| Actionability | Provides fully executable bash commands, specific script paths, concrete JSON output formats, and copy-paste-ready code snippets. The workflow steps are explicit, with exact commands to run. | 3 / 3 |
| Workflow Clarity | Clear numbered sequences with explicit checkpoints (wait for all Tasks, then dump, then analyze). The flowchart provides visual decision logic. Multi-step processes have clear ordering, and the single- vs multi-agent decision tree is well defined. | 3 / 3 |
| Progressive Disclosure | Content is reasonably organized into clear sections, but the document is somewhat monolithic. The mermaid flowchart duplicates the prose explanations. References to external files are present, but the gap analysis output format could live in a separate template file. | 2 / 3 |
| Total | | 10 / 12 — Passed |
Validation — 100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
11 / 11 checks passed — validation for skill structure. No warnings or errors.