Runs the description-drift experiment — spawns all Claude Code agents simultaneously to collect self-reported capabilities, then compares them against static frontmatter descriptions to reveal how reliable orchestrator routing based on descriptions actually is. Use when measuring description drift across the agent fleet, re-running the capability collection experiment, analyzing a specific agent's self-reported capabilities, or auditing whether frontmatter descriptions accurately reflect agent behavior.
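The comparison the description refers to (self-reported capabilities vs. static frontmatter descriptions) can be pictured with a small sketch. The token-overlap metric and function names below are illustrative assumptions, not the skill's actual implementation:

```javascript
// Hypothetical sketch: score how far an agent's self-reported capability
// summary has drifted from its static frontmatter description.
// The Jaccard-style token overlap is an assumed metric, not the skill's.
function tokenize(text) {
  return new Set(
    text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 2)
  );
}

function driftScore(frontmatterDescription, selfReport) {
  const a = tokenize(frontmatterDescription);
  const b = tokenize(selfReport);
  const intersection = [...a].filter((t) => b.has(t)).length;
  const union = new Set([...a, ...b]).size;
  // 0 = descriptions fully overlap (no drift), 1 = no overlap (total drift)
  return union === 0 ? 0 : 1 - intersection / union;
}
```

A fleet-wide run would compute this score per agent and flag the outliers for manual review.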
Overall: 90

Quality: 88% (Does it follow best practices?)
Impact: Pending (No eval scenarios have been run)
Validation: Passed (No known issues)
Quality
Discovery
Score: 100%. Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong, well-crafted description that clearly communicates a highly specific experimental workflow. It excels at specificity by naming concrete actions, provides an explicit 'Use when...' clause with multiple trigger scenarios, and occupies a very distinct niche that minimizes conflict risk. The domain-specific terminology is appropriate given the specialized nature of the skill.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: spawns Claude Code agents simultaneously, collects self-reported capabilities, compares against static frontmatter descriptions, and reveals routing reliability. These are detailed, concrete operations. | 3 / 3 |
| Completeness | Clearly answers both 'what' (spawns agents, collects self-reported capabilities, compares against frontmatter descriptions) and 'when' with an explicit 'Use when...' clause covering four distinct trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms: 'description drift', 'capability collection experiment', 'self-reported capabilities', 'frontmatter descriptions', 'orchestrator routing', 'agent fleet', 'auditing'. These are terms a user in this domain would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive — 'description-drift experiment' is a very specific niche concept. The combination of agent spawning, self-reported capabilities, and frontmatter comparison creates a unique fingerprint unlikely to conflict with other skills. | 3 / 3 |
| Total | | 12 / 12 Passed |
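The trigger-term quality rewarded above can be illustrated with a toy matcher. The scoring here is a hypothetical sketch of description-based routing, not the orchestrator's actual selection logic:

```javascript
// Toy illustration of description-based routing: rank skills by how many
// of a query's terms appear in each frontmatter description.
// Hypothetical logic, assumed for illustration only.
function rankByDescription(query, skills) {
  const terms = query.toLowerCase().split(/\s+/);
  return skills
    .map((s) => ({
      name: s.name,
      hits: terms.filter((t) => s.description.toLowerCase().includes(t)).length,
    }))
    .sort((a, b) => b.hits - a.hits); // best match first
}
```

Under a matcher like this, a description packed with the terms a user would naturally type (e.g. "description drift", "agent fleet") wins the routing decision, which is exactly what the dimension measures.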
Implementation
Score: 77%. Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, highly actionable skill with clear multi-step workflows and good decision logic for routing between single-agent and multi-agent modes. Its main weakness is moderate verbosity — the opening context paragraph and the detailed gap analysis explanations (cause speculation for each category) add tokens without proportional value. The mermaid diagram is a nice touch for workflow clarity but contributes to overall length.
Suggestions
Remove the opening explanation of what description drift is and why orchestrators care — Claude can infer this from the skill's operational content.
Move the detailed Gap Analysis category explanations and output format to a separate reference file (e.g., GAP-ANALYSIS.md) and link to it, keeping only a brief summary inline.
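The second suggestion (progressive disclosure via a reference file) can be approximated mechanically. The word budget and section regex below are assumptions for illustration, not part of the skill:

```javascript
// Illustrative sketch of the "move detail to a reference file" suggestion:
// flag markdown sections whose inline body exceeds a rough word budget,
// making them candidates for extraction into a linked file like GAP-ANALYSIS.md.
// The 150-word budget and "## " heading convention are assumptions.
function oversizedSections(markdown, wordBudget = 150) {
  const sections = markdown.split(/^##\s+/m).slice(1);
  return sections
    .map((s) => {
      const [title, ...body] = s.split("\n");
      const words = body.join(" ").trim().split(/\s+/).filter(Boolean).length;
      return { title: title.trim(), words };
    })
    .filter((s) => s.words > wordBudget);
}
```

Running a check like this over the skill body would surface the gap-analysis section flagged in the review.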
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is reasonably efficient but includes some explanatory content that could be trimmed — e.g., the opening paragraph explaining what description drift is and why it matters is context Claude can infer. The gap analysis category explanations are somewhat verbose with cause speculation. However, most content is operationally relevant. | 2 / 3 |
| Actionability | Provides fully executable bash commands, specific script paths, concrete JSON field references, and exact command-line invocations for every step. The template usage instructions, the node one-liner for lookups, and the update-agent-map.mjs commands are all copy-paste ready. | 3 / 3 |
| Workflow Clarity | The full experiment workflow is clearly sequenced with numbered steps, explicit wait-for-completion checkpoints, and branching logic for agents with/without Bash access. The mermaid flowchart provides excellent visual sequencing. The single-agent vs multi-agent decision tree is explicit with clear routing criteria. | 3 / 3 |
| Progressive Disclosure | The skill references external files (template, scripts) appropriately, but the content itself is quite long, with the gap analysis section inlined rather than referenced. The mermaid diagram, while helpful, adds significant length. The resources section at the end is well organized, but the body could benefit from moving the detailed gap analysis categories and output format to a separate reference file. | 2 / 3 |
| Total | | 10 / 12 Passed |
Validation
Score: 100%. Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 11 / 11 Passed
Validation for skill structure: no warnings or errors.
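The structural checks this section reports can be sketched as a small validator. The specific rules below (name pattern, description length cap) are illustrative assumptions, not the spec's actual 11 checks:

```javascript
// Minimal sketch of frontmatter validation in the spirit of the checks
// this report summarizes. The lowercase-hyphen name rule and the
// 1024-character description cap are assumed for illustration.
function validateFrontmatter(fm) {
  const errors = [];
  if (!fm.name || !/^[a-z0-9-]+$/.test(fm.name)) {
    errors.push("name must be lowercase letters, digits, and hyphens");
  }
  if (!fm.description || fm.description.trim().length === 0) {
    errors.push("description is required");
  } else if (fm.description.length > 1024) {
    errors.push("description exceeds 1024 characters");
  }
  return { passed: errors.length === 0, errors };
}
```

Gating discovery and implementation scoring on `passed` mirrors the ordering described above.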
Revision: 11ec483