Runs the description-drift experiment — spawns all Claude Code agents simultaneously to collect self-reported capabilities, then compares them against static frontmatter descriptions to reveal how reliable orchestrator routing based on descriptions actually is. Use when measuring description drift across the agent fleet, re-running the capability collection experiment, analyzing a specific agent's self-reported capabilities, or auditing whether frontmatter descriptions accurately reflect agent behavior.
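The comparison step this skill performs — lining up each agent's self-reported capabilities against its static frontmatter description — can be sketched as follows. The actual scripts, agent-map.json schema, and scoring method are not shown in this report, so the function, field names, and the 0.5 flag threshold below are hypothetical illustrations, not the skill's real implementation:

```python
import difflib

def description_drift(frontmatter_desc: str, self_report: str) -> float:
    """Rough drift score in [0, 1]: 0 = identical wording, 1 = no overlap."""
    ratio = difflib.SequenceMatcher(
        None, frontmatter_desc.lower(), self_report.lower()
    ).ratio()
    return 1.0 - ratio

# Hypothetical agent-map entries: name -> (frontmatter description, self-report)
agents = {
    "doc-writer": (
        "Writes and edits Markdown documentation.",
        "Writes and edits Markdown documentation.",
    ),
    "db-admin": (
        "Manages database schemas and migrations.",
        "Answers general SQL questions only.",
    ),
}

for name, (declared, reported) in agents.items():
    drift = description_drift(declared, reported)
    flag = "DRIFT" if drift > 0.5 else "ok"  # threshold chosen for illustration
    print(f"{name}: drift={drift:.2f} [{flag}]")
```

A real implementation would likely use semantic similarity rather than raw string matching, since an agent can restate the same capability in entirely different words.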
## Summary

- Overall score: 90 (88%)
- Impact: Pending. No eval scenarios have been run.
- Quality: Passed. No known issues.
## Discovery: 100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong, well-crafted description that clearly communicates a highly specific experimental workflow. It excels at specificity by naming concrete actions, provides an explicit 'Use when...' clause with multiple trigger scenarios, and occupies a very distinct niche that minimizes conflict risk. The domain-specific terminology is appropriate given the specialized nature of the skill.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific, concrete actions: spawns Claude Code agents simultaneously, collects self-reported capabilities, compares against static frontmatter descriptions, and reveals routing reliability. These are detailed, concrete operations. | 3 / 3 |
| Completeness | Clearly answers both 'what' (spawns agents, collects self-reported capabilities, compares against frontmatter descriptions) and 'when' with an explicit 'Use when...' clause covering four distinct trigger scenarios. | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms: 'description drift', 'capability collection experiment', 'self-reported capabilities', 'frontmatter descriptions', 'orchestrator routing', 'agent fleet', 'auditing'. These are terms a user in this domain would naturally use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive: 'description-drift experiment' is a very specific niche concept. The combination of agent spawning, self-reported capabilities, and frontmatter comparison creates a unique fingerprint unlikely to conflict with other skills. | 3 / 3 |
| **Total** | | **12 / 12 (Passed)** |
## Implementation: 77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured, highly actionable skill with clear workflows and concrete commands at every step. Its main weakness is moderate verbosity — the Mermaid diagram duplicates the prose explanations, and the Gap Analysis section includes explanatory context (possible causes for each category) that adds length without proportional value. The progressive disclosure could be improved by extracting the detailed gap analysis format into a separate reference file.
### Suggestions

- Remove either the Mermaid diagram or the duplicative prose explanations below it; having both is redundant and costs significant tokens.
- Trim the Gap Analysis category explanations to just the category name and flag label, removing the 'Possible causes' sentences, which explain reasoning Claude can infer.
- Consider extracting the Gap Analysis section into a separate GAP-ANALYSIS.md reference file, keeping only a brief summary and link in the main SKILL.md.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The content is mostly efficient but includes some unnecessary explanation: for example, the opening paragraph explaining what description drift is and why orchestrators care is context Claude can infer. The Mermaid diagram, while informative, duplicates the textual explanations that follow it, adding redundancy. The Gap Analysis section's explanations of possible causes for each category add bulk without adding actionability. | 2 / 3 |
| Actionability | The skill provides fully executable bash commands and node invocations at every step, with concrete script paths, flags, and expected outputs. The workflow steps are copy-paste ready, including the one-liner for looking up agent data by name and the exact dump command with file path. | 3 / 3 |
| Workflow Clarity | The multi-step workflows are clearly sequenced, with numbered steps, explicit branching for agents with and without Bash access, and a clear decision flowchart for choosing invocation mode. The workflow includes validation checkpoints (waiting for all Tasks to complete before dumping, checking for agent-map.json existence) and handles the error case of agents lacking Bash access with a fallback path. | 3 / 3 |
| Progressive Disclosure | The skill references external files (template, scripts) appropriately, but the content itself is quite long, with the Mermaid diagram, two mode explanations, the full experiment workflow, the single-agent workflow, lookup instructions, and the detailed gap analysis all inline. The Gap Analysis section, with its three categories and output format, could reasonably be split into a separate reference file. However, no bundle files were provided to verify that the referenced paths exist. | 2 / 3 |
| **Total** | | **10 / 12 (Passed)** |
## Validation: 100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation for skill structure: 11 / 11 passed. No warnings or errors.