Use when the user wants to narrate a demo, add voice-over to a screen recording, or create AI narration for a silent video. End-to-end pipeline that extracts frames, analyzes with parallel subagents, writes a word-budgeted voice-over script, generates TTS audio per act, and merges everything back.
94%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Quality
Discovery
100%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is an excellent skill description that clearly communicates a specific, well-defined pipeline for adding AI narration to videos. It opens with an explicit 'Use when' clause containing natural trigger terms, then describes the concrete end-to-end workflow. The description is distinctive enough to avoid conflicts with adjacent skills while being comprehensive about both purpose and trigger conditions.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: extracts frames, analyzes with parallel subagents, writes a word-budgeted voice-over script, generates TTS audio per act, and merges everything back. These are detailed, concrete pipeline steps. | 3 / 3 |
| Completeness | Clearly answers both 'what' (end-to-end pipeline that extracts frames, analyzes, writes scripts, generates TTS, merges) and 'when' (explicit 'Use when' clause covering demo narration, voice-over for screen recordings, AI narration for silent videos). | 3 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'narrate a demo', 'voice-over', 'screen recording', 'AI narration', 'silent video', 'TTS audio'. These cover multiple natural phrasings a user might use. | 3 / 3 |
| Distinctiveness / Conflict Risk | Highly distinctive niche combining video processing with voice-over/narration generation. The specific pipeline (frame extraction → analysis → script writing → TTS → merging) is unlikely to conflict with other skills like generic video editing or text-to-speech alone. | 3 / 3 |
| Total | | 12 / 12 Passed |
Implementation
85%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-crafted skill for a complex multi-step pipeline. It excels in actionability with executable commands at every step, clear workflow sequencing with validation checkpoints and error recovery paths, and appropriate progressive disclosure to a reference guide. The main weakness is moderate verbosity — some explanations could be trimmed without losing clarity, particularly around script location fallbacks and editorial advice that Claude doesn't need.
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly long (~300 lines) but most content is genuinely needed for a complex multi-step pipeline. Some areas could be tightened: the script-locating fallback block is verbose, and explanations like 'changing text is free, re-generating audio costs time' are unnecessary for Claude. The word-budget math section is well-structured but slightly over-explained. | 2 / 3 |
| Actionability | Every step has concrete, executable bash commands with specific arguments. The timing.txt format is precisely defined with examples, word budget calculations are explicit with formulas, and the TTS rate-fitting behavior is documented with exact thresholds (+15%, +3% increments). The subagent prompt template is copy-paste ready. | 3 / 3 |
| Workflow Clarity | The 8-step pipeline is clearly sequenced with explicit validation checkpoints: dependency checking (Step 0), dry-run budget verification (Step 4c), user approval gate (Step 4f), LONG/OK/RATE status labels for TTS output (Step 5), and ffprobe verification of final output (Step 7). Error recovery is addressed: failed subagents get re-run, LONG acts get trimmed and re-run. The fade-in conditional is cleanly handled throughout. | 3 / 3 |
| Progressive Disclosure | The skill provides a clear pipeline overview at the top, detailed steps inline (which is appropriate given this is the main workflow), and defers the full command reference, voice options, and installation instructions to references/REFERENCE.md. References are one level deep and clearly signaled. | 3 / 3 |
| Total | | 11 / 12 Passed |
Validation
100%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation: 11 / 11 Passed
Validation for skill structure
No warnings or errors.