Fetch transcripts and create structured watching guides / summaries for YouTube videos. Use when the user asks to (1) get, fetch, extract, or download a transcript or captions from a YouTube video URL, or (2) summarize, create a watching guide, or produce a structured summary of a YouTube video. Trigger on youtube.com/watch, youtu.be, or youtube.com/shorts links.
Fetch YouTube video transcripts and produce structured watching guides via CLI tools.
Default approach:
npx defuddle parse <youtube-url> --markdown

Always use --markdown to get the transcript in readable Markdown format.
- `--lang <code>`: request a specific language (BCP 47, e.g. en, ja, zh).
- `-o <file>`: write output to a file instead of stdout.

defuddle output includes a link to the video, the video description, and timestamped transcript segments.

Use this alternative approach when defuddle fails to provide useful results:
- Use the yt-dlp binary in the user's PATH if available.
- If yt-dlp is unavailable but uvx is available, use uvx yt-dlp.
- If neither yt-dlp nor uvx is available, this approach cannot be used.

if command -v yt-dlp >/dev/null 2>&1; then
  yt-dlp --write-auto-subs --sub-langs en --skip-download -o "/tmp/video" "<YOUTUBE_VIDEO_URL>" 2>&1
elif command -v uvx >/dev/null 2>&1; then
  uvx yt-dlp --write-auto-subs --sub-langs en --skip-download -o "/tmp/video" "<YOUTUBE_VIDEO_URL>" 2>&1
fi

This writes subtitle files such as /tmp/video.en.vtt; read the generated file for the transcript.
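Auto-generated .vtt files typically mix the spoken text with cue timings, inline word-timing tags, and repeated caption lines. As a rough sketch, assuming a shell with tr, sed, and awk available, the hypothetical helper `vtt_to_text` below strips that markup and prints the remaining caption lines. It is not part of yt-dlp, and real subtitle files may be laid out differently from what it expects.

```shell
# vtt_to_text is a hypothetical helper name, not a yt-dlp feature.
# $1: path to a WebVTT file such as /tmp/video.en.vtt
vtt_to_text() {
  tr -d '\r' < "$1" |
    sed -e 's/<[^>]*>//g' |
    awk '
      /^WEBVTT/ || /^Kind:/ || /^Language:/ { next }  # file header lines
      /-->/ { next }                                   # cue timing lines
      /^[[:space:]]*$/ { next }                        # blank separators
      # Print a line only if it differs from the previous printed line
      # (compared as strings), which drops consecutive repeats.
      ($0 "") != (prev "") { print; prev = $0 }
    '
}
```

For example, `vtt_to_text /tmp/video.en.vtt > /tmp/video.txt` would write the reduced text to a file. The sketch discards cue timings, so keep the original .vtt file at hand when timestamps are needed for a watching guide.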
A watching guide is a structured summary that breaks a video into segments with summaries and highlighted quotes. Use this workflow when the user asks to summarize a video or create a watching guide.
Save the guide to a file (watching-guide.md, or as the user specifies).

Convert transcript timestamps (MM:SS or HH:MM:SS) to YouTube deep links:
https://www.youtube.com/watch?v=VIDEO_ID&t=TOTAL_SECONDSs

Example: 0:09:57 → &t=597s (9×60 + 57 = 597).
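The conversion above can be sketched as two small shell functions. `ts_to_seconds` and `deep_link` are hypothetical names introduced for illustration, and the sketch assumes the timestamp is well-formed MM:SS or HH:MM:SS. It delegates the arithmetic to awk, which reads zero-padded fields such as `09` as decimal numbers.

```shell
# Hypothetical helpers for illustration; not part of any tool named above.

# Print the total number of seconds for an MM:SS or HH:MM:SS timestamp.
ts_to_seconds() {
  printf '%s\n' "$1" | awk -F: '{
    if (NF == 3) print $1 * 3600 + $2 * 60 + $3   # HH:MM:SS
    else         print $1 * 60 + $2               # MM:SS
  }'
}

# Print a deep link for a video ID ($1) and a timestamp ($2).
deep_link() {
  printf 'https://www.youtube.com/watch?v=%s&t=%ss\n' "$1" "$(ts_to_seconds "$2")"
}

deep_link VIDEO_ID 0:09:57
# prints https://www.youtube.com/watch?v=VIDEO_ID&t=597s
```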
# Watching guide — [Video Title]
*[Brief description, e.g. "Host interviews Guest"] · ~[duration] · [video link]*
---
### `00:00 – 02:00` [Segment Title]
[► 0:00](<video-link>&t=0s)
[2–4 sentence summary of this segment.]
**Highlights**
> *"Verbatim quote from the transcript."*
> [0:01:30](<video-link>&t=90s)
> *"Another notable quote."*
> [0:01:55](<video-link>&t=115s)
---
### `02:00 – 10:30` [Next Segment Title]
[► 2:00](<video-link>&t=120s)
[Summary paragraph.]
**Highlights**
> *"Key quote."*
> [0:05:12](<video-link>&t=312s)