Summarize any video by analyzing both audio and visuals. Downloads via yt-dlp, extracts a transcript (YouTube captions or Whisper), pulls scene-detected keyframes, and produces a multimodal summary with clickable timestamped YouTube links. Use this skill whenever the user wants to summarize a YouTube video, digest a talk or tutorial, get notes from a video, extract key points from a recording, or says things like "tl;dw", "summarize this video", "what's in this video", or pastes a YouTube URL and asks for a summary. Also triggers on non-YouTube URLs that yt-dlp supports.
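The "clickable timestamped YouTube links" mentioned above can be sketched as follows. This is a minimal illustration, not the skill's own code: the function names `format_timestamp` and `youtube_timestamp_link` are hypothetical, and the Markdown link format is an assumption about how the summary is rendered. The `t=<seconds>s` query parameter is the standard way to deep-link into a YouTube video.

```python
def format_timestamp(seconds: float) -> str:
    """Render a position in seconds as M:SS, or H:MM:SS past one hour."""
    total = int(seconds)  # drop fractional seconds
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def youtube_timestamp_link(video_id: str, seconds: float) -> str:
    """Build a Markdown link that opens the video at the given position.

    Hypothetical helper: the Markdown output format is an assumption.
    """
    total = int(seconds)
    label = format_timestamp(total)
    return f"[{label}](https://www.youtube.com/watch?v={video_id}&t={total}s)"


if __name__ == "__main__":
    # "VIDEO_ID" is a placeholder, not a real video.
    print(youtube_timestamp_link("VIDEO_ID", 83.7))
```

Truncating to whole seconds keeps the link label and the `t=` value in agreement, so a reader clicking `1:23` lands at second 83.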
Best practices: 94% (Does it follow best practices?)
Impact: Pending. No eval scenarios have been run.
Advisory: Suggest reviewing before use.
Security: 2 findings (2 medium severity). This skill can be installed, but you should review these findings before use.
The skill exposes the agent to untrusted, user-generated content from public third-party sources, creating a risk of indirect prompt injection. This includes browsing arbitrary URLs, reading social media posts or forum comments, and analyzing content from unknown websites.
Third-party content exposure detected (high risk: 1.00). This skill explicitly downloads videos and metadata from public URLs via yt-dlp (SKILL.md Step 1), ingests untrusted YouTube captions or Whisper transcriptions (Step 2), and instructs parallel subagents to read and act on contact-sheet images and transcript segments (references/SUBAGENT-PROMPT.md), so external, user-generated content directly influences agent analysis and subsequent actions.
The skill fetches instructions or code from an external URL at runtime, and the fetched content directly controls the agent’s prompts or executes code. This dynamic dependency allows the external source to modify the agent’s behavior without any changes to the skill itself.
Potentially malicious external URL detected (high risk: 0.70). The skill will auto-install and run the external package whisper-ctranslate2 at runtime (via "uv tool install whisper-ctranslate2" or "pipx install whisper-ctranslate2"), which fetches and executes remote code from the Python package index (e.g. https://pypi.org/project/whisper-ctranslate2) when YouTube captions are unavailable, making it a runtime external dependency that executes remote code.
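The install-on-demand behavior described in this finding can be pictured with the sketch below. It is an illustration under stated assumptions, not the skill's actual script: the function name `whisper_install_command` is hypothetical, and the preference order (uv first, then pipx) is assumed from the order in which the finding lists the two commands. The sketch only selects a command and does not run it, which makes the runtime dependency visible before anything is fetched from the package index.

```python
import shutil


def whisper_install_command(which=shutil.which):
    """Return the install command the skill would need, or None.

    Hypothetical helper. `which` is injectable so the logic can be
    exercised without touching the real PATH.
    """
    # Already installed: nothing would be fetched at runtime.
    if which("whisper-ctranslate2"):
        return None
    # Assumed preference order: uv, then pipx.
    if which("uv"):
        return ["uv", "tool", "install", "whisper-ctranslate2"]
    if which("pipx"):
        return ["pipx", "install", "whisper-ctranslate2"]
    raise RuntimeError("neither uv nor pipx is available to install whisper-ctranslate2")


if __name__ == "__main__":
    cmd = whisper_install_command()
    if cmd is None:
        print("whisper-ctranslate2 is already on PATH")
    else:
        # Printed for review rather than executed.
        print("would run:", " ".join(cmd))
```

Installing `whisper-ctranslate2` ahead of time from a source you have reviewed means the first branch is taken and no package is downloaded while the skill runs.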