Four-skill presentation system: ingest talks into a rhetoric vault, run interactive clarification, generate a speaker profile, then create new presentations that match your documented patterns. Includes an 88-entry Presentation Patterns taxonomy for scoring, brainstorming, and go-live preparation.
Build a rhetoric and style knowledge base by analyzing presentation talks. Each run
processes unprocessed talks, extracts rhetoric/style observations, and updates the
running summary. The vault lives at ~/.claude/rhetoric-knowledge-vault/ (may be a
symlink to a custom location). All paths are relative to this vault root.
| File / Reference | Purpose |
|---|---|
| tracking-database.json | Source of truth — talks, status, config, confirmed intents |
| rhetoric-style-summary.md | Running rhetoric & style narrative |
| slide-design-spec.md | Visual design rules from PDF + PPTX analysis |
| speaker-profile.json | Machine-readable bridge to presentation-creator |
| analyses/{talk_filename}.md | Per-talk rhetoric analysis (one file per processed talk) |
| transcripts/{youtube_id}.txt | Downloaded/cleaned transcripts |
| slides/{id}.pdf | Slide PDFs (from Google Drive, PPTX export, or video extraction) |
| references/schemas-db.md | DB + subagent schemas; extraction output schemas |
| references/rhetoric-dimensions.md | 14 analysis dimensions |
| references/video-slide-extraction.md | Video-to-slides pipeline — layout heuristics, tuning, limitations |
| references/processing-rules.md | Language policy, pattern migration logic, structured field rules |
| scripts/pptx-extraction.py | Extract visual design data from .pptx files |
| scripts/video-slide-extraction.py | Extract slides from video via ffmpeg + perceptual dedup |
| scripts/batch-download-videos.sh | Parallel video download for batch processing |
| scripts/vtt-cleanup.py | Clean VTT subtitles into plain transcript text |
A talk is processable when it has video_url. Slide sources, in order of preference:

1. pptx_path — richest data (exact colors, fonts, shapes via python-pptx)
2. slides_url — download PDF from Google Drive
3. video_url — extract slides from the video using ffmpeg + perceptual dedup
4. none — transcript-only analysis (status processed_partial)

The slide_source field tracks which path was used: "pptx", "pdf", "both",
"video_extracted", or "none". The pptx_catalog array fuzzy-matches .pptx
files to shownotes entries.
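The preference hierarchy can be expressed as a small selection function. A minimal sketch, assuming the DB entry fields named above (pptx_path, slides_url, video_url) and assuming "both" means a talk has both a PPTX and a PDF source; the authoritative rules live in references/schemas-db.md:

```python
def choose_slide_source(talk):
    """Pick the richest available slide source for a talk DB entry.

    Mirrors the documented hierarchy: pptx beats pdf beats video
    extraction; "both" when pptx and pdf are both available.
    """
    has_pptx = bool(talk.get("pptx_path"))
    has_pdf = bool(talk.get("slides_url"))
    if has_pptx and has_pdf:
        return "both"
    if has_pptx:
        return "pptx"
    if has_pdf:
        return "pdf"
    if talk.get("video_url"):
        return "video_extracted"
    return "none"
```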
Vault discovery — the canonical path is always ~/.claude/rhetoric-knowledge-vault/.
If it exists, use it as vault_root and read tracking-database.json. If it does not,
ask the user where to put it (AskUserQuestion), create the directory (and a symlink
if a custom path is chosen), and initialize an empty tracking-database.json with
empty config, talks, and pptx_catalog.

Config bootstrapping — ask once per missing field and persist to the tracking
database. Core fields: talks_source_dir, pptx_source_dir, python_path,
template_skip_patterns. See references/schemas-db.md for the full schema.
Scan for new talks: Glob *.md in talks_source_dir; parse and add any file not
yet in talks[] (title, conference, date, URLs, status "pending"). Extract
video_url, slides_url from frontmatter/links. Parse IDs from URLs:
- youtube_id: extract the v= parameter from YouTube URLs
  (e.g., https://www.youtube.com/watch?v=aBcDeFg → youtube_id: "aBcDeFg")
- google_drive_id: extract the file ID from Google Drive URLs
  (e.g., https://drive.google.com/file/d/1AbCdEfGhIjK/view → google_drive_id: "1AbCdEfGhIjK")

Default status is always "pending" for new entries.
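A sketch of the ID parsing, handling only the URL shapes shown above (other YouTube forms such as youtu.be short links are not covered here):

```python
import re
from urllib.parse import urlparse, parse_qs

def parse_youtube_id(url):
    # Standard watch URLs carry the video ID in the v= query parameter.
    qs = parse_qs(urlparse(url).query)
    return qs["v"][0] if "v" in qs else None

def parse_google_drive_id(url):
    # Drive file URLs look like .../file/d/<id>/view
    m = re.search(r"/file/d/([^/]+)", url)
    return m.group(1) if m else None
```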
Scan for .pptx files: Recursively glob **/*.pptx in pptx_source_dir; fuzzy-match
to talks[] entries. Report counts. See references/schemas-db.md
for the PPTX extraction output schema (per-slide visual data, shape types, global design stats).
Run scripts/pptx-extraction.py for extraction.
Pattern taxonomy migration: See references/processing-rules.md for migration
logic. In brief: talks with status "processed" or "processed_partial" that
lack pattern_observations are marked "needs-reprocessing".
Read rhetoric-style-summary.md and slide-design-spec.md. Report:
"X processed, Y remaining. PPTX: A cataloged, B matched, C extracted."
Select talks with status pending or needs-reprocessing. Determine each talk's
slide_source per the hierarchy above. Mark "skipped_no_sources" only if
video_url is entirely absent — a talk with no video_url is not processable
regardless of whether slides exist.

If $ARGUMENTS specifies a talk filename or title, process ONLY that one.

Per batch: launch 5 subagents in parallel, wait, run Step 4, then the next batch.
Each subagent receives the talk's DB entry and the current rhetoric-style-summary.md.
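The five-at-a-time batching reduces to a plain chunking helper (a sketch; the subagent dispatch itself is orchestration and not shown):

```python
def batches(talks, size=5):
    """Yield consecutive batches of at most `size` talks."""
    for i in range(0, len(talks), size):
        yield talks[i:i + size]
```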
A. Download transcript and acquire slides.

Transcript download:

```shell
yt-dlp --write-auto-sub --sub-lang "en,ru,he,fr,de,es,ja" --skip-download --sub-format vtt \
  -o "{vault_root}/transcripts/{youtube_id}" "https://www.youtube.com/watch?v={youtube_id}"
```

Auto-subtitles are saved with a language suffix (e.g., {youtube_id}.ru.vtt). Record
the detected language as delivery_language. After download, clean the VTT:

```shell
python3 scripts/vtt-cleanup.py "{vault_root}/transcripts/{youtube_id}.{lang}.vtt"
```

This writes transcripts/{youtube_id}.txt. If yt-dlp finds no captions, fall back to
youtube-transcript-api:

```shell
"{python_path}" -c "
from youtube_transcript_api import YouTubeTranscriptApi
transcript = YouTubeTranscriptApi.get_transcript('{youtube_id}', languages=['en','ru','he','fr','de'])
for entry in transcript:
    print(entry['text'])
" > "{vault_root}/transcripts/{youtube_id}.txt"
```

If that also fails, extract the audio track from the downloaded video:

```shell
ffmpeg -i "{vault_root}/slides-rebuild/{youtube_id}/{youtube_id}.mp4" \
  -vn -acodec libmp3lame "{vault_root}/slides-rebuild/{youtube_id}/{youtube_id}.mp3"
```

Transcribe it locally (mlx_whisper.transcribe()) or with OpenAI Whisper, and set
transcript_source: "whisper". Mark the talk processed_partial if audio extraction fails.

Slide acquisition per slide_source:
- pptx/both: run scripts/pptx-extraction.py <path.pptx>.
- pdf: download via gdown:

  ```shell
  "{python_path}" -m gdown "https://drive.google.com/uc?id={google_drive_id}" \
    -O "{vault_root}/slides/{google_drive_id}.pdf"
  ```

- video_extracted: download the video at 720p, then run scripts/video-slide-extraction.py:

  ```shell
  yt-dlp -f "bestvideo[height<=720][ext=mp4]+bestaudio[ext=m4a]/best[height<=720][ext=mp4]/best[height<=720]" \
    --merge-output-format mp4 \
    -o "{vault_root}/slides-rebuild/{youtube_id}/{youtube_id}.mp4" \
    "https://www.youtube.com/watch?v={youtube_id}"
  python3 scripts/video-slide-extraction.py \
    "{vault_root}/slides-rebuild/{youtube_id}/{youtube_id}.mp4" \
    "{vault_root}/slides-rebuild/{youtube_id}" "{youtube_id}"
  ```

  The result is written to slides/{youtube_id}.pdf. Delete the video after extraction.
  For batch downloads, use scripts/batch-download-videos.sh <vault_root> ID1 ID2 ....
- none: transcript-only analysis, status processed_partial.

If the PDF download fails but video_url exists, fall back to video
extraction. A talk can still reach "processed" status this way.

Set transcript_source on the talk entry: youtube_auto (yt-dlp captions),
whisper (local transcription), or manual. This field is required — downstream
tools use it to gauge transcript reliability.
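scripts/vtt-cleanup.py is referenced above but not reproduced here. A minimal sketch of the kind of cleanup it performs, assuming the job is to drop the WEBVTT header, cue numbers, cue timestamps, inline tags, and consecutive duplicate caption lines:

```python
import re

def clean_vtt(vtt_text):
    """Reduce VTT subtitle text to plain transcript lines."""
    timestamp = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3} --> ")
    out, last = [], None
    for line in vtt_text.splitlines():
        line = line.strip()
        # Skip headers, blank lines, cue numbers, and timing lines.
        if not line or line == "WEBVTT" or line.isdigit() or timestamp.match(line):
            continue
        # Strip inline cue tags like <c> or <00:00:01.500>.
        line = re.sub(r"<[^>]+>", "", line)
        if line and line != last:  # auto-subs repeat lines across cues
            out.append(line)
            last = line
    return "\n".join(out)
```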
B. Analyze for Rhetoric & Style (NOT content). Apply all 14 dimensions from
references/rhetoric-dimensions.md (including dimension 14: Areas for Improvement).
Follow language policy and verbatim-quote rules in references/processing-rules.md.
Key rule: all verbatim quotes must be English-first — "English translation" (original text). Never quote non-English text without an English translation preceding it.
B2. Tag Presentation Patterns. Scan observations against the pattern taxonomy
at skills/presentation-creator/references/patterns/_index.md. Skip patterns
marked observable: false. Record confidence (strong/moderate/weak) and evidence per
pattern. Compute per-talk score: count(patterns) − count(antipatterns). Store in
pattern_observations. See references/processing-rules.md for full tagging rules.
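The per-talk score above (count(patterns) − count(antipatterns)) can be sketched as follows, assuming antipattern IDs are identifiable from the taxonomy index (the antipattern_ids set here is a hypothetical input):

```python
def talk_score(pattern_observations, antipattern_ids):
    """score = count(pattern hits) - count(antipattern hits)."""
    patterns = sum(1 for o in pattern_observations
                   if o["pattern_id"] not in antipattern_ids)
    antipatterns = sum(1 for o in pattern_observations
                       if o["pattern_id"] in antipattern_ids)
    return patterns - antipatterns
```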
C. Return JSON per the subagent return schema in references/schemas-db.md. Minimal structure:
{
"talk_id": "...",
"status": "processed",
"transcript_source": "youtube",
"slide_source": "pdf",
"pattern_observations": [
{"pattern_id": "...", "confidence": "strong", "evidence": "..."}
],
"new_patterns": ["..."],
"summary_updates": [{"section": 1, "content": "..."}],
"structured_data": {"delivery_language": "en", "co_presenter": false}
}

Process PPTX files not yet extracted during Step 3: unmatched catalog entries, talks
that used PDF as primary but have a PPTX available, or entries with
pptx_visual_status: "pending". Skip if already "extracted".
Run scripts/pptx-extraction.py <path.pptx> for each file.
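As a sanity check on subagent output, the minimal return structure from step C could be validated with a sketch like this (key set and confidence values taken from the schema excerpt above; the authoritative schema is references/schemas-db.md):

```python
REQUIRED_KEYS = {"talk_id", "status", "transcript_source",
                 "slide_source", "pattern_observations"}

def validate_subagent_result(result):
    """Return a list of problems; an empty list means the minimal shape holds."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - result.keys())]
    for obs in result.get("pattern_observations", []):
        if obs.get("confidence") not in {"strong", "moderate", "weak"}:
            problems.append(f"bad confidence: {obs.get('confidence')}")
    return problems
```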
PPTX matching rules: The .pptx files are in Conference/Year/TalkName.pptx and
shownotes entries have conference and title fields. Fuzzy-match by: normalize
conference names (strip year, "Days", "Conference"), match by date proximity and title
substring. Skip files with "static" in name, conflict copies matching (N).pptx, and
files matching config.template_skip_patterns. Some talks have multiple .pptx files
(one per delivery) — match to the closest date.
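The normalization and fuzzy-match rules could look roughly like this sketch (the 0.6 similarity threshold is an illustrative choice, not a documented value):

```python
import re
from difflib import SequenceMatcher

def normalize_conference(name):
    """Strip years and filler words ("Days", "Conference"), lowercase the rest."""
    name = re.sub(r"\b(19|20)\d{2}\b", "", name)
    name = re.sub(r"\b(days|conference)\b", "", name, flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", name).strip().lower()

def title_matches(pptx_stem, talk_title, threshold=0.6):
    """Substring match first, then a fuzzy similarity ratio."""
    a, b = pptx_stem.lower(), talk_title.lower()
    return a in b or b in a or SequenceMatcher(None, a, b).ratio() >= threshold
```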
After 3+ extractions, populate slide-design-spec.md; after 5+, analyze cross-talk
patterns (colors, fonts, footers).
After each batch:

- Update each talk's DB entry: status, processed_date, and all result fields.
- Persist pattern_observations IDs + score. Populate structured fields
  (co_presenter, delivery_language, etc.) — do not leave structured data buried
  in free-text prose. See references/processing-rules.md for field extraction rules.
- Write {vault_root}/analyses/{talk_filename}.md for each processed talk: all 14
  dimensions, structured data, verbatim examples, and a "Presentation Patterns
  Scoring" section. Create the analyses/ directory if missing.
- Merge new_patterns and summary_updates into rhetoric-style-summary.md.
  Sections 1–14 map to the 14 dimensions; Section 15 aggregates improvement
  areas; Section 16 captures speaker-confirmed intent.
- Recount status from the DB every time — never increment manually.

| Transcript | Slides (PPTX/PDF) | Video | Status | Action |
|---|---|---|---|---|
| OK | OK | — | processed | Full analysis |
| OK | FAIL | OK | processed | Extract slides from video, then full analysis |
| OK | FAIL | FAIL | processed_partial | Transcript only (no visual analysis) |
| FAIL | OK | — | processed_partial | Slides only |
| FAIL | FAIL | OK | processed_partial | Extract slides from video, visual only |
| FAIL | FAIL | FAIL | skipped_download_failed | Skip, move on |
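The recount rule (derive counts from the DB each time rather than incrementing a counter) can be sketched as:

```python
import json
from collections import Counter
from pathlib import Path

def recount_statuses(db_path):
    """Recount talk statuses straight from tracking-database.json."""
    db = json.loads(Path(db_path).read_text())
    # Talks without an explicit status are treated as "pending" here,
    # matching the default for new entries.
    return Counter(t.get("status", "pending") for t in db.get("talks", []))
```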
- Create transcripts/, slides/, and analyses/ directories if missing.
- If speaker-profile.json exists, suggest running the vault-profile skill to
  regenerate it with updated data.
- If video slide extraction captures too many frames, try raising --threshold
  to 14-16, manually specifying slide_region crop coordinates, or accepting the
  bloated PDF and having the analysis subagent sample frames at intervals. Best
  results come from fullscreen slide recordings (Devoxx, JFokus); worst from
  meetup/DevOpsDays audience-camera recordings.
- For transcript_source: "whisper", cross-reference the transcript against
  visible slide text and note quality issues (e.g., transcript_quality: "partial").
- Mark is_baruch_talk: false and set status to skipped if the speaker doesn't match.