# Caption and Subtitle Support

Caption track extraction and conversion to .srt format, with support for multiple languages and automatically generated subtitles from YouTube videos.

## Capabilities

### Caption Class

Represents an individual caption track with language-specific subtitle data and conversion capabilities.

```python { .api }
class Caption:
    def __init__(self, caption_track: Dict):
        """
        Initialize a Caption object.

        Args:
            caption_track (dict): Caption track metadata dictionary
        """
```

### Caption Properties

Access caption track information and content.

```python { .api }
@property
def url(self) -> str:
    """Get the URL for downloading the caption track."""

@property
def name(self) -> str:
    """Get the human-readable name of the caption track (e.g., 'English', 'Spanish')."""

@property
def code(self) -> str:
    """Get the language code for the caption track (e.g., 'en', 'es', 'fr')."""

@property
def xml_captions(self) -> str:
    """Get the raw XML caption data from YouTube."""

@property
def json_captions(self) -> dict:
    """Get the parsed JSON caption data."""
```

### Caption Conversion

Convert caption data between formats.

```python { .api }
def generate_srt_captions(self) -> str:
    """
    Convert the caption track to SRT (SubRip) format.

    Returns:
        str: Caption content in SRT format with timestamps and text
    """
```

### Caption Download

Download caption files with various format options.

```python { .api }
def download(
    self,
    title: str,
    srt: bool = True,
    output_path: Optional[str] = None,
    filename_prefix: Optional[str] = None
) -> str:
    """
    Download the caption track to a file.

    Args:
        title (str): Base filename for the caption file
        srt (bool): Convert to SRT format (default: True)
        output_path (str, optional): Directory to save the file
        filename_prefix (str, optional): Prefix to add to filename

    Returns:
        str: Path to the downloaded caption file
    """
```
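
To make the role of each `download` argument more concrete, the sketch below composes an output path from the same inputs. The helper `build_caption_path` is hypothetical, and its naming scheme (prefix, then title, then language code, then extension) is an assumption for illustration only; the library's actual filename rules are not specified here and may differ.

```python
import os
from typing import Optional


def build_caption_path(
    title: str,
    code: str,
    srt: bool = True,
    output_path: Optional[str] = None,
    filename_prefix: Optional[str] = None,
) -> str:
    """Hypothetical helper: compose a caption file path from download-style arguments."""
    # Assumed scheme: "<prefix><title> (<code>).<ext>"; the real library may differ.
    filename = f"{filename_prefix or ''}{title} ({code})"
    extension = ".srt" if srt else ".xml"
    # Fall back to the current working directory when no directory is given.
    directory = output_path or os.getcwd()
    return os.path.join(directory, filename + extension)


print(build_caption_path("My Video", "en", output_path="captions", filename_prefix="en_"))
```

Under these assumptions, `srt=False` only changes the extension, which mirrors the idea that the raw track is saved without conversion.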

### Static Caption Utilities

Utility methods for caption format conversion.

```python { .api }
@staticmethod
def float_to_srt_time_format(d: float) -> str:
    """
    Convert a float timestamp to SRT time format.

    Args:
        d (float): Time in seconds as a float

    Returns:
        str: Time in SRT format (HH:MM:SS,mmm)
    """

@staticmethod
def xml_caption_to_srt(xml_captions: str) -> str:
    """
    Convert XML caption data to SRT format.

    Args:
        xml_captions (str): Raw XML caption content

    Returns:
        str: Caption content converted to SRT format
    """
```
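
The two conversions described above can be sketched independently of the library. This is a minimal illustration, not the library's implementation: `seconds_to_srt_time` and `simple_xml_to_srt` are hypothetical names, and the XML layout (`<text start="..." dur="...">` elements) is an assumed, simplified caption format.

```python
import xml.etree.ElementTree as ET
from html import unescape


def seconds_to_srt_time(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm (hypothetical stand-in)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def simple_xml_to_srt(xml_text: str) -> str:
    """Convert assumed <text start=".." dur=".."> elements into SRT blocks."""
    root = ET.fromstring(xml_text)
    blocks = []
    for index, node in enumerate(root.iter("text"), start=1):
        start = float(node.attrib["start"])
        duration = float(node.attrib.get("dur", 0.0))
        # Caption text may carry HTML entities; decode them and flatten newlines.
        text = unescape(node.text or "").replace("\n", " ").strip()
        blocks.append(
            f"{index}\n"
            f"{seconds_to_srt_time(start)} --> {seconds_to_srt_time(start + duration)}\n"
            f"{text}\n"
        )
    # SRT separates cues with a blank line.
    return "\n".join(blocks).strip()


sample = (
    '<transcript>'
    '<text start="0.5" dur="2.25">It&amp;#39;s here</text>'
    '<text start="3" dur="1.5">Second line</text>'
    '</transcript>'
)
print(simple_xml_to_srt(sample))
```

Working in integer milliseconds before splitting into hours, minutes, and seconds avoids floating-point drift in the formatted timestamp.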

### CaptionQuery Class

Query interface for caption collections providing dictionary-like access to caption tracks by language code.

```python { .api }
class CaptionQuery:
    def __init__(self, captions: List[Caption]):
        """
        Initialize CaptionQuery with a list of caption tracks.

        Args:
            captions (List[Caption]): List of available caption tracks
        """
```
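
As a rough illustration of what "dictionary-like access by language code" can look like, here is a minimal sketch built on a plain dictionary. `FakeCaption` and `CaptionIndex` are hypothetical stand-ins, not the library's classes, and the internal layout is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass
class FakeCaption:
    """Hypothetical stand-in for a caption track (not the library's Caption)."""
    name: str
    code: str


class CaptionIndex:
    """Hypothetical collection offering lookup by language code."""

    def __init__(self, captions: List[FakeCaption]):
        # Index tracks by language code; later duplicates overwrite earlier ones.
        self._by_code: Dict[str, FakeCaption] = {c.code: c for c in captions}

    def __getitem__(self, lang_code: str) -> FakeCaption:
        return self._by_code[lang_code]  # raises KeyError for unknown codes

    def __contains__(self, lang_code: object) -> bool:
        return lang_code in self._by_code

    def __len__(self) -> int:
        return len(self._by_code)

    def __iter__(self) -> Iterator[FakeCaption]:
        # Iterates over caption objects rather than keys.
        return iter(self._by_code.values())


index = CaptionIndex([FakeCaption("English", "en"), FakeCaption("Spanish", "es")])
print(len(index), "en" in index, index["es"].name)
```

Because iteration here yields caption objects instead of keys, the explicit `__contains__` is what keeps `'en' in index` a language-code check; without it, Python would fall back to comparing the code against each caption object.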

### Caption Access

Access caption tracks by language code and iterate through available captions.

```python { .api }
def __getitem__(self, lang_code: str) -> Caption:
    """
    Get caption track by language code.

    Args:
        lang_code (str): Language code (e.g., 'en', 'es', 'fr')

    Returns:
        Caption: Caption track for the specified language

    Raises:
        KeyError: If language code is not found
    """

def __len__(self) -> int:
    """
    Get the number of available caption tracks.

    Returns:
        int: Number of caption tracks
    """

def __iter__(self) -> Iterator[Caption]:
    """
    Iterate through all available caption tracks.

    Returns:
        Iterator[Caption]: Iterator over caption tracks
    """
```

### Deprecated Methods

Legacy methods maintained for backward compatibility.

```python { .api }
def get_by_language_code(self, lang_code: str) -> Optional[Caption]:
    """
    Get caption track by language code.

    **DEPRECATED**: Use dictionary-style access with captions[lang_code] instead.

    Args:
        lang_code (str): Language code (e.g., 'en', 'es')

    Returns:
        Caption or None: Caption track for the specified language
    """

def all(self) -> List[Caption]:
    """
    Get all the results represented by this query as a list.

    **DEPRECATED**: CaptionQuery can be treated as a dictionary/iterable directly.

    Returns:
        List[Caption]: All caption tracks
    """
```

## Usage Examples

### Basic Caption Download

```python
from pytube import YouTube

# Get video with captions
yt = YouTube('https://www.youtube.com/watch?v=9bZkp7q19f0')

# Check available caption tracks
print("Available captions:")
for caption in yt.captions:
    print(f"- {caption.name} ({caption.code})")

# Download English captions
if 'en' in yt.captions:
    caption = yt.captions['en']
    caption.download(title=yt.title)
    print(f"Downloaded captions: {caption.name}")
```

### SRT Format Conversion

```python
from pytube import YouTube

yt = YouTube('https://www.youtube.com/watch?v=9bZkp7q19f0')

# Get English captions and convert to SRT
if 'en' in yt.captions:
    caption = yt.captions['en']

    # Generate SRT content
    srt_content = caption.generate_srt_captions()

    # Save to custom file
    with open('custom_captions.srt', 'w', encoding='utf-8') as f:
        f.write(srt_content)

    print("SRT file created: custom_captions.srt")
```

### Multiple Language Downloads

```python
from pytube import YouTube
import os

yt = YouTube('https://www.youtube.com/watch?v=9bZkp7q19f0')

# Create captions directory
captions_dir = "captions"
os.makedirs(captions_dir, exist_ok=True)

# Download all available caption tracks
for caption in yt.captions:
    try:
        file_path = caption.download(
            title=yt.title,
            output_path=captions_dir,
            filename_prefix=f"{caption.code}_"
        )
        print(f"Downloaded {caption.name}: {file_path}")
    except Exception as e:
        print(f"Failed to download {caption.name}: {e}")
```

### Caption Content Analysis

```python
from pytube import YouTube

yt = YouTube('https://www.youtube.com/watch?v=9bZkp7q19f0')

if 'en' in yt.captions:
    caption = yt.captions['en']

    # Get raw caption data
    xml_data = caption.xml_captions
    json_data = caption.json_captions

    print(f"XML data length: {len(xml_data)} characters")
    print(f"JSON entries: {len(json_data.get('events', []))}")

    # Convert to SRT and analyze
    srt_content = caption.generate_srt_captions()
    srt_lines = srt_content.split('\n')
    subtitle_count = srt_content.count('\n\n') + 1

    print(f"SRT content: {len(srt_lines)} lines")
    print(f"Number of subtitles: {subtitle_count}")
```

### Custom SRT Processing

```python
from pytube import YouTube
import re

yt = YouTube('https://www.youtube.com/watch?v=9bZkp7q19f0')

if 'en' in yt.captions:
    caption = yt.captions['en']
    srt_content = caption.generate_srt_captions()

    # Extract all subtitle text (remove timestamps and numbering)
    subtitle_pattern = r'\d+\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\n(.+?)(?=\n\n|\n\d+\n|\Z)'
    matches = re.findall(subtitle_pattern, srt_content, re.DOTALL)

    all_text = ' '.join(match.replace('\n', ' ') for match in matches)
    print(f"Full transcript: {all_text[:200]}...")
```
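
When the timestamps are needed as well as the text, splitting the SRT into blocks can be easier to follow than a single regular expression. The following is a minimal sketch that works on any SRT string; `Cue` and `parse_srt` are hypothetical names introduced for illustration, and the sample text is made up.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    index: int
    start: str
    end: str
    text: str


TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(srt_text: str) -> List[Cue]:
    """Split SRT text into cues; blocks that do not look like cues are skipped."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # need an index line, a timing line, and at least one text line
        match = TIMING.fullmatch(lines[1].strip())
        if not lines[0].strip().isdigit() or match is None:
            continue
        text = " ".join(line.strip() for line in lines[2:])
        cues.append(Cue(int(lines[0]), match.group(1), match.group(2), text))
    return cues


sample_srt = (
    "1\n00:00:00,500 --> 00:00:02,750\nHello\nworld\n\n"
    "2\n00:00:03,000 --> 00:00:04,500\nSecond line"
)
for cue in parse_srt(sample_srt):
    print(cue.index, cue.start, cue.end, cue.text)
```

A transcript can then be built with `' '.join(cue.text for cue in parse_srt(srt_content))`, where `srt_content` is the string from the example above.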

### Error Handling

```python
from pytube import YouTube

yt = YouTube('https://www.youtube.com/watch?v=9bZkp7q19f0')

# Check if captions are available
if len(yt.captions) == 0:
    print("No captions available for this video")
else:
    print(f"Found {len(yt.captions)} caption tracks")

    # Try to get specific language with fallback
    preferred_languages = ['en', 'en-US', 'en-GB']

    selected_caption = None
    for lang in preferred_languages:
        if lang in yt.captions:
            selected_caption = yt.captions[lang]
            break

    if selected_caption:
        try:
            selected_caption.download(title=yt.title)
            print(f"Downloaded captions: {selected_caption.name}")
        except Exception as e:
            print(f"Download failed: {e}")
    else:
        # Fall back to first available caption
        first_caption = next(iter(yt.captions))
        print(f"Using fallback caption: {first_caption.name}")
        first_caption.download(title=yt.title)
```
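
The fallback logic above can also be factored into a small pure function that operates on language codes only, which makes it testable without network access. `pick_language_code` is a hypothetical helper, and treating `'en-GB'` as a match for a preferred `'en'` is an assumption about how regional codes are written.

```python
from typing import Iterable, Optional


def pick_language_code(available: Iterable[str], preferred: Iterable[str]) -> Optional[str]:
    """Return the best available code for an ordered list of preferences."""
    codes = list(available)
    for wanted in preferred:
        if wanted in codes:  # an exact match wins
            return wanted
        for code in codes:  # otherwise accept a regional variant, e.g. 'en' -> 'en-GB'
            if code.startswith(wanted + "-"):
                return code
    # Nothing preferred is available: fall back to the first track, if any.
    return codes[0] if codes else None


print(pick_language_code(["es", "en-GB"], ["en"]))
print(pick_language_code(["fr", "de"], ["en"]))
print(pick_language_code([], ["en"]))
```

With a caption collection, the available codes could be gathered as `[c.code for c in yt.captions]` and the result used for dictionary-style lookup.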

### Time-based Caption Extraction

```python
from pytube import YouTube

def extract_captions_for_timerange(caption, start_seconds, end_seconds):
    """Extract captions for a specific time range."""
    json_data = caption.json_captions
    events = json_data.get('events', [])

    selected_captions = []
    for event in events:
        if 'tStartMs' in event and 'dDurationMs' in event:
            start_ms = event['tStartMs']
            duration_ms = event['dDurationMs']
            start_time = start_ms / 1000
            end_time = (start_ms + duration_ms) / 1000

            # Check if this caption overlaps with our time range
            if start_time < end_seconds and end_time > start_seconds:
                if 'segs' in event:
                    text = ''.join(seg.get('utf8', '') for seg in event['segs'])
                    selected_captions.append({
                        'start': start_time,
                        'end': end_time,
                        'text': text.strip()
                    })

    return selected_captions

# Usage
yt = YouTube('https://www.youtube.com/watch?v=9bZkp7q19f0')
if 'en' in yt.captions:
    caption = yt.captions['en']

    # Get captions for first 60 seconds
    timerange_captions = extract_captions_for_timerange(caption, 0, 60)

    for cap in timerange_captions:
        print(f"{cap['start']:.1f}s - {cap['end']:.1f}s: {cap['text']}")
```
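
The overlap test in the function above treats each caption and the requested range as half-open intervals: a caption that merely touches the boundary is excluded, while one that straddles it is kept. The sketch below isolates that check and runs it on made-up events; `overlaps`, `filter_events`, and the sample data are hypothetical, with the event keys taken from the example above.

```python
from typing import Any, Dict, List


def overlaps(start: float, end: float, range_start: float, range_end: float) -> bool:
    """True when [start, end) and [range_start, range_end) share any time."""
    return start < range_end and end > range_start


def filter_events(
    events: List[Dict[str, Any]], range_start: float, range_end: float
) -> List[Dict[str, Any]]:
    """Keep events whose time span (in milliseconds) overlaps the range (in seconds)."""
    selected = []
    for event in events:
        start_ms = event.get("tStartMs")
        duration_ms = event.get("dDurationMs")
        if start_ms is None or duration_ms is None:
            continue  # events without timing cannot be placed on the timeline
        if overlaps(start_ms / 1000, (start_ms + duration_ms) / 1000, range_start, range_end):
            selected.append(event)
    return selected


sample_events = [
    {"id": "inside", "tStartMs": 0, "dDurationMs": 2000},
    {"id": "straddles", "tStartMs": 58000, "dDurationMs": 4000},
    {"id": "touches", "tStartMs": 60000, "dDurationMs": 1000},
    {"id": "untimed", "tStartMs": 5000},
]
print([event["id"] for event in filter_events(sample_events, 0, 60)])
```

If boundary-touching captions should be included instead, the strict comparisons can be relaxed to `<=` and `>=`.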

## Types

```python { .api }
from typing import Any, Dict, Iterator, List, Optional

# Caption track metadata structure
CaptionTrackDict = Dict[str, Any]

# JSON caption event structure
CaptionEvent = Dict[str, Any]
```