# Video Downloads and Metadata

Core functionality for downloading individual YouTube videos and extracting comprehensive metadata including titles, descriptions, view counts, thumbnails, and publication information.

## Capabilities

### YouTube Class

Primary interface for single video operations providing access to video metadata, stream collections, and download capabilities.

```python { .api }
class YouTube:
    def __init__(
        self,
        url: str,
        on_progress_callback: Optional[Callable[[Any, bytes, int], None]] = None,
        on_complete_callback: Optional[Callable[[Any, Optional[str]], None]] = None,
        proxies: Dict[str, str] = None,
        use_oauth: bool = False,
        allow_oauth_cache: bool = True
    ):
        """
        Construct a YouTube object for a single video.

        Args:
            url (str): A valid YouTube watch URL
            on_progress_callback (callable, optional): User defined callback function for stream download progress events
            on_complete_callback (callable, optional): User defined callback function for stream download complete events
            proxies (dict, optional): A dict mapping protocol to proxy address which will be used by pytube
            use_oauth (bool, optional): Prompt the user to authenticate to YouTube
            allow_oauth_cache (bool, optional): Cache OAuth tokens locally on the machine
        """
```

### Video Identification

Access video identification and URL properties.

```python { .api }
@property
def video_id(self) -> str:
    """Get the video ID extracted from the URL."""

@property
def watch_url(self) -> str:
    """Get the full YouTube watch URL."""

@property
def embed_url(self) -> str:
    """Get the YouTube embed URL."""

@property
def thumbnail_url(self) -> str:
    """Get the video thumbnail URL."""
```
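
How these properties relate to one another can be sketched without touching the network. The helper names (`extract_video_id`, `build_watch_url`, `build_embed_url`), the regular expression, and the URL templates below are illustrative assumptions, not pytube's internal implementation. The sketch assumes a video ID is an 11-character token of letters, digits, `-`, and `_`.

```python
import re

# Assumption: a video ID is exactly 11 characters from [0-9A-Za-z_-],
# introduced by "v=" or a path separator.
_VIDEO_ID_RE = re.compile(r"(?:v=|/)([0-9A-Za-z_-]{11})(?![0-9A-Za-z_-])")


def extract_video_id(url: str) -> str:
    """Return the 11-character video ID found in a watch, short, or embed URL."""
    match = _VIDEO_ID_RE.search(url)
    if match is None:
        raise ValueError(f"no video ID found in {url!r}")
    return match.group(1)


def build_watch_url(video_id: str) -> str:
    """Build a watch URL from a video ID (hypothetical URL template)."""
    return f"https://www.youtube.com/watch?v={video_id}"


def build_embed_url(video_id: str) -> str:
    """Build an embed URL from a video ID (hypothetical URL template)."""
    return f"https://www.youtube.com/embed/{video_id}"
```

With a real `YouTube` object, `yt.video_id`, `yt.watch_url`, and `yt.embed_url` provide these values directly, so helpers like these are only useful for validating input before constructing the object.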

### Video Metadata

Extract comprehensive video metadata including title, description, author, and publication details.

```python { .api }
@property
def title(self) -> str:
    """Get the video title."""

@title.setter
def title(self, value: str):
    """Set the video title."""

@property
def description(self) -> str:
    """Get the video description."""

@property
def author(self) -> str:
    """Get the video author/channel name."""

@author.setter
def author(self, value: str):
    """Set the video author."""

@property
def publish_date(self) -> datetime:
    """Get the video publish date."""

@publish_date.setter
def publish_date(self, value: datetime):
    """Set the video publish date."""

@property
def keywords(self) -> List[str]:
    """Get the video keywords/tags."""
```

### Video Statistics

Access view counts, ratings, and duration information.

```python { .api }
@property
def views(self) -> int:
    """Get the number of times the video has been viewed."""

@property
def rating(self) -> float:
    """Get the video average rating."""

@property
def length(self) -> int:
    """Get the video length in seconds."""
```
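
Because `length` is reported in seconds, display code usually converts it to a clock-style string. The following is a minimal sketch of that conversion; `format_duration` is a hypothetical helper, not part of pytube.

```python
def format_duration(total_seconds: int) -> str:
    """Format a duration in seconds as M:SS, or H:MM:SS when an hour or longer."""
    if total_seconds < 0:
        raise ValueError("duration must be non-negative")
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes}:{seconds:02d}"
```

A typical call would be `format_duration(yt.length)`, paired with `f"{yt.views:,}"` for a thousands-separated view count.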

### Channel Information

Extract information about the video's channel.

```python { .api }
@property
def channel_id(self) -> str:
    """Get the video poster's channel ID."""

@property
def channel_url(self) -> str:
    """Get the channel URL for the video's poster."""
```

### Stream and Caption Access

Access downloadable streams and caption tracks.

```python { .api }
@property
def streams(self) -> StreamQuery:
    """Interface to query both adaptive (DASH) and progressive streams."""

@property
def captions(self) -> CaptionQuery:
    """Interface to query caption tracks."""

@property
def caption_tracks(self) -> List[Caption]:
    """Get a list of Caption objects."""
```

### Availability Checking

Verify video availability and handle various restriction scenarios.

```python { .api }
def check_availability(self) -> None:
    """
    Check whether the video is available.

    Raises different exceptions based on why the video is unavailable,
    otherwise does nothing.
    """

@property
def age_restricted(self) -> bool:
    """Check if the video is age restricted."""

def bypass_age_gate(self) -> None:
    """Attempt to update the vid_info by bypassing the age gate."""
```
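
Since `check_availability()` signals problems by raising rather than returning a value, callers typically wrap it in a `try`/`except`. The sketch below shows one way to turn that into a status string. `availability_status` and its `unavailable_errors` parameter are hypothetical names introduced for illustration; the function accepts any object with a `check_availability()` method, so it can be exercised without network access.

```python
from typing import Any, Tuple, Type


def availability_status(
    video: Any,
    unavailable_errors: Tuple[Type[BaseException], ...] = (Exception,),
) -> str:
    """Return "available" or a short reason derived from the raised exception."""
    try:
        # Does nothing when the video is available, raises otherwise.
        video.check_availability()
    except unavailable_errors as exc:
        return f"unavailable: {type(exc).__name__}"
    return "available"
```

In real use, passing a narrower tuple such as `(pytube.exceptions.VideoUnavailable,)` instead of the catch-all default keeps unrelated errors (for example, network failures) from being reported as availability problems.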

### Callback Management

Manage download progress and completion callbacks.

```python { .api }
def register_on_progress_callback(self, func: Callable[[Any, bytes, int], None]) -> None:
    """
    Register a download progress callback function post initialization.

    Args:
        func (callable): A callback function that takes stream, chunk, and bytes_remaining as parameters
    """

def register_on_complete_callback(self, func: Callable[[Any, Optional[str]], None]) -> None:
    """
    Register a download complete callback function post initialization.

    Args:
        func (callable): A callback function that takes stream and file_path
    """
```
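
A progress callback receives `(stream, chunk, bytes_remaining)`, so any callable with that shape can be registered after construction. The following sketch builds such a callable with a closure that records percentages into a list. `make_progress_recorder` is a hypothetical helper, and it assumes the stream object exposes a `filesize` attribute, as the usage examples in this document do.

```python
from typing import Any, Callable, List


def make_progress_recorder(history: List[float]) -> Callable[[Any, bytes, int], None]:
    """Return a progress callback that appends the completed percentage to history."""

    def on_progress(stream: Any, chunk: bytes, bytes_remaining: int) -> None:
        total = stream.filesize
        if not total:
            # Unknown or zero size: skip to avoid dividing by zero.
            return
        done = total - bytes_remaining
        history.append(round(done / total * 100, 1))

    return on_progress
```

The returned function could then be attached with `yt.register_on_progress_callback(make_progress_recorder(history))` before calling `stream.download()`.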

### Static Methods

Create YouTube objects from video IDs.

```python { .api }
@staticmethod
def from_id(video_id: str) -> YouTube:
    """
    Construct a YouTube object from a video ID.

    Args:
        video_id (str): The video ID of the YouTube video

    Returns:
        YouTube: YouTube object for the specified video
    """
```

### Extended Metadata

Access additional video metadata and technical information.

```python { .api }
@property
def metadata(self) -> Optional[YouTubeMetadata]:
    """Get the metadata for the video."""

@property
def watch_html(self) -> str:
    """Get the HTML content of the watch page."""

@property
def vid_info(self) -> Dict:
    """Parse the raw vid info and return the parsed result."""

@property
def initial_data(self) -> Dict:
    """Get the initial data from the watch page."""

@property
def embed_html(self) -> str:
    """Get the HTML content of the embed page."""

@property
def js_url(self) -> str:
    """Get the URL to the JavaScript file containing signature decryption functions."""

@property
def js(self) -> str:
    """Get the JavaScript content for signature decryption."""

@property
def streaming_data(self) -> Dict:
    """Get the streamingData from video info, bypassing age gate if necessary."""

@property
def fmt_streams(self) -> List[Stream]:
    """Get the list of initialized Stream objects if they have been processed."""
```

## Usage Examples

### Basic Video Download

```python
from pytube import YouTube

# Create YouTube object
yt = YouTube('https://www.youtube.com/watch?v=9bZkp7q19f0')

# Get basic information
print(f"Title: {yt.title}")
print(f"Author: {yt.author}")
print(f"Duration: {yt.length} seconds")
print(f"Views: {yt.views:,}")

# Download the highest quality progressive stream
stream = yt.streams.get_highest_resolution()
stream.download()
```

### Progress Tracking

```python
from pytube import YouTube

def progress_callback(stream, chunk, bytes_remaining):
    total_size = stream.filesize
    bytes_downloaded = total_size - bytes_remaining
    percentage = bytes_downloaded / total_size * 100
    print(f"Download progress: {percentage:.1f}%")

def complete_callback(stream, file_path):
    print(f"Download completed! File saved to: {file_path}")

# Create YouTube object with callbacks
yt = YouTube(
    'https://www.youtube.com/watch?v=9bZkp7q19f0',
    on_progress_callback=progress_callback,
    on_complete_callback=complete_callback
)

# Download with progress tracking
stream = yt.streams.get_highest_resolution()
stream.download()
```

### OAuth Authentication

```python
from pytube import YouTube

# Use OAuth for accessing private or restricted content
yt = YouTube(
    'https://www.youtube.com/watch?v=PRIVATE_VIDEO_ID',
    use_oauth=True,
    allow_oauth_cache=True
)

# Download after authentication
stream = yt.streams.get_highest_resolution()
stream.download()
```

### Proxy Configuration

```python
from pytube import YouTube

# Configure proxy settings
proxies = {
    'http': 'http://proxy.example.com:8080',
    'https': 'https://proxy.example.com:8080'
}

yt = YouTube('https://www.youtube.com/watch?v=9bZkp7q19f0', proxies=proxies)
stream = yt.streams.get_highest_resolution()
stream.download()
```

## Types

```python { .api }
class YouTubeMetadata:
    """
    Extended metadata information for YouTube videos.

    Parses and organizes structured metadata from YouTube video pages,
    including video categories, tags, descriptions, and other detailed information.
    """

    def __init__(self, metadata: List):
        """
        Initialize YouTubeMetadata with raw metadata list.

        Args:
            metadata (List): Raw metadata list from YouTube page
        """

    def __getitem__(self, key: int) -> Dict:
        """
        Get metadata section by index.

        Args:
            key (int): Index of metadata section

        Returns:
            Dict: Metadata section dictionary
        """

    def __iter__(self) -> Iterator[Dict]:
        """
        Iterate through metadata sections.

        Returns:
            Iterator[Dict]: Iterator over metadata section dictionaries
        """

    def __str__(self) -> str:
        """
        Get JSON string representation of metadata.

        Returns:
            str: JSON formatted metadata string
        """

    @property
    def raw_metadata(self) -> Optional[List]:
        """Get the raw unprocessed metadata list."""

    @property
    def metadata(self) -> List[Dict]:
        """Get the processed metadata as a list of dictionaries."""

# Callback types
ProgressCallback = Callable[[Any, bytes, int], None]
CompleteCallback = Callable[[Any, Optional[str]], None]
```
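
Because `YouTubeMetadata` iterates over section dictionaries, its sections can be folded into a single lookup table. The sketch below does this for any iterable of dictionaries. `merge_metadata_sections` is a hypothetical helper, and the keys in the usage note are invented for illustration; the actual keys depend on what the video page provides.

```python
from typing import Any, Dict, Iterable


def merge_metadata_sections(sections: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge metadata sections into one dict, keeping the first value seen per key."""
    merged: Dict[str, Any] = {}
    for section in sections:
        for key, value in section.items():
            # setdefault keeps the earliest section's value on key collisions.
            merged.setdefault(key, value)
    return merged
```

For example, `merge_metadata_sections([{"Song": "A"}, {"Song": "B", "Album": "C"}])` keeps `"Song": "A"` from the first section and adds `"Album": "C"` from the second. With a real video, the argument would be `yt.metadata` after checking that it is not `None`.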