Tessl Tile for pypi/youtube-transcript-api@1.2.0

# YouTube Transcript API

A Python API for retrieving YouTube video transcripts and subtitles without requiring browser automation. It supports manually created and automatically generated subtitles, transcript translation, multiple output formats, and proxy configuration for working around IP restrictions.

## Package Information

- **Package Name**: youtube-transcript-api
- **Language**: Python
- **Installation**: `pip install youtube-transcript-api`

## Core Imports

```python
from youtube_transcript_api import YouTubeTranscriptApi
```

For specific functionality:

```python
from youtube_transcript_api import (
    YouTubeTranscriptApi,
    TranscriptList,
    Transcript,
    FetchedTranscript,
    FetchedTranscriptSnippet,
    YouTubeTranscriptApiException
)
```

For formatters:

```python
from youtube_transcript_api.formatters import (
    JSONFormatter,
    TextFormatter,
    SRTFormatter,
    WebVTTFormatter,
    PrettyPrintFormatter,
    FormatterLoader
)
```

For proxy configuration:

```python
from youtube_transcript_api.proxies import (
    GenericProxyConfig,
    WebshareProxyConfig
)
```

## Basic Usage

```python
from youtube_transcript_api import YouTubeTranscriptApi

# Simple transcript fetch
api = YouTubeTranscriptApi()
transcript = api.fetch('video_id')

# Process transcript data
for snippet in transcript:
    print(f"{snippet.start}: {snippet.text}")

# Get list of available transcripts
transcript_list = api.list('video_id')
for t in transcript_list:
    print(f"{t.language_code}: {t.language} ({'generated' if t.is_generated else 'manual'})")

# Fetch specific language with fallback
transcript = transcript_list.find_transcript(['es', 'en'])
fetched = transcript.fetch()

# Translate transcript
translated = transcript.translate('fr')
french_transcript = translated.fetch()
```

## Architecture

The library uses a hierarchical structure for transcript management:

- **YouTubeTranscriptApi**: Main entry point for all operations
- **TranscriptList**: Container for all available transcripts for a video
- **Transcript**: Individual transcript metadata and fetching capabilities
- **FetchedTranscript**: Actual transcript content with timing information
- **FetchedTranscriptSnippet**: Individual text segments with timestamps

This design enables efficient discovery of available transcripts, flexible language selection with fallbacks, and lazy loading of transcript content only when needed.
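The split between metadata and content can be sketched as two small helpers. This is a minimal illustration, not part of the library: `available_languages` and `first_lines` are hypothetical names, and both take an already obtained `TranscriptList` (for example from `api.list('video_id')`) so the sketch stays independent of network access.

```python
def available_languages(transcript_list):
    """Inspect metadata only; no transcript content is fetched here."""
    return [(t.language_code, t.is_generated) for t in transcript_list]


def first_lines(transcript_list, language_codes, n=3):
    """Pick a transcript by language priority, then fetch its content."""
    transcript = transcript_list.find_transcript(language_codes)
    fetched = transcript.fetch()  # content is requested only at this point
    return [snippet.text for snippet in fetched][:n]
```

Keeping discovery and fetching in separate functions mirrors the hierarchy above: callers can list languages cheaply and pay for a content request only for the transcript they actually need.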

## Capabilities

### Core API Functions

Main API class for retrieving transcripts with support for language selection, proxy configuration, and custom HTTP clients.

```python { .api }
class YouTubeTranscriptApi:
    def __init__(self, proxy_config=None, http_client=None): ...
    def fetch(self, video_id, languages=("en",), preserve_formatting=False): ...
    def list(self, video_id): ...
```

[Core API](./core-api.md)
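As a rough sketch of how `fetch` and its `languages` argument might be used, the helper below turns a fetched transcript into timestamped lines. `format_timestamp` and `timestamped_lines` are hypothetical names introduced for illustration; `api` is assumed to be a `YouTubeTranscriptApi` instance, and the language codes are assumed to be tried in the order given.

```python
def format_timestamp(seconds):
    """Render a start offset in seconds as MM:SS, or H:MM:SS past one hour."""
    hours, rest = divmod(int(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes:02d}:{secs:02d}"


def timestamped_lines(api, video_id, languages=("en",)):
    """Fetch a transcript and return one '[MM:SS] text' string per snippet."""
    fetched = api.fetch(video_id, languages=languages)
    return [f"[{format_timestamp(s.start)}] {s.text}" for s in fetched]
```

A call such as `timestamped_lines(YouTubeTranscriptApi(), 'video_id', languages=('de', 'en'))` would then prefer a German transcript and fall back to English.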

### Transcript Data Structures

Data classes for representing transcript lists, individual transcripts, and fetched content with timing information.

```python { .api }
class TranscriptList:
    def find_transcript(self, language_codes): ...
    def find_generated_transcript(self, language_codes): ...
    def find_manually_created_transcript(self, language_codes): ...

class Transcript:
    def fetch(self, preserve_formatting=False): ...
    def translate(self, language_code): ...

class FetchedTranscript:
    def to_raw_data(self): ...
```

[Data Structures](./data-structures.md)
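To illustrate working with the output of `to_raw_data()`, here is a small sketch that groups snippets into paragraphs wherever the speaker pauses. It assumes the raw data is a list of dictionaries with `text`, `start`, and `duration` keys (times in seconds); `group_by_pause` and its `max_gap` parameter are hypothetical and not part of the library.

```python
def group_by_pause(raw_data, max_gap=2.0):
    """Join snippet texts into paragraphs, splitting where the silence
    between one snippet's end and the next one's start exceeds max_gap."""
    paragraphs, current, previous_end = [], [], None
    for item in raw_data:
        if previous_end is not None and item["start"] - previous_end > max_gap:
            paragraphs.append(" ".join(current))
            current = []
        current.append(item["text"])
        previous_end = item["start"] + item["duration"]
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

Typical usage would be `group_by_pause(fetched.to_raw_data())`. Working on plain dictionaries keeps the helper easy to test and to serialize.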

### Output Formatters

Classes for converting transcript data into various output formats including JSON, plain text, SRT subtitles, and WebVTT.

```python { .api }
class JSONFormatter:
    def format_transcript(self, transcript, **kwargs): ...
    def format_transcripts(self, transcripts, **kwargs): ...

class SRTFormatter:
    def format_transcript(self, transcript, **kwargs): ...

class WebVTTFormatter:
    def format_transcript(self, transcript, **kwargs): ...
```

[Formatters](./formatters.md)
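Because the formatters share the `format_transcript(transcript, **kwargs)` method, an export step can be written once and reused with any of them. The helper below is a minimal sketch; `export_transcript` is a hypothetical name, and `stream` is assumed to be any object with a `write` method, such as an open text file.

```python
def export_transcript(formatter, fetched, stream, **kwargs):
    """Format `fetched` with `formatter` and write the result to `stream`.

    Extra keyword arguments are passed through to format_transcript.
    Returns the number of characters written.
    """
    text = formatter.format_transcript(fetched, **kwargs)
    stream.write(text)
    return len(text)
```

For example, `export_transcript(SRTFormatter(), fetched, f)` inside `with open('transcript.srt', 'w', encoding='utf-8') as f:` would write an SRT file. Which keyword arguments a given formatter accepts is documented in the formatter reference.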

### Proxy Configuration

Classes for configuring HTTP proxies to work around IP blocking, including generic proxy support and specialized Webshare residential proxy integration.

```python { .api }
class GenericProxyConfig:
    def __init__(self, http_url=None, https_url=None): ...

class WebshareProxyConfig:
    def __init__(self, proxy_username, proxy_password, **kwargs): ...
```

[Proxy Configuration](./proxy-config.md)
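A possible way to wire a generic proxy into the API is sketched below. The constructor arguments follow the signatures listed above; `proxy_url` and `build_api` are hypothetical helper names, and percent-encoding the credentials is an added precaution of this sketch rather than something the library is documented to require. The library imports are deferred so that `proxy_url` can be used on its own.

```python
from urllib.parse import quote


def proxy_url(user, password, host, port, scheme="http"):
    """Build a proxy URL, percent-encoding credentials that contain
    reserved characters such as '@' or ':'."""
    return f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}@{host}:{port}"


def build_api(http_url=None, https_url=None):
    """Create a YouTubeTranscriptApi that routes requests through a proxy."""
    # Imported here so this module loads even without the package installed.
    from youtube_transcript_api import YouTubeTranscriptApi
    from youtube_transcript_api.proxies import GenericProxyConfig

    config = GenericProxyConfig(http_url=http_url, https_url=https_url)
    return YouTubeTranscriptApi(proxy_config=config)
```

With a placeholder host, usage might look like `build_api(http_url=proxy_url('user', 'secret', 'proxy.example.com', 8080))`.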

### Error Handling

Comprehensive exception hierarchy for handling all error scenarios including video unavailability, IP blocking, missing transcripts, and translation errors.

```python { .api }
class YouTubeTranscriptApiException(Exception): ...
class CouldNotRetrieveTranscript(YouTubeTranscriptApiException): ...
class VideoUnavailable(CouldNotRetrieveTranscript): ...
class TranscriptsDisabled(CouldNotRetrieveTranscript): ...
class NoTranscriptFound(CouldNotRetrieveTranscript): ...
```

[Error Handling](./error-handling.md)
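One way to use this hierarchy is to catch the specific subclasses first and fall back to `CouldNotRetrieveTranscript` last. The sketch below assumes the exception classes can be imported from the package root, as `YouTubeTranscriptApiException` is in the imports above; `try_fetch` and its reason strings are hypothetical. The `except ImportError` branch only defines stand-ins so the sketch runs without the package installed and should be dropped in real code.

```python
try:
    from youtube_transcript_api import (
        CouldNotRetrieveTranscript,
        NoTranscriptFound,
        TranscriptsDisabled,
        VideoUnavailable,
    )
except ImportError:
    # Stand-ins mirroring the hierarchy listed above (illustration only).
    class CouldNotRetrieveTranscript(Exception): ...
    class VideoUnavailable(CouldNotRetrieveTranscript): ...
    class TranscriptsDisabled(CouldNotRetrieveTranscript): ...
    class NoTranscriptFound(CouldNotRetrieveTranscript): ...


def try_fetch(api, video_id, languages=("en",)):
    """Return (transcript, None) on success or (None, reason) on failure."""
    try:
        return api.fetch(video_id, languages=languages), None
    except VideoUnavailable:
        return None, "video unavailable"
    except TranscriptsDisabled:
        return None, "transcripts disabled"
    except NoTranscriptFound:
        return None, "no transcript in requested languages"
    except CouldNotRetrieveTranscript:
        # Remaining retrieval errors, for example IP blocking.
        return None, "could not retrieve transcript"
```

Ordering the handlers from most to least specific matters here, since every listed class is also a `CouldNotRetrieveTranscript`.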
