Tessl Tile for pypi/youtube-transcript-api@1.2.0

# Output Formatters

Classes for converting transcript data into various output formats. Supports JSON, plain text, SRT subtitles, WebVTT, and pretty-printed formats for different use cases.

## Capabilities

### Base Formatter Class

Abstract base class defining the formatter interface. All concrete formatters inherit from this class.

```python { .api }
class Formatter:
    def format_transcript(self, transcript, **kwargs):
        """
        Format a single transcript.

        Args:
            transcript (FetchedTranscript): Transcript to format
            **kwargs: Formatter-specific options

        Returns:
            str: Formatted transcript string

        Raises:
            NotImplementedError: Must be implemented by subclasses
        """

    def format_transcripts(self, transcripts, **kwargs):
        """
        Format multiple transcripts.

        Args:
            transcripts (List[FetchedTranscript]): Transcripts to format
            **kwargs: Formatter-specific options

        Returns:
            str: Formatted transcripts string

        Raises:
            NotImplementedError: Must be implemented by subclasses
        """
```
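
Because the interface consists of just these two methods, a custom output format can in principle be added by subclassing. The sketch below is illustrative only: `CSVFormatter`, `Snippet`, and the local `Formatter` stand-in are hypothetical names defined here so the block runs without the library installed, and it assumes each transcript line exposes `text`, `start`, and `duration` attributes. With the library available, one would presumably subclass `youtube_transcript_api.formatters.Formatter` instead of the stand-in.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Snippet:
    """Hypothetical stand-in for one transcript line (assumed attributes)."""
    text: str
    start: float
    duration: float


class Formatter:
    """Local stand-in mirroring the documented two-method interface."""

    def format_transcript(self, transcript, **kwargs):
        raise NotImplementedError

    def format_transcripts(self, transcripts, **kwargs):
        raise NotImplementedError


class CSVFormatter(Formatter):
    """Hypothetical formatter that emits one CSV row per transcript line."""

    def format_transcript(self, transcript: Iterable[Snippet], **kwargs) -> str:
        buffer = io.StringIO()
        # Fix the line terminator so the output is identical on every platform
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(["start", "duration", "text"])
        for snippet in transcript:
            writer.writerow([snippet.start, snippet.duration, snippet.text])
        return buffer.getvalue()

    def format_transcripts(self, transcripts: List[Iterable[Snippet]], **kwargs) -> str:
        # Each CSV chunk already ends in a newline, so this leaves one blank line between them
        return "\n".join(self.format_transcript(t, **kwargs) for t in transcripts)


demo = [Snippet("Hello, world", 0.0, 1.5), Snippet("Second line", 1.5, 2.0)]
csv_output = CSVFormatter().format_transcript(demo)
print(csv_output)
```

Text containing commas is quoted by the `csv` module, so the rows remain parseable.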

### JSON Formatter

Converts transcript data to JSON format for programmatic processing and data interchange.

```python { .api }
class JSONFormatter(Formatter):
    def format_transcript(self, transcript, **kwargs):
        """
        Convert transcript to JSON string.

        Args:
            transcript (FetchedTranscript): Transcript to format
            **kwargs: Passed to json.dumps() (indent, ensure_ascii, etc.)

        Returns:
            str: JSON representation of transcript data
        """

    def format_transcripts(self, transcripts, **kwargs):
        """
        Convert multiple transcripts to JSON array string.

        Args:
            transcripts (List[FetchedTranscript]): Transcripts to format
            **kwargs: Passed to json.dumps()

        Returns:
            str: JSON array of transcript data
        """
```
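
Since the keyword arguments are forwarded to `json.dumps()`, their effect can be previewed on plain data. The sketch assumes the serialized transcript is a list of dicts with `text`, `start`, and `duration` keys; that shape and the sample values are assumptions for illustration, not something this page specifies.

```python
import json

# Assumed shape of the serialized transcript data (illustrative values)
raw = [
    {"text": "Café opens", "start": 0.0, "duration": 1.5},
    {"text": "at nine", "start": 1.5, "duration": 2.0},
]

# No whitespace between tokens: smallest output
compact = json.dumps(raw, separators=(",", ":"))

# Indented, with non-ASCII characters written as-is
readable = json.dumps(raw, indent=2, ensure_ascii=False)

# Default settings escape non-ASCII characters as \uXXXX sequences
escaped = json.dumps(raw)

print(compact)
print(readable)
```

All three strings decode to the same data; the options change only the textual representation.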

### Text Formatter

Converts transcripts to plain text with no timestamps. Useful for text analysis and content extraction.

```python { .api }
class TextFormatter(Formatter):
    def format_transcript(self, transcript, **kwargs):
        """
        Convert transcript to plain text (no timestamps).

        Args:
            transcript (FetchedTranscript): Transcript to format
            **kwargs: Unused

        Returns:
            str: Plain text with lines separated by newlines
        """

    def format_transcripts(self, transcripts, **kwargs):
        """
        Convert multiple transcripts to plain text.

        Args:
            transcripts (List[FetchedTranscript]): Transcripts to format
            **kwargs: Unused

        Returns:
            str: Plain text with transcripts separated by triple newlines
        """
```
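
The documented separators can be illustrated on plain strings. This is a sketch of the described output layout only (a newline between lines, a triple newline between transcripts), not the library's code; `join_lines` and `join_transcripts` are hypothetical helpers.

```python
from typing import List


def join_lines(lines: List[str]) -> str:
    """Join the lines of one transcript with single newlines."""
    return "\n".join(lines)


def join_transcripts(transcripts: List[List[str]]) -> str:
    """Join several transcripts with a triple newline between them."""
    return "\n\n\n".join(join_lines(t) for t in transcripts)


combined = join_transcripts(
    [["first video, line 1", "first video, line 2"], ["second video, line 1"]]
)

# Splitting on the triple newline recovers the per-transcript text
parts = combined.split("\n\n\n")
print(parts)
```

Splitting this way is only reliable when no transcript contains consecutive empty lines, since those would produce the same separator.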

### Pretty Print Formatter

Human-readable formatted output using Python's pprint module for debugging and inspection.

```python { .api }
class PrettyPrintFormatter(Formatter):
    def format_transcript(self, transcript, **kwargs):
        """
        Pretty print transcript data.

        Args:
            transcript (FetchedTranscript): Transcript to format
            **kwargs: Passed to pprint.pformat()

        Returns:
            str: Pretty formatted transcript representation
        """

    def format_transcripts(self, transcripts, **kwargs):
        """
        Pretty print multiple transcripts.

        Args:
            transcripts (List[FetchedTranscript]): Transcripts to format
            **kwargs: Passed to pprint.pformat()

        Returns:
            str: Pretty formatted list of transcripts
        """
```
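
As the keyword arguments go to `pprint.pformat()`, options such as `width` control how the output wraps. The sketch below calls `pprint.pformat()` directly on sample data; the list-of-dicts shape and the values are assumptions for illustration.

```python
import ast
import pprint

# Assumed shape of the underlying transcript data (illustrative values)
raw = [
    {"text": "Welcome to the channel", "start": 0.0, "duration": 1.5},
    {"text": "Today we look at formatters", "start": 1.5, "duration": 2.0},
]

# A small width forces the structure onto several lines
narrow = pprint.pformat(raw, width=40)

# A large width lets the whole structure fit on one line
wide = pprint.pformat(raw, width=300)

# For plain lists and dicts, the output is a valid Python literal
round_trip = ast.literal_eval(narrow)

print(narrow)
```

Unlike JSON, this output is a Python literal, so it suits inspection during debugging more than data interchange.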

### SRT Formatter

Creates SRT (SubRip) subtitle files compatible with video players and subtitle software.

```python { .api }
class SRTFormatter(Formatter):
    def format_transcript(self, transcript, **kwargs):
        """
        Convert transcript to SRT subtitle format.

        Args:
            transcript (FetchedTranscript): Transcript to format
            **kwargs: Unused

        Returns:
            str: SRT formatted subtitles with sequence numbers and timestamps
        """

    def format_transcripts(self, transcripts, **kwargs):
        """
        Convert multiple transcripts to SRT format.

        Args:
            transcripts (List[FetchedTranscript]): Transcripts to format
            **kwargs: Unused

        Returns:
            str: Combined SRT formatted subtitles
        """
```
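
To make "sequence numbers and timestamps" concrete, here is a minimal sketch of the SRT cue layout. It illustrates the file format rather than the library's implementation; `srt_timestamp` and `srt_cue` are hypothetical helpers, and the sample cues are made up.

```python
def srt_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS,mmm (SRT puts a comma before the milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index: int, start: float, duration: float, text: str) -> str:
    """Build one cue: 1-based sequence number, timing line, then the text."""
    end = start + duration
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}"


# (start, duration, text) triples; illustrative values only
cues = [(0.0, 1.5, "Hello"), (1.5, 2.25, "world")]

# Cues are separated by a blank line
srt_text = "\n\n".join(
    srt_cue(i, start, duration, text)
    for i, (start, duration, text) in enumerate(cues, start=1)
) + "\n"
print(srt_text)
```

Working in integer milliseconds avoids accumulating floating-point error when splitting the value into hours, minutes, and seconds.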

### WebVTT Formatter

Creates WebVTT subtitle files for web video players and HTML5 video elements.

```python { .api }
class WebVTTFormatter(Formatter):
    def format_transcript(self, transcript, **kwargs):
        """
        Convert transcript to WebVTT subtitle format.

        Args:
            transcript (FetchedTranscript): Transcript to format
            **kwargs: Unused

        Returns:
            str: WebVTT formatted subtitles with WEBVTT header
        """

    def format_transcripts(self, transcripts, **kwargs):
        """
        Convert multiple transcripts to WebVTT format.

        Args:
            transcripts (List[FetchedTranscript]): Transcripts to format
            **kwargs: Unused

        Returns:
            str: Combined WebVTT formatted subtitles
        """
```
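
WebVTT differs from SRT mainly in its `WEBVTT` header line and in using a period rather than a comma before the milliseconds. The sketch below shows those two differences by converting a small SRT string; `srt_to_webvtt` is a hypothetical helper written for this illustration, not part of the library, and it assumes two-digit hour fields.

```python
import re

# Matches an SRT timing line such as "00:00:01,500 --> 00:00:03,750"
_TIMING = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})$",
    re.MULTILINE,
)


def srt_to_webvtt(srt_text: str) -> str:
    """Prepend the WEBVTT header and switch timing lines to a period separator."""
    body = _TIMING.sub(r"\1.\2 --> \3.\4", srt_text)
    return "WEBVTT\n\n" + body


srt_sample = (
    "1\n00:00:00,000 --> 00:00:01,500\nHello, there\n\n"
    "2\n00:00:01,500 --> 00:00:03,750\nworld\n"
)
vtt_sample = srt_to_webvtt(srt_sample)
print(vtt_sample)
```

Only timing lines are rewritten, so commas inside the subtitle text are left alone.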

### Formatter Loader

Utility class for loading formatters by type string. Provides a convenient interface for dynamic formatter selection.

```python { .api }
class FormatterLoader:
    TYPES = {
        "json": JSONFormatter,
        "pretty": PrettyPrintFormatter,
        "text": TextFormatter,
        "webvtt": WebVTTFormatter,
        "srt": SRTFormatter,
    }

    def load(self, formatter_type="pretty"):
        """
        Load formatter by type string.

        Args:
            formatter_type (str): Formatter type name. Defaults to "pretty"

        Returns:
            Formatter: Formatter instance

        Raises:
            UnknownFormatterType: Invalid formatter type
        """

    class UnknownFormatterType(Exception):
        def __init__(self, formatter_type):
            """
            Exception for invalid formatter types.

            Args:
                formatter_type (str): The invalid formatter type
            """
```
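
When the type string comes from user input, the `UnknownFormatterType` exception may need handling. The sketch below shows one possible approach, falling back to a default type. `load_or_default` is a hypothetical helper, and `StubLoader` is a stand-in with the same shape as the documented class so the block runs without the library; with the library installed, a `FormatterLoader()` instance would presumably be passed instead.

```python
class _StubFormatter:
    """Hypothetical placeholder for a real formatter class."""

    def __init__(self, name: str):
        self.name = name


class StubLoader:
    """Stand-in with the same shape as the documented FormatterLoader."""

    TYPES = {
        "json": lambda: _StubFormatter("json"),
        "pretty": lambda: _StubFormatter("pretty"),
    }

    class UnknownFormatterType(Exception):
        pass

    def load(self, formatter_type="pretty"):
        if formatter_type not in self.TYPES:
            raise self.UnknownFormatterType(formatter_type)
        return self.TYPES[formatter_type]()


def load_or_default(loader, formatter_type: str, default: str = "pretty"):
    """Return the requested formatter, or the default one if the type is unknown."""
    try:
        return loader.load(formatter_type)
    except loader.UnknownFormatterType:
        return loader.load(default)


known = load_or_default(StubLoader(), "json")
fallback = load_or_default(StubLoader(), "yaml")
print(known.name, fallback.name)
```

Whether a silent fallback is appropriate depends on the application; a command-line tool might prefer to report the invalid type and exit.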

## Usage Examples

### Basic Formatting

```python
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import JSONFormatter, TextFormatter

api = YouTubeTranscriptApi()
transcript = api.fetch('dQw4w9WgXcQ')

# JSON format
json_formatter = JSONFormatter()
json_output = json_formatter.format_transcript(transcript)
print(json_output)

# Plain text format
text_formatter = TextFormatter()
text_output = text_formatter.format_transcript(transcript)
print(text_output)
```

### Subtitle File Creation

```python
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import SRTFormatter, WebVTTFormatter

api = YouTubeTranscriptApi()
transcript = api.fetch('dQw4w9WgXcQ')

# Create SRT subtitle file
srt_formatter = SRTFormatter()
srt_content = srt_formatter.format_transcript(transcript)

with open('subtitles.srt', 'w', encoding='utf-8') as f:
    f.write(srt_content)

# Create WebVTT subtitle file
webvtt_formatter = WebVTTFormatter()
webvtt_content = webvtt_formatter.format_transcript(transcript)

with open('subtitles.vtt', 'w', encoding='utf-8') as f:
    f.write(webvtt_content)
```

### Using FormatterLoader

```python
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import FormatterLoader

api = YouTubeTranscriptApi()
transcript = api.fetch('dQw4w9WgXcQ')

loader = FormatterLoader()

# Load different formatters dynamically
for format_type in ['json', 'text', 'srt', 'webvtt', 'pretty']:
    formatter = loader.load(format_type)
    output = formatter.format_transcript(transcript)
    print(f"=== {format_type.upper()} ===")
    # Show at most the first 200 characters of each format
    print((output[:200] + "...") if len(output) > 200 else output)
    print()
```

### JSON Formatting with Options

```python
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import JSONFormatter

api = YouTubeTranscriptApi()
transcript = api.fetch('dQw4w9WgXcQ')

json_formatter = JSONFormatter()

# Pretty printed JSON
pretty_json = json_formatter.format_transcript(transcript, indent=2, ensure_ascii=False)
print(pretty_json)

# Compact JSON
compact_json = json_formatter.format_transcript(transcript, separators=(',', ':'))
print(compact_json)
```

### Multiple Transcripts

```python
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import TextFormatter

api = YouTubeTranscriptApi()

# Fetch transcripts for several videos
video_ids = ['dQw4w9WgXcQ', 'jNQXAC9IVRw']
transcripts = []

for video_id in video_ids:
    try:
        transcript = api.fetch(video_id)
        transcripts.append(transcript)
    except Exception as e:
        # Skip videos whose transcript cannot be fetched
        print(f"Failed to fetch {video_id}: {e}")

# Format all transcripts together
if transcripts:
    text_formatter = TextFormatter()
    combined_text = text_formatter.format_transcripts(transcripts)
    print(combined_text)
```

## Types

```python { .api }
from typing import List
from youtube_transcript_api._transcripts import FetchedTranscript

# Formatter interface types
FormatterType = str  # One of: "json", "text", "pretty", "srt", "webvtt"
FormatterKwargs = dict  # Formatter-specific keyword arguments
```
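
The "one of" constraint on the formatter type is only a comment above. If stricter checking is wanted, it could be expressed with `typing.Literal`, as in this sketch; `FormatterName` and `is_formatter_name` are hypothetical names, not part of the library.

```python
from typing import Literal, get_args

# Hypothetical stricter alias: only the five documented type strings
FormatterName = Literal["json", "text", "pretty", "srt", "webvtt"]


def is_formatter_name(value: str) -> bool:
    """Check at runtime whether a string is one of the documented type names."""
    return value in get_args(FormatterName)


print(is_formatter_name("srt"))
print(is_formatter_name("yaml"))
```

A static type checker can then flag misspelled literals at call sites, while the helper covers values that arrive at runtime.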
