# PyAV

PyAV provides Pythonic bindings for the FFmpeg multimedia libraries, giving direct access to media containers, streams, packets, codecs, and frames. It covers audio and video encoding and decoding, format conversion, stream manipulation, and codec access, while handling much of the underlying FFmpeg detail.

## Package Information

- **Package Name**: av
- **Language**: Python
- **Installation**: `pip install av`

## Core Imports

```python
import av
```

Open media files:

```python
container = av.open('path/to/file.mp4')
```

Access specific components:

```python
from av import AudioFrame, VideoFrame, Packet
from av.audio.resampler import AudioResampler
from av.video.reformatter import VideoReformatter
```

## Basic Usage

```python
from fractions import Fraction

import av
import numpy as np

# Open an input container
container = av.open('input.mp4')

# Get the first video stream
video_stream = container.streams.video[0]

# Decode frames
for i, frame in enumerate(container.decode(video_stream)):
    # Convert to a NumPy array for processing
    array = frame.to_ndarray(format='rgb24')
    print(f"Frame shape: {array.shape}")

    # Process only the first few frames
    if i >= 10:
        break

container.close()

# Create an output container
output = av.open('output.mp4', 'w')

# Add a video stream
video_stream = output.add_stream('h264', rate=30)
video_stream.width = 1920
video_stream.height = 1080
video_stream.pix_fmt = 'yuv420p'

# Create and encode frames
for i in range(90):  # 3 seconds at 30 fps
    # Create a frame from a NumPy array of random RGB pixels
    frame = av.VideoFrame.from_ndarray(
        np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8),
        format='rgb24'
    )
    # One tick per frame: pts counts frames in units of 1/30 s
    frame.pts = i
    frame.time_base = Fraction(1, 30)

    # Encode and write
    for packet in video_stream.encode(frame):
        output.mux(packet)

# Flush the encoder and close
for packet in video_stream.encode():
    output.mux(packet)
output.close()
```

## Architecture

PyAV's design follows FFmpeg's architecture with Python-friendly abstractions:

- **Containers**: Media file wrappers managing format, streams, and metadata
- **Streams**: Individual audio/video/subtitle tracks with codec information
- **Packets**: Compressed data units containing encoded media
- **Frames**: Uncompressed media data (audio samples or video pixels)
- **Codecs**: Encoder/decoder contexts that convert between packets and frames
- **Filters**: Processing graphs for audio/video transformation

This architecture enables direct access to FFmpeg's capabilities while providing Pythonic interfaces, NumPy integration, and automatic memory management.

## Capabilities

### Container Operations

Core functionality for opening, reading, and writing media files. Supports input containers for reading media and output containers for writing, with comprehensive format support and metadata handling.

```python { .api }
def open(file, mode='r', format=None, options=None, **kwargs):
    """
    Open a media container.

    Parameters:
    - file: File path, file object, or URL
    - mode: 'r' for reading, 'w' for writing
    - format: Container format (auto-detected if None)
    - options: Dict of container options

    Returns:
    InputContainer or OutputContainer
    """

class InputContainer:
    streams: StreamContainer
    metadata: dict[str, str]
    duration: int
    start_time: int

    def demux(*streams): ...
    def decode(*streams): ...
    def seek(offset, **kwargs): ...

class OutputContainer:
    def add_stream(codec, rate=None, **kwargs): ...
    def mux(packet): ...
    def close(): ...
```

[Container Operations](./containers.md)

### Audio Processing

Comprehensive audio handling including frames, streams, format conversion, resampling, and FIFO buffering. Supports all major audio formats with NumPy integration.

```python { .api }
class AudioFrame:
    samples: int
    sample_rate: int
    format: AudioFormat
    layout: AudioLayout
    planes: tuple[AudioPlane, ...]

    @staticmethod
    def from_ndarray(array, format='s16', layout='stereo', sample_rate=48000): ...
    def to_ndarray(format=None): ...

class AudioResampler:
    def __init__(self, format=None, layout=None, rate=None): ...
    def resample(frame): ...

class AudioStream:
    def encode(frame=None): ...
    def decode(packet): ...
```

[Audio Processing](./audio.md)

### Video Processing

Complete video handling with frames, streams, format conversion, reformatting, and image operations. Includes support for all major video formats and pixel formats with NumPy/PIL integration.

```python { .api }
class VideoFrame:
    width: int
    height: int
    format: VideoFormat
    planes: tuple[VideoPlane, ...]
    pts: int

    @staticmethod
    def from_ndarray(array, format='rgb24'): ...
    def to_ndarray(format=None): ...
    def to_image(): ...
    def reformat(width=None, height=None, format=None): ...

class VideoStream:
    def encode(frame=None): ...
    def decode(packet): ...

class VideoReformatter:
    def reformat(frame, width=None, height=None, format=None): ...
```

[Video Processing](./video.md)

### Codec Management

Codec contexts for encoding and decoding with hardware acceleration support. Provides access to all FFmpeg codecs with comprehensive parameter control.

```python { .api }
class Codec:
    name: str
    type: str
    is_encoder: bool
    is_decoder: bool

    def create(kind=None): ...

class CodecContext:
    def open(codec=None): ...
    def encode(frame=None): ...
    def decode(packet): ...
    def flush_buffers(): ...

class HWAccel:
    @staticmethod
    def create(device_type, device=None): ...
```

[Codec Management](./codecs.md)

### Filter System

Audio and video filtering capabilities using FFmpeg's filter system. Supports filter graphs, custom filters, and processing pipelines.

```python { .api }
class Graph:
    def add(filter, *args, **kwargs): ...
    def add_buffer(template): ...
    def configure(): ...
    def push(frame): ...
    def pull(): ...

class Filter:
    name: str
    description: str

class FilterContext:
    def link_to(target, output_idx=0, input_idx=0): ...
    def push(frame): ...
    def pull(): ...
```

[Filter System](./filters.md)

### Packet and Stream Management

Low-level packet handling and stream operations for precise control over media data flow and timing.

```python { .api }
class Packet:
    stream: Stream
    pts: int
    dts: int
    size: int
    duration: int
    is_keyframe: bool

    def decode(): ...

class StreamContainer:
    video: tuple[VideoStream, ...]
    audio: tuple[AudioStream, ...]

    def best(kind): ...

class Stream:
    index: int
    type: str
    codec_context: CodecContext
    metadata: dict[str, str]
```

[Packet and Stream Management](./streams.md)

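Timing values such as `pts`, `dts`, and `duration` are integer tick counts, and a tick count only becomes a time once it is multiplied by the relevant time base. The arithmetic can be sketched in plain Python with `fractions.Fraction`, which is also how PyAV represents time bases. The helper names below are hypothetical and not part of PyAV.

```python
from fractions import Fraction


def pts_to_seconds(pts, time_base):
    """Convert a tick count to seconds; None means the timestamp is unknown."""
    if pts is None:
        return None
    return float(pts * time_base)


def seconds_to_pts(seconds, time_base):
    """Convert seconds to the nearest tick count in the given time base."""
    return round(Fraction(seconds) / time_base)


def rescale_pts(pts, src_time_base, dst_time_base):
    """Re-express a tick count from one time base in another."""
    return round(pts * src_time_base / dst_time_base)


mpegts_base = Fraction(1, 90000)  # a common 90 kHz stream time base
frame_base = Fraction(1, 30)      # one tick per frame at 30 fps

print(pts_to_seconds(180000, mpegts_base))
print(seconds_to_pts(1.5, mpegts_base))
print(rescale_pts(30, frame_base, mpegts_base))
```

With a real packet the inputs would typically be `packet.pts` and the time base of its stream.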
## Global Functions and Constants

```python { .api }
# Version information
__version__: str
library_versions: dict[str, tuple[int, int, int]]
ffmpeg_version_info: str

# Core functions
def open(file, mode='r', **kwargs): ...
def get_include() -> str: ...

# Available formats and codecs
formats_available: set[str]
codecs_available: set[str]
bitstream_filters_available: set[str]
```

## Utility Modules

### Logging

PyAV provides comprehensive logging capabilities for debugging and monitoring FFmpeg operations.

```python { .api }
import av.logging

# Log levels
av.logging.PANIC: int
av.logging.FATAL: int
av.logging.ERROR: int
av.logging.WARNING: int
av.logging.INFO: int
av.logging.VERBOSE: int
av.logging.DEBUG: int
av.logging.TRACE: int

# Logging functions
def get_level() -> int: ...
def set_level(level: int) -> None: ...
def log(level: int, name: str, message: str) -> None: ...

class Capture:
    """Context manager for capturing log messages."""
    logs: list[tuple[int, str, str]]

    def __enter__(self): ...
    def __exit__(self, *args): ...
```

### Test Data

Access to sample media files for testing and development.

```python { .api }
import av.datasets

def cached_download(url: str, name: str) -> str:
    """Download and cache test data."""

def fate(name: str) -> str:
    """Get FFmpeg test suite sample."""

def curated(name: str) -> str:
    """Get PyAV curated sample."""
```

## Error Handling

```python { .api }
class FFmpegError(Exception):
    errno: int
    strerror: str
    filename: str

class InvalidDataError(FFmpegError, ValueError): ...
class HTTPError(FFmpegError): ...
class LookupError(FFmpegError): ...

# Specific lookup errors
class DecoderNotFoundError(LookupError): ...
class EncoderNotFoundError(LookupError): ...
class FilterNotFoundError(LookupError): ...
```