A streaming multipart parser for Python that enables efficient handling of file uploads and form data in web applications
npx @tessl/cli install tessl/pypi-python-multipart@0.0.00
# Python-Multipart

A streaming multipart parser for Python that parses multipart/form-data, application/x-www-form-urlencoded, and application/octet-stream request bodies. It handles file uploads and form data in web applications without loading entire payloads into memory.

## Package Information

- **Package Name**: python-multipart
- **Language**: Python
- **Installation**: `pip install python-multipart`
- **License**: Apache-2.0
- **Test Coverage**: 100%

## Core Imports

```python
import python_multipart
```

Common imports for specific parser classes:

```python
from python_multipart import (
    FormParser,
    MultipartParser,
    QuerystringParser,
    OctetStreamParser,
    parse_form,
    create_form_parser,
)
```

Legacy import (deprecated but still supported):

```python
import multipart  # Shows deprecation warning
```

## Basic Usage

```python
import python_multipart
from python_multipart import MultipartParser


def simple_wsgi_app(environ, start_response):
    # Simple form parsing with callbacks
    def on_field(field):
        print(f"Field: {field.field_name} = {field.value}")

    def on_file(file):
        print(f"File: {file.field_name}, size: {file.size}")
        file.close()

    # Parse form data from WSGI environ
    headers = {'Content-Type': environ['CONTENT_TYPE']}
    python_multipart.parse_form(
        headers,
        environ['wsgi.input'],
        on_field,
        on_file
    )

    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Form parsed successfully']


# Direct parser usage for streaming large files
def process_chunk(chunk):
    # Placeholder: replace with real handling (hash it, write it to disk, ...)
    print(f"Received {len(chunk)} bytes")


def handle_upload(boundary, input_stream):
    def on_part_data(data, start, end):
        # Process data chunk without loading entire file
        chunk = data[start:end]
        process_chunk(chunk)

    callbacks = {'on_part_data': on_part_data}
    parser = MultipartParser(boundary, callbacks)

    # Stream data in chunks
    while True:
        chunk = input_stream.read(8192)
        if not chunk:
            break
        parser.write(chunk)

    parser.finalize()
```

## Architecture

Python-multipart uses a streaming, callback-based architecture that enables memory-efficient processing:

- **Parser Layer**: Low-level streaming parsers (MultipartParser, QuerystringParser, OctetStreamParser) that process data incrementally
- **Data Layer**: Field and File objects that handle data storage with configurable memory/disk thresholds
- **High-Level Interface**: FormParser and convenience functions that auto-detect content types and manage parser lifecycle
- **Decoder Layer**: Base64Decoder and QuotedPrintableDecoder for handling encoded content
- **Error Handling**: An exception hierarchy rooted at FormParserError, so callers can catch parsing, decoding, and file errors separately or together

Because data is consumed chunk by chunk, uploads larger than available memory can be processed, and callers can choose between low-level control and high-level convenience.

## Capabilities

### High-Level Form Parsing

Complete form parsing solution that automatically detects content types and creates appropriate parsers. Handles multipart/form-data, application/x-www-form-urlencoded, and application/octet-stream with Field and File object creation.

```python { .api }
def parse_form(
    headers: dict[str, bytes],
    input_stream,
    on_field,
    on_file,
    chunk_size: int = 1048576
) -> None: ...

def create_form_parser(
    headers: dict[str, bytes],
    on_field,
    on_file,
    trust_x_headers: bool = False,
    config: dict = {}
) -> FormParser: ...

class FormParser:
    def __init__(
        self,
        content_type: str,
        on_field: OnFieldCallback | None,
        on_file: OnFileCallback | None,
        on_end: Callable[[], None] | None = None,
        boundary: bytes | str | None = None,
        file_name: bytes | None = None,
        FileClass: type[FileProtocol] = File,
        FieldClass: type[FieldProtocol] = Field,
        config: dict = {}
    ): ...
    def write(self, data: bytes) -> int: ...
    def finalize(self) -> None: ...
    def close(self) -> None: ...
```

[High-Level Form Parsing](./form-parsing.md)

### Base Parser and Streaming Parsers

Base class and low-level streaming parsers for specific content types with callback-based processing. BaseParser provides common functionality, while specialized parsers provide fine-grained control over parsing behavior.

```python { .api }
class BaseParser:
    def __init__(self): ...
    def callback(
        self,
        name: str,
        data: bytes | None = None,
        start: int | None = None,
        end: int | None = None
    ) -> None: ...
    def set_callback(self, name: str, new_func) -> None: ...
    def close(self) -> None: ...
    def finalize(self) -> None: ...

class MultipartParser(BaseParser):
    def __init__(
        self,
        boundary: bytes | str,
        callbacks: dict = {},
        max_size: float = float("inf")
    ): ...
    def write(self, data: bytes) -> int: ...

class QuerystringParser(BaseParser):
    def __init__(
        self,
        callbacks: dict = {},
        strict_parsing: bool = False,
        max_size: float = float("inf")
    ): ...

class OctetStreamParser(BaseParser):
    def __init__(
        self,
        callbacks: dict = {},
        max_size: float = float("inf")
    ): ...
```

[Base Parser and Streaming Parsers](./streaming-parsers.md)

### Data Objects

Field and File objects for handling parsed form data with configurable storage options. Files support automatic memory-to-disk spillover based on size thresholds.

```python { .api }
class Field:
    def __init__(self, name: bytes | None): ...
    @classmethod
    def from_value(cls, name: bytes, value: bytes | None) -> Field: ...
    field_name: bytes | None
    value: bytes | None

class File:
    def __init__(
        self,
        file_name: bytes | None,
        field_name: bytes | None = None,
        config: dict = {}
    ): ...
    field_name: bytes | None
    file_name: bytes | None
    actual_file_name: bytes | None
    file_object: BytesIO | BufferedRandom
    size: int
    in_memory: bool
```

[Data Objects](./data-objects.md)

### Content Decoders

Streaming decoders for Base64 and quoted-printable encoded content with automatic caching for incomplete chunks.

```python { .api }
class Base64Decoder:
    def __init__(self, underlying): ...
    def write(self, data: bytes) -> int: ...
    def finalize(self) -> None: ...

class QuotedPrintableDecoder:
    def __init__(self, underlying): ...
    def write(self, data: bytes) -> int: ...
    def finalize(self) -> None: ...
```

[Content Decoders](./decoders.md)

### Exception Handling

Comprehensive exception hierarchy for robust error handling across all parsing operations.

```python { .api }
class FormParserError(ValueError): ...
class ParseError(FormParserError):
    offset: int = -1
class MultipartParseError(ParseError): ...
class QuerystringParseError(ParseError): ...
class DecodeError(ParseError): ...
class FileError(FormParserError, OSError): ...
```

[Exception Handling](./exceptions.md)

## Utility Functions

```python { .api }
def parse_options_header(value: str | bytes | None) -> tuple[bytes, dict[bytes, bytes]]: ...
```

Parses Content-Type headers into (content_type, parameters) format for boundary extraction and content type detection.

**Import:**
```python
from python_multipart.multipart import parse_options_header
```

[Utility Functions](./streaming-parsers.md#utility-functions)

## Types

```python { .api }
# State enums for parser tracking
class QuerystringState(IntEnum):
    BEFORE_FIELD = 0
    FIELD_NAME = 1
    FIELD_DATA = 2

class MultipartState(IntEnum):
    START = 0
    START_BOUNDARY = 1
    HEADER_FIELD_START = 2
    HEADER_FIELD = 3
    HEADER_VALUE_START = 4
    HEADER_VALUE = 5
    HEADER_VALUE_ALMOST_DONE = 6
    HEADERS_ALMOST_DONE = 7
    PART_DATA_START = 8
    PART_DATA = 9
    PART_DATA_END = 10
    END_BOUNDARY = 11
    END = 12

# Configuration types
class FormParserConfig(TypedDict):
    UPLOAD_DIR: str | None
    UPLOAD_KEEP_FILENAME: bool
    UPLOAD_KEEP_EXTENSIONS: bool
    UPLOAD_ERROR_ON_BAD_CTE: bool
    MAX_MEMORY_FILE_SIZE: int
    MAX_BODY_SIZE: float

class FileConfig(TypedDict, total=False):
    UPLOAD_DIR: str | bytes | None
    UPLOAD_DELETE_TMP: bool
    UPLOAD_KEEP_FILENAME: bool
    UPLOAD_KEEP_EXTENSIONS: bool
    MAX_MEMORY_FILE_SIZE: int

class QuerystringCallbacks(TypedDict, total=False):
    on_field_start: Callable[[], None]
    on_field_name: Callable[[bytes, int, int], None]
    on_field_data: Callable[[bytes, int, int], None]
    on_field_end: Callable[[], None]
    on_end: Callable[[], None]

class OctetStreamCallbacks(TypedDict, total=False):
    on_start: Callable[[], None]
    on_data: Callable[[bytes, int, int], None]
    on_end: Callable[[], None]

class MultipartCallbacks(TypedDict, total=False):
    on_part_begin: Callable[[], None]
    on_part_data: Callable[[bytes, int, int], None]
    on_part_end: Callable[[], None]
    on_header_begin: Callable[[], None]
    on_header_field: Callable[[bytes, int, int], None]
    on_header_value: Callable[[bytes, int, int], None]
    on_header_end: Callable[[], None]
    on_headers_finished: Callable[[], None]
    on_end: Callable[[], None]

# Protocol types
class SupportsRead(Protocol):
    def read(self, __n: int) -> bytes: ...

class SupportsWrite(Protocol):
    def write(self, __b: bytes) -> object: ...

class _FormProtocol(Protocol):
    def write(self, data: bytes) -> int: ...
    def finalize(self) -> None: ...
    def close(self) -> None: ...

class FieldProtocol(Protocol):
    def __init__(self, name: bytes | None) -> None: ...
    def write(self, data: bytes) -> int: ...
    def finalize(self) -> None: ...
    def close(self) -> None: ...
    def set_none(self) -> None: ...

class FileProtocol(Protocol):
    def __init__(self, file_name: bytes | None, field_name: bytes | None, config: dict) -> None: ...
    def write(self, data: bytes) -> int: ...
    def finalize(self) -> None: ...
    def close(self) -> None: ...

# Callback type aliases
OnFieldCallback = Callable[[FieldProtocol], None]
OnFileCallback = Callable[[FileProtocol], None]

CallbackName = Literal[
    "start",
    "data",
    "end",
    "field_start",
    "field_name",
    "field_data",
    "field_end",
    "part_begin",
    "part_data",
    "part_end",
    "header_begin",
    "header_field",
    "header_value",
    "header_end",
    "headers_finished",
]
```