# Python-Multipart

A streaming multipart parser for Python that provides comprehensive parsing capabilities for multipart/form-data, application/x-www-form-urlencoded, and application/octet-stream content types. Enables efficient handling of file uploads and form data in web applications without loading entire payloads into memory.

## Package Information

- **Package Name**: python-multipart
- **Language**: Python
- **Installation**: `pip install python-multipart`
- **License**: Apache-2.0
- **Test Coverage**: 100%

## Core Imports

```python
import python_multipart
```

Common imports for specific parser classes:

```python
from python_multipart import (
    FormParser,
    MultipartParser,
    QuerystringParser,
    OctetStreamParser,
    parse_form,
    create_form_parser
)
```

Legacy import (deprecated but still supported):

```python
import multipart  # Shows deprecation warning
```

## Basic Usage

```python
import python_multipart

def simple_wsgi_app(environ, start_response):
    # Simple form parsing with callbacks
    def on_field(field):
        print(f"Field: {field.field_name} = {field.value}")

    def on_file(file):
        print(f"File: {file.field_name}, size: {file.size}")
        file.close()

    # Parse form data from WSGI environ
    headers = {'Content-Type': environ['CONTENT_TYPE']}
    if environ.get('CONTENT_LENGTH'):
        # Lets parse_form stop reading at the end of the request body
        headers['Content-Length'] = environ['CONTENT_LENGTH']
    python_multipart.parse_form(
        headers,
        environ['wsgi.input'],
        on_field,
        on_file
    )

    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Form parsed successfully']

# Direct parser usage for streaming large files
from python_multipart import MultipartParser

def handle_upload(boundary, input_stream):
    def on_part_data(data, start, end):
        # Process data chunk without loading entire file
        chunk = data[start:end]
        # process_chunk is a placeholder for application-specific handling
        process_chunk(chunk)

    callbacks = {'on_part_data': on_part_data}
    parser = MultipartParser(boundary, callbacks)

    # Stream data in chunks
    while True:
        chunk = input_stream.read(8192)
        if not chunk:
            break
        parser.write(chunk)

    parser.finalize()
```

## Architecture

Python-multipart uses a streaming, callback-based architecture that enables memory-efficient processing:

- **Parser Layer**: Low-level streaming parsers (MultipartParser, QuerystringParser, OctetStreamParser) that process data incrementally
- **Data Layer**: Field and File objects that handle data storage with configurable memory/disk thresholds
- **High-Level Interface**: FormParser and convenience functions that auto-detect content types and manage parser lifecycle
- **Decoder Layer**: Base64Decoder and QuotedPrintableDecoder for handling encoded content
- **Error Handling**: Comprehensive exception hierarchy for robust error handling

This design processes uploads incrementally, so large payloads need not be held in memory in full, while providing both low-level control and high-level convenience.

## Capabilities

### High-Level Form Parsing

Complete form parsing solution that automatically detects content types and creates appropriate parsers. Handles multipart/form-data, application/x-www-form-urlencoded, and application/octet-stream with Field and File object creation.

```python { .api }
def parse_form(
    headers: dict[str, bytes],
    input_stream,
    on_field,
    on_file,
    chunk_size: int = 1048576
) -> None: ...

def create_form_parser(
    headers: dict[str, bytes],
    on_field,
    on_file,
    trust_x_headers: bool = False,
    config: dict = {}
) -> FormParser: ...

class FormParser:
    def __init__(
        self,
        content_type: str,
        on_field: OnFieldCallback | None,
        on_file: OnFileCallback | None,
        on_end: Callable[[], None] | None = None,
        boundary: bytes | str | None = None,
        file_name: bytes | None = None,
        FileClass: type[FileProtocol] = File,
        FieldClass: type[FieldProtocol] = Field,
        config: dict = {}
    ): ...
    def write(self, data: bytes) -> int: ...
    def finalize(self) -> None: ...
    def close(self) -> None: ...
```

[High-Level Form Parsing](./form-parsing.md)

### Base Parser and Streaming Parsers

Base class and low-level streaming parsers for specific content types with callback-based processing. BaseParser provides common functionality, while specialized parsers provide fine-grained control over parsing behavior.

```python { .api }
class BaseParser:
    def __init__(self): ...
    def callback(
        self,
        name: str,
        data: bytes | None = None,
        start: int | None = None,
        end: int | None = None
    ) -> None: ...
    def set_callback(self, name: str, new_func) -> None: ...
    def close(self) -> None: ...
    def finalize(self) -> None: ...

class MultipartParser(BaseParser):
    def __init__(
        self,
        boundary: bytes | str,
        callbacks: dict = {},
        max_size: float = float("inf")
    ): ...
    def write(self, data: bytes) -> int: ...

class QuerystringParser(BaseParser):
    def __init__(
        self,
        callbacks: dict = {},
        strict_parsing: bool = False,
        max_size: float = float("inf")
    ): ...

class OctetStreamParser(BaseParser):
    def __init__(
        self,
        callbacks: dict = {},
        max_size: float = float("inf")
    ): ...
```

[Base Parser and Streaming Parsers](./streaming-parsers.md)

### Data Objects

Field and File objects for handling parsed form data with configurable storage options. Files support automatic memory-to-disk spillover based on size thresholds.

```python { .api }
class Field:
    def __init__(self, name: bytes | None): ...
    @classmethod
    def from_value(cls, name: bytes, value: bytes | None) -> Field: ...
    field_name: bytes | None
    value: bytes | None

class File:
    def __init__(
        self,
        file_name: bytes | None,
        field_name: bytes | None = None,
        config: dict = {}
    ): ...
    field_name: bytes | None
    file_name: bytes | None
    actual_file_name: bytes | None
    file_object: BytesIO | BufferedRandom
    size: int
    in_memory: bool
```

[Data Objects](./data-objects.md)

### Content Decoders

Streaming decoders for Base64 and quoted-printable encoded content with automatic caching for incomplete chunks.

```python { .api }
class Base64Decoder:
    def __init__(self, underlying): ...
    def write(self, data: bytes) -> int: ...
    def finalize(self) -> None: ...

class QuotedPrintableDecoder:
    def __init__(self, underlying): ...
    def write(self, data: bytes) -> int: ...
    def finalize(self) -> None: ...
```

[Content Decoders](./decoders.md)

### Exception Handling

Comprehensive exception hierarchy for robust error handling across all parsing operations.

```python { .api }
class FormParserError(ValueError): ...
class ParseError(FormParserError):
    offset: int = -1
class MultipartParseError(ParseError): ...
class QuerystringParseError(ParseError): ...
class DecodeError(ParseError): ...
class FileError(FormParserError, OSError): ...
```

[Exception Handling](./exceptions.md)

## Utility Functions

```python { .api }
def parse_options_header(value: str | bytes | None) -> tuple[bytes, dict[bytes, bytes]]: ...
```

Parses Content-Type headers into (content_type, parameters) format for boundary extraction and content type detection.

**Import:**

```python
from python_multipart.multipart import parse_options_header
```

[Utility Functions](./streaming-parsers.md#utility-functions)

## Types

```python { .api }
# State enums for parser tracking
class QuerystringState(IntEnum):
    BEFORE_FIELD = 0
    FIELD_NAME = 1
    FIELD_DATA = 2

class MultipartState(IntEnum):
    START = 0
    START_BOUNDARY = 1
    HEADER_FIELD_START = 2
    HEADER_FIELD = 3
    HEADER_VALUE_START = 4
    HEADER_VALUE = 5
    HEADER_VALUE_ALMOST_DONE = 6
    HEADERS_ALMOST_DONE = 7
    PART_DATA_START = 8
    PART_DATA = 9
    PART_DATA_END = 10
    END_BOUNDARY = 11
    END = 12

# Configuration types
class FormParserConfig(TypedDict):
    UPLOAD_DIR: str | None
    UPLOAD_KEEP_FILENAME: bool
    UPLOAD_KEEP_EXTENSIONS: bool
    UPLOAD_ERROR_ON_BAD_CTE: bool
    MAX_MEMORY_FILE_SIZE: int
    MAX_BODY_SIZE: float

class FileConfig(TypedDict, total=False):
    UPLOAD_DIR: str | bytes | None
    UPLOAD_DELETE_TMP: bool
    UPLOAD_KEEP_FILENAME: bool
    UPLOAD_KEEP_EXTENSIONS: bool
    MAX_MEMORY_FILE_SIZE: int

class QuerystringCallbacks(TypedDict, total=False):
    on_field_start: Callable[[], None]
    on_field_name: Callable[[bytes, int, int], None]
    on_field_data: Callable[[bytes, int, int], None]
    on_field_end: Callable[[], None]
    on_end: Callable[[], None]

class OctetStreamCallbacks(TypedDict, total=False):
    on_start: Callable[[], None]
    on_data: Callable[[bytes, int, int], None]
    on_end: Callable[[], None]

class MultipartCallbacks(TypedDict, total=False):
    on_part_begin: Callable[[], None]
    on_part_data: Callable[[bytes, int, int], None]
    on_part_end: Callable[[], None]
    on_header_begin: Callable[[], None]
    on_header_field: Callable[[bytes, int, int], None]
    on_header_value: Callable[[bytes, int, int], None]
    on_header_end: Callable[[], None]
    on_headers_finished: Callable[[], None]
    on_end: Callable[[], None]

# Protocol types
class SupportsRead(Protocol):
    def read(self, __n: int) -> bytes: ...

class SupportsWrite(Protocol):
    def write(self, __b: bytes) -> object: ...

class _FormProtocol(Protocol):
    def write(self, data: bytes) -> int: ...
    def finalize(self) -> None: ...
    def close(self) -> None: ...

class FieldProtocol(Protocol):
    def __init__(self, name: bytes | None) -> None: ...
    def write(self, data: bytes) -> int: ...
    def finalize(self) -> None: ...
    def close(self) -> None: ...
    def set_none(self) -> None: ...

class FileProtocol(Protocol):
    def __init__(self, file_name: bytes | None, field_name: bytes | None, config: dict) -> None: ...
    def write(self, data: bytes) -> int: ...
    def finalize(self) -> None: ...
    def close(self) -> None: ...

# Callback type aliases
OnFieldCallback = Callable[[FieldProtocol], None]
OnFileCallback = Callable[[FileProtocol], None]

CallbackName = Literal[
    "start",
    "data",
    "end",
    "field_start",
    "field_name",
    "field_data",
    "field_end",
    "part_begin",
    "part_data",
    "part_end",
    "header_begin",
    "header_field",
    "header_value",
    "header_end",
    "headers_finished",
]
```