# Uploads

Upload large files in chunks for use with Assistants, Fine-tuning, and Batch processing. The Uploads API enables efficient multipart upload of files up to 8 GB, splitting them into 64 MB parts that can be uploaded in parallel.
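As a rough sketch of the arithmetic these limits imply (an illustrative helper, not part of the SDK): the number of parts needed for a file is the ceiling of its size over the 64 MB part size, and any file over 8 GB cannot be uploaded at all.

```python
PART_SIZE = 64 * 1024 * 1024          # 64 MB per part
MAX_UPLOAD = 8 * 1024 * 1024 * 1024   # 8 GB total limit

def part_count(file_size: int) -> int:
    """Number of parts needed for a file of `file_size` bytes."""
    if file_size > MAX_UPLOAD:
        raise ValueError("file exceeds the 8 GB upload limit")
    # Ceiling division: the final part may be shorter than PART_SIZE.
    return -(-file_size // PART_SIZE)

print(part_count(200 * 1024 * 1024))  # a 200 MB file needs 4 parts
```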

## Capabilities

### Create Upload

Create an intermediate Upload object that accepts multiple parts.

```python { .api }
def create(
    self,
    *,
    bytes: int,
    filename: str,
    mime_type: str,
    purpose: FilePurpose,
    expires_after: dict | Omit = omit,
    extra_headers: dict[str, str] | None = None,
    extra_query: dict[str, object] | None = None,
    extra_body: dict[str, object] | None = None,
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> Upload:
    """
    Create an intermediate Upload object for adding file parts.

    Args:
        bytes: Total number of bytes in the file being uploaded.

        filename: Name of the file to upload.

        mime_type: MIME type of the file. Must be supported for the specified purpose.
            See https://platform.openai.com/docs/assistants/tools/file-search#supported-files

        purpose: Intended purpose of the file. Options:
            - "assistants": For use with the Assistants API
            - "batch": For batch processing
            - "fine-tune": For fine-tuning
            - "vision": For vision capabilities
            See https://platform.openai.com/docs/api-reference/files/create#files-create-purpose

        expires_after: Expiration policy for the file. Default: files with purpose="batch"
            expire after 30 days; others persist until manually deleted.
            {"anchor": "created_at", "days": 7} expires 7 days after creation.

        extra_headers: Additional HTTP headers.
        extra_query: Additional query parameters.
        extra_body: Additional JSON fields.
        timeout: Request timeout in seconds.

    Returns:
        Upload: Upload object with ID to use for adding parts.
            Contains status, expires_at, and other metadata.

    Notes:
        - Maximum upload size: 8 GB
        - Upload expires 1 hour after creation
        - Must complete upload before expiration
        - Each part can be at most 64 MB
    """
```

### Complete Upload

Finalize the upload after all parts have been added.

```python { .api }
def complete(
    self,
    upload_id: str,
    *,
    part_ids: list[str],
    md5: str | Omit = omit,
    extra_headers: dict[str, str] | None = None,
    extra_query: dict[str, object] | None = None,
    extra_body: dict[str, object] | None = None,
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> Upload:
    """
    Complete the Upload and create a File object.

    Args:
        upload_id: ID of the Upload to complete.

        part_ids: Ordered list of Part IDs. Order determines how parts are assembled.

        md5: Optional MD5 checksum to verify uploaded bytes match expectations.

        extra_headers: Additional HTTP headers.
        extra_query: Additional query parameters.
        extra_body: Additional JSON fields.
        timeout: Request timeout in seconds.

    Returns:
        Upload: Completed Upload object containing a nested File object
            ready for use in the rest of the platform.

    Notes:
        - Total bytes uploaded must match bytes specified in create()
        - No parts can be added after completion
        - Upload must not be cancelled or expired
    """
```
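If you pass `md5`, it should be the checksum of the complete assembled file, not of any single part. One way to compute it without holding the whole file in memory is to feed the same chunks you upload into an incremental digest (a standard-library sketch; `md5_of_chunks` is an illustrative helper, not part of the SDK):

```python
import hashlib
import io

def md5_of_chunks(reader: io.BufferedIOBase, chunk_size: int = 64 * 1024 * 1024) -> str:
    """Compute an MD5 hex digest incrementally, chunk by chunk."""
    digest = hashlib.md5()
    while chunk := reader.read(chunk_size):
        digest.update(chunk)
    return digest.hexdigest()

# Chunked and one-shot digests agree, whatever the chunk size.
data = b"hello"
print(md5_of_chunks(io.BytesIO(data), chunk_size=2))
# → 5d41402abc4b2a76b9719d911017c592, same as hashlib.md5(data).hexdigest()
```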

### Cancel Upload

Cancel an upload that is no longer needed.

```python { .api }
def cancel(
    self,
    upload_id: str,
    *,
    extra_headers: dict[str, str] | None = None,
    extra_query: dict[str, object] | None = None,
    extra_body: dict[str, object] | None = None,
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> Upload:
    """
    Cancel an Upload.

    Args:
        upload_id: ID of the Upload to cancel.

        extra_headers: Additional HTTP headers.
        extra_query: Additional query parameters.
        extra_body: Additional JSON fields.
        timeout: Request timeout in seconds.

    Returns:
        Upload: Cancelled Upload object with status="cancelled".

    Notes:
        - No parts can be added after cancellation
        - Previously uploaded parts are discarded
    """
```

### Upload File Chunked

High-level helper that handles the entire upload process automatically.

```python { .api }
def upload_file_chunked(
    self,
    *,
    file: str | os.PathLike | bytes,
    mime_type: str,
    purpose: FilePurpose,
    filename: str | None = None,
    bytes: int | None = None,
    part_size: int | None = None,
    md5: str | Omit = omit,
) -> Upload:
    """
    Upload a file in chunks automatically.

    This convenience method handles:
    1. Creating the Upload
    2. Splitting the file into parts
    3. Uploading each part sequentially
    4. Completing the Upload

    Args:
        file: File to upload. Can be:
            - Path-like object: Path("my-paper.pdf")
            - String path: "my-paper.pdf"
            - bytes: In-memory file data (requires filename and bytes args)

        mime_type: MIME type of the file (e.g., "application/pdf").

        purpose: Intended purpose ("assistants", "batch", "fine-tune", "vision").

        filename: Filename (required if file is bytes, optional otherwise).

        bytes: Total file size in bytes (required if file is bytes, optional otherwise).
            If not provided for a path, automatically determined from the file.

        part_size: Size of each part in bytes. Default: 64 MB (64 * 1024 * 1024).
            Each part uploads as a separate request.

        md5: Optional MD5 checksum for verification.

    Returns:
        Upload: Completed Upload object containing the File.

    Raises:
        TypeError: If filename or bytes is not provided for in-memory files.
        ValueError: If the file path is invalid or the file cannot be read.
    """
```

Usage examples:

```python
from pathlib import Path
from openai import OpenAI

client = OpenAI()

# Upload a file from disk (simplest approach)
upload = client.uploads.upload_file_chunked(
    file=Path("training_data.jsonl"),
    mime_type="application/jsonl",
    purpose="fine-tune",
)

print(f"Upload complete! File ID: {upload.file.id}")

# Upload with a custom part size (e.g., 32 MB parts)
upload = client.uploads.upload_file_chunked(
    file="large_dataset.jsonl",
    mime_type="application/jsonl",
    purpose="batch",
    part_size=32 * 1024 * 1024,
)

# Upload in-memory bytes
file_data = b"..."  # Your file data
upload = client.uploads.upload_file_chunked(
    file=file_data,
    filename="document.pdf",
    bytes=len(file_data),
    mime_type="application/pdf",
    purpose="assistants",
)

# Upload with MD5 verification
upload = client.uploads.upload_file_chunked(
    file="important_data.csv",
    mime_type="text/csv",
    purpose="assistants",
    md5="5d41402abc4b2a76b9719d911017c592",
)
```

### Create Part

Add a single part to an Upload.

```python { .api }
def create(
    self,
    upload_id: str,
    *,
    data: FileTypes,
    extra_headers: dict[str, str] | None = None,
    extra_query: dict[str, object] | None = None,
    extra_body: dict[str, object] | None = None,
    timeout: float | httpx.Timeout | None | NotGiven = NOT_GIVEN,
) -> UploadPart:
    """
    Add a Part to an Upload.

    Args:
        upload_id: ID of the Upload to add this Part to.

        data: Chunk of bytes for this Part. Maximum 64 MB.

        extra_headers: Additional HTTP headers.
        extra_query: Additional query parameters.
        extra_body: Additional JSON fields.
        timeout: Request timeout in seconds.

    Returns:
        UploadPart: Part object with ID to use when completing the Upload.

    Notes:
        - Each Part can be at most 64 MB
        - Total size across all parts cannot exceed 8 GB
        - Parts can be added in parallel for faster uploads
        - Order is determined when completing the Upload
    """
```

Advanced manual upload example:

```python
import os

from openai import OpenAI

client = OpenAI()

# Step 1: Create the Upload
file_path = "large_file.pdf"
file_size = os.path.getsize(file_path)

upload = client.uploads.create(
    bytes=file_size,
    filename="large_file.pdf",
    mime_type="application/pdf",
    purpose="assistants",
)

print(f"Created upload: {upload.id}")

# Step 2: Upload parts
part_size = 64 * 1024 * 1024  # 64 MB
part_ids = []

with open(file_path, "rb") as f:
    while True:
        chunk = f.read(part_size)
        if not chunk:
            break

        part = client.uploads.parts.create(
            upload_id=upload.id,
            data=chunk,
        )
        part_ids.append(part.id)
        print(f"Uploaded part {len(part_ids)}: {part.id}")

# Step 3: Complete the Upload
completed = client.uploads.complete(
    upload_id=upload.id,
    part_ids=part_ids,
)

print(f"Upload complete! File ID: {completed.file.id}")

# Handle errors by cancelling
try:
    # ... upload process ...
    pass
except Exception as e:
    print(f"Error during upload: {e}")
    client.uploads.cancel(upload_id=upload.id)
    print("Upload cancelled")
```

Parallel upload example:

```python
import os
from concurrent.futures import ThreadPoolExecutor

from openai import OpenAI

client = OpenAI()

def upload_part(upload_id: str, part_data: bytes) -> str:
    """Upload a single part and return its ID."""
    part = client.uploads.parts.create(
        upload_id=upload_id,
        data=part_data,
    )
    return part.id

# Create upload
file_path = "large_file.pdf"
file_size = os.path.getsize(file_path)

upload = client.uploads.create(
    bytes=file_size,
    filename="large_file.pdf",
    mime_type="application/pdf",
    purpose="assistants",
)

# Split file into chunks
part_size = 64 * 1024 * 1024
chunks = []

with open(file_path, "rb") as f:
    while True:
        chunk = f.read(part_size)
        if not chunk:
            break
        chunks.append(chunk)

# Upload parts in parallel
with ThreadPoolExecutor(max_workers=4) as executor:
    part_ids = list(executor.map(
        lambda chunk: upload_part(upload.id, chunk),
        chunks,
    ))

# Complete upload
completed = client.uploads.complete(
    upload_id=upload.id,
    part_ids=part_ids,
)

print(f"Parallel upload complete! File ID: {completed.file.id}")
```
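This pattern works because `ThreadPoolExecutor.map` yields results in the order of its inputs, even when the underlying tasks finish out of order, so `part_ids` stays aligned with the chunk order. A self-contained illustration of that property (no API calls, randomized delays to scramble completion order):

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

def slow_identity(x: int) -> int:
    # Random delay so tasks finish in a scrambled order.
    time.sleep(random.uniform(0, 0.05))
    return x

with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(slow_identity, range(10)))

print(results)  # always [0, 1, ..., 9], regardless of completion order
```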

## Async Usage

```python
import asyncio

from openai import AsyncOpenAI

async def upload_file():
    client = AsyncOpenAI()

    # Async upload
    upload = await client.uploads.upload_file_chunked(
        file="data.jsonl",
        mime_type="application/jsonl",
        purpose="fine-tune",
    )

    return upload.file.id

file_id = asyncio.run(upload_file())
```
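For manual async uploads, parts can be issued concurrently while keeping result order, for example with `asyncio.gather` plus a semaphore to bound concurrency. The sketch below uses a stand-in coroutine instead of a real `parts.create` call so the pattern can be seen in isolation; all names here are illustrative, not SDK APIs.

```python
import asyncio

async def upload_part_stub(part_index: int) -> str:
    """Stand-in for an async parts.create call; returns a fake part ID."""
    await asyncio.sleep(0.01)
    return f"part_{part_index}"

async def upload_all(num_parts: int, max_concurrency: int = 4) -> list:
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(i: int) -> str:
        async with sem:
            return await upload_part_stub(i)

    # gather preserves argument order, keeping part IDs aligned with chunks
    return await asyncio.gather(*(bounded(i) for i in range(num_parts)))

part_ids = asyncio.run(upload_all(6))
print(part_ids)  # → ['part_0', 'part_1', ..., 'part_5']
```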

## Types

```python { .api }
from typing import Literal, Optional, Tuple, Union

from pydantic import BaseModel

FilePurpose = Literal["assistants", "batch", "fine-tune", "vision"]

class Upload(BaseModel):
    """Upload object containing metadata and status."""
    id: str
    bytes: int
    created_at: int
    expires_at: int
    filename: str
    object: Literal["upload"]
    purpose: FilePurpose
    status: Literal["pending", "completed", "cancelled", "expired"]
    file: FileObject | None  # Present when status="completed"

class UploadPart(BaseModel):
    """Part object representing a chunk of an upload."""
    id: str
    created_at: int
    object: Literal["upload.part"]
    upload_id: str

FileTypes = Union[
    FileContent,
    Tuple[Optional[str], FileContent],
    Tuple[Optional[str], FileContent, Optional[str]],
]

class Omit:
    """Sentinel value for omitted parameters."""
```

## Access Pattern

```python
# Synchronous
from openai import OpenAI

client = OpenAI()
client.uploads.create(...)
client.uploads.complete(...)
client.uploads.cancel(...)
client.uploads.upload_file_chunked(...)
client.uploads.parts.create(...)

# Asynchronous
from openai import AsyncOpenAI

client = AsyncOpenAI()
await client.uploads.create(...)
await client.uploads.complete(...)
await client.uploads.cancel(...)
await client.uploads.upload_file_chunked(...)
await client.uploads.parts.create(...)
```
