# S3Transfer

A Python library that provides high-level abstractions for efficient Amazon S3 uploads and downloads. S3Transfer handles multipart operations, parallel processing, bandwidth throttling, progress callbacks, and retry logic, making it the foundational transfer layer for boto3's S3 operations.

## Package Information

- **Package Name**: s3transfer
- **Language**: Python
- **Installation**: `pip install s3transfer`

## Core Imports

```python
import s3transfer
from s3transfer import S3Transfer, TransferConfig
```

Modern API (recommended):

```python
from s3transfer.manager import TransferManager, TransferConfig
```

## Basic Usage

### Legacy API

```python
import boto3
from s3transfer import S3Transfer, TransferConfig

# Create S3 client and transfer manager
client = boto3.client('s3', region_name='us-west-2')
transfer = S3Transfer(client)

# Upload a file
transfer.upload_file('/tmp/myfile.txt', 'my-bucket', 'myfile.txt')

# Download a file
transfer.download_file('my-bucket', 'myfile.txt', '/tmp/downloaded.txt')

# With configuration
config = TransferConfig(
    multipart_threshold=8 * 1024 * 1024,  # 8MB
    max_concurrency=10,
    num_download_attempts=5
)
transfer = S3Transfer(client, config)
```

### Modern API

```python
import boto3
from s3transfer.manager import TransferManager, TransferConfig

# Create transfer manager
client = boto3.client('s3', region_name='us-west-2')
config = TransferConfig(
    multipart_threshold=8 * 1024 * 1024,
    max_request_concurrency=10,
    max_bandwidth=100 * 1024 * 1024  # 100MB/s
)
transfer_manager = TransferManager(client, config)

# Upload from an open file object
with open('/tmp/myfile.txt', 'rb') as f:
    future = transfer_manager.upload(f, 'my-bucket', 'myfile.txt')
    future.result()  # Wait for completion while the file is still open

# Download
with open('/tmp/downloaded.txt', 'wb') as f:
    future = transfer_manager.download('my-bucket', 'myfile.txt', f)
    future.result()

# Always shut down when done
transfer_manager.shutdown()
```

## Architecture

S3Transfer exposes a two-tier API:

- **High-level interfaces** (S3Transfer, TransferManager): Simple methods for common operations
- **Low-level components** (futures, coordinators, tasks): Fine-grained control and advanced features

Both tiers rely on the same underlying mechanisms:

- **Transfer coordination**: Future-based asynchronous execution with progress tracking
- **Bandwidth management**: Leaky bucket rate limiting for transfers
- **Error handling**: Retry logic with exponential backoff
- **Multipart operations**: Automatic multipart uploads/downloads for large files

Compared to the legacy S3Transfer class, the modern TransferManager offers better resource management, more flexible configuration, and improved progress tracking.

## Capabilities

### Legacy Transfer Interface

The original S3Transfer class provides simple upload and download operations with basic configuration and progress callbacks.

```python { .api }
class S3Transfer:
    def __init__(self, client, config=None, osutil=None): ...
    def upload_file(self, filename, bucket, key, callback=None, extra_args=None): ...
    def download_file(self, bucket, key, filename, extra_args=None, callback=None): ...
```

[Legacy Transfer Interface](./legacy-transfer.md)
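
The `callback` parameter above accepts any callable that takes a byte count. As a minimal sketch, assuming each call reports the bytes transferred since the previous call and that calls may arrive from several worker threads, a progress tracker could look like this (`ProgressTracker` is a hypothetical name, not part of s3transfer):

```python
import threading


class ProgressTracker:
    """Thread-safe progress callback; instances are callable with a byte count."""

    def __init__(self, total_bytes):
        self._total = total_bytes
        self._seen = 0
        self._lock = threading.Lock()

    def __call__(self, bytes_transferred):
        # Callbacks may fire from several worker threads, so guard the counter.
        with self._lock:
            self._seen += bytes_transferred

    @property
    def percent(self):
        with self._lock:
            if self._total == 0:
                return 100.0
            return 100.0 * self._seen / self._total


tracker = ProgressTracker(total_bytes=1000)
for chunk in (250, 250, 500):  # simulate three progress notifications
    tracker(chunk)
print(f"{tracker.percent:.0f}% complete")
```

An instance like this could then be passed as `callback=tracker` to `upload_file` or `download_file`.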

### Modern Transfer Manager

The recommended TransferManager class offers upload, download, copy, and delete operations, better resource management, and more configuration options than the legacy interface.

```python { .api }
class TransferManager:
    def __init__(self, client, config=None, osutil=None, executor_cls=None): ...
    def upload(self, fileobj, bucket, key, extra_args=None, subscribers=None): ...
    def download(self, bucket, key, fileobj, extra_args=None, subscribers=None): ...
    def copy(self, copy_source, bucket, key, extra_args=None, subscribers=None, source_client=None): ...
    def delete(self, bucket, key, extra_args=None, subscribers=None): ...
    def shutdown(self, cancel=False, cancel_msg=''): ...
```

[Modern Transfer Manager](./transfer-manager.md)

### Configuration Management

Configuration classes that control transfer behavior, including thresholds, concurrency, retry settings, and bandwidth limits. Two classes share the name `TransferConfig`: one for the legacy interface and one for the modern transfer manager.

```python { .api }
# Legacy version: s3transfer.TransferConfig
class TransferConfig:
    def __init__(
        self,
        multipart_threshold=8388608,
        max_concurrency=10,
        multipart_chunksize=8388608,
        num_download_attempts=5,
        max_io_queue=100,
    ): ...

# Modern version: s3transfer.manager.TransferConfig
class TransferConfig:
    def __init__(
        self,
        multipart_threshold=8388608,
        multipart_chunksize=8388608,
        max_request_concurrency=10,
        max_submission_concurrency=5,
        max_request_queue_size=1024,
        max_submission_queue_size=1024,
        max_io_queue_size=1024,
        io_chunksize=262144,
        num_download_attempts=5,
        max_in_memory_upload_chunks=10,
        max_in_memory_download_chunks=10,
        max_bandwidth=None,
    ): ...
```

[Configuration Management](./configuration.md)

### Future-based Coordination

Asynchronous transfer execution using futures, coordinators, and metadata tracking for monitoring transfer progress and handling completion.

```python { .api }
class TransferFuture:
    def done(self) -> bool: ...
    def result(self): ...
    def cancel(self): ...
    @property
    def meta(self) -> TransferMeta: ...

class TransferMeta:
    @property
    def call_args(self): ...
    @property
    def transfer_id(self): ...
    @property
    def size(self): ...
```

[Future-based Coordination](./futures-coordination.md)
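
The usual pattern with futures is to submit every transfer first and collect results afterwards, so that the work overlaps. The sketch below illustrates that pattern with the standard library's `concurrent.futures` rather than s3transfer itself. `fake_transfer` and `run_all` are hypothetical stand-ins, and the sketch assumes `result()` blocks until completion and re-raises any failure, as `concurrent.futures.Future.result()` does.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_transfer(key, fail=False):
    """Hypothetical stand-in for a transfer; returns the key or raises."""
    if fail:
        raise OSError(f"transfer of {key} failed")
    return key


def run_all(jobs):
    """Submit every job first, then collect results so the work overlaps."""
    succeeded, failed = [], []
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {key: pool.submit(fake_transfer, key, fail) for key, fail in jobs}
        for key, future in futures.items():
            try:
                future.result()  # blocks until done; re-raises the failure
                succeeded.append(key)
            except OSError as exc:
                failed.append((key, str(exc)))
    return succeeded, failed


ok, bad = run_all([("a.txt", False), ("b.txt", True), ("c.txt", False)])
print(ok, bad)
```

Collecting failures per key, instead of stopping at the first exception, lets the remaining transfers finish before errors are reported.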

### File Utilities and Progress Tracking

File handling utilities including chunk readers, progress streams, and OS operations with callback support for monitoring transfer progress.

```python { .api }
class ReadFileChunk:
    def __init__(self, fileobj, start_byte, chunk_size, full_file_size, callback=None, enable_callback=True): ...
    @classmethod
    def from_filename(cls, filename, start_byte, chunk_size, callback=None, enable_callback=True): ...
    def read(self, amount=None): ...
    def seek(self, where): ...
    def enable_callback(self): ...
    def disable_callback(self): ...

class StreamReaderProgress:
    def __init__(self, stream, callback=None): ...
    def read(self, *args, **kwargs): ...
```

[File Utilities and Progress Tracking](./file-utilities.md)
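
To illustrate the idea behind a chunk reader with progress reporting, here is a minimal standard-library sketch. It is not the `ReadFileChunk` implementation: `read_window` and its `io_size` parameter are hypothetical, and the sketch only shows how a byte window of a file can be read in pieces while a callback is told how many bytes each read produced.

```python
import io


def read_window(fileobj, start_byte, chunk_size, callback=None, io_size=4):
    """Yield pieces of fileobj[start_byte:start_byte + chunk_size], reporting each."""
    fileobj.seek(start_byte)
    remaining = chunk_size
    while remaining > 0:
        piece = fileobj.read(min(io_size, remaining))
        if not piece:  # reached end of file before the window was full
            break
        remaining -= len(piece)
        if callback is not None:
            callback(len(piece))
        yield piece


reported = []
source = io.BytesIO(b"0123456789abcdef")
data = b"".join(
    read_window(source, start_byte=4, chunk_size=8, callback=reported.append)
)
print(data, reported)
```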

### Bandwidth Management

Bandwidth limiting based on a leaky bucket algorithm and consumption scheduling, used to cap transfer rates.

```python { .api }
class BandwidthLimiter:
    def __init__(self, leaky_bucket, time_utils=None): ...
    # Note: "bandwith" is the spelling used in the method name itself
    def get_bandwith_limited_stream(self, stream, transfer_coordinator): ...

class LeakyBucket:
    def __init__(self, max_rate, time_utils=None): ...
    def consume(self, amount, request_token): ...
```

[Bandwidth Management](./bandwidth-management.md)
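
The following is a simplified, single-threaded sketch of rate limiting by pacing, meant only to convey the idea of capping throughput at `max_rate` bytes per second. It is not the library's `LeakyBucket` algorithm. `SimpleRateLimiter` and `FakeClock` are hypothetical names, and the clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import time


class SimpleRateLimiter:
    """Paces a single consumer so throughput stays at or below max_rate bytes/s."""

    def __init__(self, max_rate, clock=time.monotonic, sleep=time.sleep):
        self._max_rate = float(max_rate)
        self._clock = clock
        self._sleep = sleep
        self._next_free = clock()  # earliest time the next request may start

    def consume(self, amount):
        """Block until `amount` bytes may be sent; return the delay applied."""
        now = self._clock()
        start = max(now, self._next_free)
        # Reserve the time this amount occupies at the configured rate.
        self._next_free = start + amount / self._max_rate
        delay = start - now
        if delay > 0:
            self._sleep(delay)
        return delay


class FakeClock:
    """Deterministic clock whose sleep() simply advances the current time."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


clock = FakeClock()
limiter = SimpleRateLimiter(max_rate=100, clock=clock.time, sleep=clock.sleep)
delays = [limiter.consume(50) for _ in range(3)]
print(delays)
```

With a limit of 100 bytes per second, the first 50-byte request passes immediately and each later one waits half a second.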

### Event Subscribers and Callbacks

Extensible subscriber system for handling transfer events including progress updates, completion notifications, and error handling. Each hook receives the transfer's future as the `future` keyword argument.

```python { .api }
class BaseSubscriber:
    def on_queued(self, future, **kwargs): ...
    def on_progress(self, future, bytes_transferred, **kwargs): ...
    def on_done(self, future, **kwargs): ...
```

[Event Subscribers and Callbacks](./subscribers-callbacks.md)
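
As a rough sketch of how a subscriber might be written, the class below uses the same hook names as `BaseSubscriber` but is a standalone stand-in, so it runs without s3transfer installed. It assumes hooks are invoked with keyword arguments, including a `future` keyword. `RecordingSubscriber` and `simulate_transfer` are hypothetical and exist only for illustration.

```python
class RecordingSubscriber:
    """Collects events using the same hook names as BaseSubscriber."""

    def __init__(self):
        self.events = []
        self.bytes_seen = 0

    def on_queued(self, future, **kwargs):
        self.events.append("queued")

    def on_progress(self, future, bytes_transferred, **kwargs):
        self.bytes_seen += bytes_transferred
        self.events.append("progress")

    def on_done(self, future, **kwargs):
        self.events.append("done")


def simulate_transfer(subscribers, chunks):
    """Hypothetical driver that fires the hooks in queued/progress/done order."""
    future = object()  # placeholder for a real transfer future
    for sub in subscribers:
        sub.on_queued(future=future)
    for size in chunks:
        for sub in subscribers:
            sub.on_progress(future=future, bytes_transferred=size)
    for sub in subscribers:
        sub.on_done(future=future)


sub = RecordingSubscriber()
simulate_transfer([sub], chunks=[512, 512, 256])
print(sub.events, sub.bytes_seen)
```

In real use, a subscriber would subclass `BaseSubscriber` and be passed in the `subscribers` list of a transfer call.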

### Exception Handling

Exception classes for handling transfer failures, retry exhaustion, and coordination errors.

```python { .api }
class RetriesExceededError(Exception):
    def __init__(self, last_exception): ...
    @property
    def last_exception(self): ...

class S3UploadFailedError(Exception): ...
class S3DownloadFailedError(Exception): ...
class TransferNotDoneError(Exception): ...
class FatalError(CancelledError): ...
```

[Exception Handling](./exception-handling.md)

### Process Pool Downloads

A multiprocessing-based downloader that improves throughput by running downloads in separate processes, which avoids contention on Python's GIL.

```python { .api }
class ProcessPoolDownloader:
    def __init__(self, client_kwargs=None, config=None): ...
    def download_file(self, bucket, key, filename, extra_args=None, expected_size=None): ...
    def shutdown(self): ...
    def __enter__(self): ...
    def __exit__(self, exc_type, exc_val, exc_tb): ...

class ProcessTransferConfig:
    def __init__(self, multipart_threshold=8388608, multipart_chunksize=8388608, max_request_processes=10): ...

class ProcessPoolTransferFuture:
    def done(self) -> bool: ...
    def result(self): ...
    def cancel(self): ...
    @property
    def meta(self): ...

class ProcessPoolTransferMeta:
    @property
    def call_args(self): ...
    @property
    def transfer_id(self): ...
```

[Process Pool Downloads](./process-pool-downloads.md)

### AWS Common Runtime (CRT) Support

A transfer manager implementation built on the AWS Common Runtime for improved throughput and efficiency. It provides a drop-in replacement for TransferManager's upload, download, and delete operations, with automatic throughput optimization.

```python { .api }
class CRTTransferManager:
    def __init__(self, crt_s3_client, crt_request_serializer, osutil=None): ...
    def upload(self, fileobj, bucket, key, extra_args=None, subscribers=None): ...
    def download(self, bucket, key, fileobj, extra_args=None, subscribers=None): ...
    def delete(self, bucket, key, extra_args=None, subscribers=None): ...
    def shutdown(self, cancel=False): ...
    def __enter__(self): ...
    def __exit__(self, exc_type, exc_val, exc_tb): ...

class CRTTransferFuture:
    def done(self) -> bool: ...
    def result(self, timeout=None): ...
    def cancel(self): ...
    @property
    def meta(self): ...

class BotocoreCRTRequestSerializer:
    def __init__(self, session, region_name, signature_version='s3v4'): ...
    def serialize_http_request(self, request_dict): ...

def create_s3_crt_client(region_name, num_threads=None, target_throughput=None, part_size=8388608, use_ssl=True, verify=None): ...
def acquire_crt_s3_process_lock(): ...
```

[AWS Common Runtime (CRT) Support](./crt-support.md)

### Callback Control Utilities

Module-level utility functions that turn upload progress callbacks on or off for S3 operations.

```python { .api }
def disable_upload_callbacks(request, operation_name, **kwargs):
    """
    Disable upload progress callbacks for S3 operations.

    Args:
        request: Boto3 request object
        operation_name (str): Name of the S3 operation
        **kwargs: Additional arguments
    """

def enable_upload_callbacks(request, operation_name, **kwargs):
    """
    Enable upload progress callbacks for S3 operations.

    Args:
        request: Boto3 request object
        operation_name (str): Name of the S3 operation
        **kwargs: Additional arguments
    """
```

## Types

### Core Types

```python { .api }
from typing import Any, Callable, Dict, List

# Callback function type for progress tracking
CallbackType = Callable[[int], None]

# Extra arguments dictionary for S3 operations
ExtraArgsType = Dict[str, Any]

# Subscriber list for event handling
SubscribersType = List[BaseSubscriber]

# Transfer source for copy operations
CopySourceType = Dict[str, str]  # {'Bucket': str, 'Key': str, 'VersionId': str}; 'VersionId' is optional
```

### Constants

```python { .api }
# Size constants
KB = 1024
MB = KB * KB
GB = MB * KB

# S3 limits
MAX_PARTS = 10000
MAX_SINGLE_UPLOAD_SIZE = 5 * GB
MIN_UPLOAD_CHUNKSIZE = 5 * MB

# Default configuration values
DEFAULT_MULTIPART_THRESHOLD = 8 * MB
DEFAULT_MULTIPART_CHUNKSIZE = 8 * MB
DEFAULT_MAX_CONCURRENCY = 10

# Allowed S3 operation arguments
ALLOWED_DOWNLOAD_ARGS: List[str]
ALLOWED_UPLOAD_ARGS: List[str]
ALLOWED_COPY_ARGS: List[str]
ALLOWED_DELETE_ARGS: List[str]
```
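
To show how these limits interact, here is a hedged sketch of part planning for a multipart upload. `plan_parts` is a hypothetical helper, not a library function. It assumes that files at or above the threshold use multipart, that the chunk size is raised to at least `MIN_UPLOAD_CHUNKSIZE`, and that the chunk size is doubled until the file fits within `MAX_PARTS` parts. The library's own adjustment logic may differ.

```python
KB = 1024
MB = KB * KB
MAX_PARTS = 10000
MIN_UPLOAD_CHUNKSIZE = 5 * MB


def plan_parts(file_size, threshold=8 * MB, chunksize=8 * MB):
    """Return (use_multipart, chunksize, part_count) for a file of file_size bytes."""
    if file_size < threshold:
        return False, file_size, 1  # small file: a single request suffices

    def part_count(size):
        # Integer ceiling division avoids floating-point rounding on large files.
        return (file_size + size - 1) // size

    chunksize = max(chunksize, MIN_UPLOAD_CHUNKSIZE)
    # Double the chunk size until the file fits within MAX_PARTS parts.
    while part_count(chunksize) > MAX_PARTS:
        chunksize *= 2
    return True, chunksize, part_count(chunksize)


print(plan_parts(1 * MB))
print(plan_parts(100 * MB))
print(plan_parts(100 * 1024 * MB))
```

Under these assumptions, a 100 MB file is split into 13 parts of 8 MB, while a 100 GB file needs 16 MB chunks to stay under the part limit.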