An Amazon S3 Transfer Manager that provides high-level abstractions for efficient uploads/downloads with multipart transfers, progress callbacks, and retry logic.
npx @tessl/cli install tessl/pypi-s3transfer@0.13.00
# S3Transfer

A Python library that provides high-level abstractions for efficient Amazon S3 uploads and downloads. S3Transfer handles multipart operations, parallel processing, bandwidth throttling, progress callbacks, and retry logic, making it the foundational transfer layer for boto3's S3 operations.

## Package Information

- **Package Name**: s3transfer
- **Language**: Python
- **Installation**: `pip install s3transfer`

## Core Imports

```python
import s3transfer
from s3transfer import S3Transfer, TransferConfig
```

Modern API (recommended):

```python
from s3transfer.manager import TransferManager, TransferConfig
```

## Basic Usage

### Legacy API

```python
import boto3
from s3transfer import S3Transfer, TransferConfig

# Create S3 client and transfer manager
client = boto3.client('s3', region_name='us-west-2')
transfer = S3Transfer(client)

# Upload a file
transfer.upload_file('/tmp/myfile.txt', 'my-bucket', 'myfile.txt')

# Download a file
transfer.download_file('my-bucket', 'myfile.txt', '/tmp/downloaded.txt')

# With configuration
config = TransferConfig(
    multipart_threshold=8 * 1024 * 1024,  # 8 MB
    max_concurrency=10,
    num_download_attempts=5
)
transfer = S3Transfer(client, config)
```

### Modern API

```python
import boto3
from s3transfer.manager import TransferManager, TransferConfig

# Create transfer manager
client = boto3.client('s3', region_name='us-west-2')
config = TransferConfig(
    multipart_threshold=8 * 1024 * 1024,  # 8 MB
    max_request_concurrency=10,
    max_bandwidth=100 * 1024 * 1024  # 100 MB/s
)
transfer_manager = TransferManager(client, config)

# Upload from an open file object; the call returns a future immediately
with open('/tmp/myfile.txt', 'rb') as f:
    future = transfer_manager.upload(f, 'my-bucket', 'myfile.txt')
    future.result()  # Wait for completion before the file is closed

# Download into an open file object
with open('/tmp/downloaded.txt', 'wb') as f:
    future = transfer_manager.download('my-bucket', 'myfile.txt', f)
    future.result()

# Always shut down when done
transfer_manager.shutdown()
```

## Architecture

S3Transfer is built around a two-tier API design:

- **High-level interfaces** (S3Transfer, TransferManager): Simple methods for common operations
- **Low-level components** (futures, coordinators, tasks): Fine-grained control and advanced features

Several mechanisms cut across both tiers:

- **Transfer coordination**: Future-based asynchronous execution with progress tracking
- **Bandwidth management**: Leaky bucket algorithm for transfer rate limiting
- **Error handling**: Comprehensive retry logic with exponential backoff
- **Multipart operations**: Automatic multipart uploads/downloads for large files

Compared to the legacy S3Transfer class, the modern TransferManager offers better resource management, more flexible configuration, and improved progress tracking.

## Capabilities

### Legacy Transfer Interface

The original S3Transfer class providing simple upload and download operations with basic configuration and progress callbacks.

```python { .api }
class S3Transfer:
    def __init__(self, client, config=None, osutil=None): ...
    def upload_file(self, filename, bucket, key, callback=None, extra_args=None): ...
    def download_file(self, bucket, key, filename, extra_args=None, callback=None): ...
```

[Legacy Transfer Interface](./legacy-transfer.md)

### Modern Transfer Manager

The recommended TransferManager class offering enhanced capabilities including upload/download/copy/delete operations, better resource management, and comprehensive configuration options.

```python { .api }
class TransferManager:
    def __init__(self, client, config=None, osutil=None, executor_cls=None): ...
    def upload(self, fileobj, bucket, key, extra_args=None, subscribers=None): ...
    def download(self, bucket, key, fileobj, extra_args=None, subscribers=None): ...
    def copy(self, copy_source, bucket, key, extra_args=None, subscribers=None, source_client=None): ...
    def delete(self, bucket, key, extra_args=None, subscribers=None): ...
    def shutdown(self, cancel=False, cancel_msg=''): ...
```

[Modern Transfer Manager](./transfer-manager.md)

### Configuration Management

Configuration classes for controlling transfer behavior, including thresholds, concurrency, retry settings, and bandwidth limits. The legacy and modern APIs each define their own `TransferConfig`.

```python { .api }
# Legacy version (s3transfer.TransferConfig)
class TransferConfig:
    def __init__(self, multipart_threshold=8388608, max_concurrency=10, multipart_chunksize=8388608, num_download_attempts=5, max_io_queue=100): ...

# Modern version (s3transfer.manager.TransferConfig)
class TransferConfig:
    def __init__(
        self,
        multipart_threshold=8388608,
        multipart_chunksize=8388608,
        max_request_concurrency=10,
        max_submission_concurrency=5,
        max_request_queue_size=1000,
        max_submission_queue_size=1000,
        max_io_queue_size=1000,
        io_chunksize=262144,
        num_download_attempts=5,
        max_in_memory_upload_chunks=10,
        max_in_memory_download_chunks=10,
        max_bandwidth=None,
    ): ...
```

[Configuration Management](./configuration.md)

### Future-based Coordination

Asynchronous transfer execution using futures, coordinators, and metadata tracking for monitoring transfer progress and handling completion.

```python { .api }
class TransferFuture:
    def done(self) -> bool: ...
    def result(self): ...
    def cancel(self): ...
    @property
    def meta(self) -> TransferMeta: ...

class TransferMeta:
    @property
    def call_args(self): ...
    @property
    def transfer_id(self): ...
    @property
    def size(self): ...
```

[Future-based Coordination](./futures-coordination.md)

### File Utilities and Progress Tracking

File handling utilities including chunk readers, progress streams, and OS operations with callback support for monitoring transfer progress.

```python { .api }
class ReadFileChunk:
    def __init__(self, fileobj, start_byte, chunk_size, full_file_size, callback=None, enable_callback=True): ...
    @classmethod
    def from_filename(cls, filename, start_byte, chunk_size, callback=None, enable_callback=True): ...
    def read(self, amount=None): ...
    def seek(self, where): ...
    def enable_callback(self): ...
    def disable_callback(self): ...

class StreamReaderProgress:
    def __init__(self, stream, callback=None): ...
    def read(self, *args, **kwargs): ...
```

[File Utilities and Progress Tracking](./file-utilities.md)

### Bandwidth Management

Comprehensive bandwidth limiting using leaky bucket algorithms and consumption scheduling for controlling transfer rates.

```python { .api }
class BandwidthLimiter:
    def __init__(self, leaky_bucket, time_utils=None): ...
    def get_bandwith_limited_stream(self, stream, transfer_coordinator): ...

class LeakyBucket:
    def __init__(self, max_rate, time_utils=None): ...
    def consume(self, amount, request_token): ...
```

[Bandwidth Management](./bandwidth-management.md)

### Event Subscribers and Callbacks

Extensible subscriber system for handling transfer events including progress updates, completion notifications, and error handling. Each hook receives the `TransferFuture` it relates to.

```python { .api }
class BaseSubscriber:
    def on_queued(self, future, **kwargs): ...
    def on_progress(self, future, bytes_transferred, **kwargs): ...
    def on_done(self, future, **kwargs): ...
```

[Event Subscribers and Callbacks](./subscribers-callbacks.md)

### Exception Handling

Comprehensive exception classes for handling transfer failures, retry exhaustion, and coordination errors.

```python { .api }
class RetriesExceededError(Exception):
    def __init__(self, last_exception): ...
    @property
    def last_exception(self): ...

class S3UploadFailedError(Exception): ...
class S3DownloadFailedError(Exception): ...
class TransferNotDoneError(Exception): ...
class FatalError(CancelledError): ...
```

[Exception Handling](./exception-handling.md)

### Process Pool Downloads

High-performance multiprocessing-based downloader for improved throughput by bypassing Python's GIL limitations.

```python { .api }
class ProcessPoolDownloader:
    def __init__(self, client_kwargs=None, config=None): ...
    def download_file(self, bucket, key, filename, extra_args=None, expected_size=None): ...
    def shutdown(self): ...
    def __enter__(self): ...
    def __exit__(self, exc_type, exc_val, exc_tb): ...

class ProcessTransferConfig:
    def __init__(self, multipart_threshold=8388608, multipart_chunksize=8388608, max_request_processes=10): ...

class ProcessPoolTransferFuture:
    def done(self) -> bool: ...
    def result(self): ...
    def cancel(self): ...
    @property
    def meta(self): ...

class ProcessPoolTransferMeta:
    @property
    def call_args(self): ...
    @property
    def transfer_id(self): ...
```

[Process Pool Downloads](./process-pool-downloads.md)

### AWS Common Runtime (CRT) Support

High-performance transfer manager implementation using the AWS Common Runtime for improved throughput and efficiency. Provides a drop-in replacement for TransferManager with automatic throughput optimization.

```python { .api }
class CRTTransferManager:
    def __init__(self, crt_s3_client, crt_request_serializer, osutil=None): ...
    def upload(self, fileobj, bucket, key, extra_args=None, subscribers=None): ...
    def download(self, bucket, key, fileobj, extra_args=None, subscribers=None): ...
    def delete(self, bucket, key, extra_args=None, subscribers=None): ...
    def shutdown(self, cancel=False): ...
    def __enter__(self): ...
    def __exit__(self, exc_type, exc_val, exc_tb): ...

class CRTTransferFuture:
    def done(self) -> bool: ...
    def result(self, timeout=None): ...
    def cancel(self): ...
    @property
    def meta(self): ...

class BotocoreCRTRequestSerializer:
    def __init__(self, session, region_name, signature_version='s3v4'): ...
    def serialize_http_request(self, request_dict): ...

def create_s3_crt_client(region_name, num_threads=None, target_throughput=None, part_size=8388608, use_ssl=True, verify=None): ...
def acquire_crt_s3_process_lock(): ...
```

[AWS Common Runtime (CRT) Support](./crt-support.md)

### Callback Control Utilities

Global utility functions for controlling upload callback behavior in S3 operations.

```python { .api }
def disable_upload_callbacks(request, operation_name, **kwargs):
    """
    Disable upload progress callbacks for S3 operations.

    Args:
        request: Boto3 request object
        operation_name (str): Name of the S3 operation
        **kwargs: Additional arguments
    """

def enable_upload_callbacks(request, operation_name, **kwargs):
    """
    Enable upload progress callbacks for S3 operations.

    Args:
        request: Boto3 request object
        operation_name (str): Name of the S3 operation
        **kwargs: Additional arguments
    """
```

## Types

### Core Types

```python { .api }
from typing import Any, Callable, Dict, List

from s3transfer.subscribers import BaseSubscriber

# Callback function type for progress tracking
CallbackType = Callable[[int], None]

# Extra arguments dictionary for S3 operations
ExtraArgsType = Dict[str, Any]

# Subscriber list for event handling
SubscribersType = List[BaseSubscriber]

# Transfer source for copy operations
CopySourceType = Dict[str, str]  # {'Bucket': str, 'Key': str, 'VersionId': str}; 'VersionId' is optional
```

### Constants

```python { .api }
# Size constants
KB = 1024
MB = KB * KB
GB = MB * KB

# S3 limits
MAX_PARTS = 10000
MAX_SINGLE_UPLOAD_SIZE = 5 * GB
MIN_UPLOAD_CHUNKSIZE = 5 * MB

# Default configuration values
DEFAULT_MULTIPART_THRESHOLD = 8 * MB
DEFAULT_MULTIPART_CHUNKSIZE = 8 * MB
DEFAULT_MAX_CONCURRENCY = 10

# Allowed S3 operation arguments
ALLOWED_DOWNLOAD_ARGS: List[str]
ALLOWED_UPLOAD_ARGS: List[str]
ALLOWED_COPY_ARGS: List[str]
ALLOWED_DELETE_ARGS: List[str]
```
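
The S3 limits above constrain which chunk sizes are usable for a given object: each part must fall between `MIN_UPLOAD_CHUNKSIZE` and `MAX_SINGLE_UPLOAD_SIZE`, and the part count may not exceed `MAX_PARTS`. A minimal sketch of reconciling a requested chunk size with those limits follows; `adjust_chunksize` is a hypothetical helper, and doubling the chunk size is just one simple strategy chosen for this example.

```python
KB = 1024
MB = KB * KB
GB = MB * KB

MAX_PARTS = 10000
MAX_SINGLE_UPLOAD_SIZE = 5 * GB
MIN_UPLOAD_CHUNKSIZE = 5 * MB


def adjust_chunksize(chunksize, file_size):
    """Return a chunk size that respects the per-part and part-count limits."""
    # Clamp the requested size into the allowed per-part range.
    chunksize = min(max(chunksize, MIN_UPLOAD_CHUNKSIZE), MAX_SINGLE_UPLOAD_SIZE)
    # Double the chunk size until the part count fits under MAX_PARTS.
    while (file_size + chunksize - 1) // chunksize > MAX_PARTS:
        if chunksize == MAX_SINGLE_UPLOAD_SIZE:
            raise ValueError("file is too large to fit in MAX_PARTS parts")
        chunksize = min(chunksize * 2, MAX_SINGLE_UPLOAD_SIZE)
    return chunksize
```

For a 100 GB object, the default 8 MB chunk size would need 12,800 parts, so the sketch doubles it once to 16 MB, giving 6,400 parts.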