tessl/pypi-s3transfer

An Amazon S3 Transfer Manager that provides high-level abstractions for efficient uploads/downloads with multipart transfers, progress callbacks, and retry logic.

Configuration Management

Comprehensive configuration classes for controlling S3 transfer behavior including multipart thresholds, concurrency limits, retry settings, bandwidth limits, and memory management.

Capabilities

Modern TransferConfig

The enhanced configuration class for TransferManager with comprehensive options for fine-tuning transfer behavior.

class TransferConfig:
    """
    Configuration for TransferManager operations with comprehensive options.
    
    Args:
        multipart_threshold (int): Size threshold for multipart transfers (default: 8MB)
        multipart_chunksize (int): Size of multipart chunks (default: 8MB)
        max_request_concurrency (int): Max concurrent S3 API requests (default: 10)
        max_submission_concurrency (int): Max concurrent task submissions (default: 5)
        max_request_queue_size (int): Max request queue size (default: 1000)
        max_submission_queue_size (int): Max submission queue size (default: 1000)
        max_io_queue_size (int): Max IO operation queue size (default: 1000)
        io_chunksize (int): IO chunk size for reading/writing (default: 256KB)
        num_download_attempts (int): Number of download retry attempts (default: 5)
        max_in_memory_upload_chunks (int): Max upload chunks held in memory (default: 10)
        max_in_memory_download_chunks (int): Max download chunks held in memory (default: 10)
        max_bandwidth (int, optional): Maximum bandwidth in bytes/second (default: None)
    """
    def __init__(
        self,
        multipart_threshold=8 * 1024 * 1024,
        multipart_chunksize=8 * 1024 * 1024,
        max_request_concurrency=10,
        max_submission_concurrency=5,
        max_request_queue_size=1000,
        max_submission_queue_size=1000,
        max_io_queue_size=1000,
        io_chunksize=256 * 1024,
        num_download_attempts=5,
        max_in_memory_upload_chunks=10,
        max_in_memory_download_chunks=10,
        max_bandwidth=None
    ): ...
    
    multipart_threshold: int
    multipart_chunksize: int
    max_request_concurrency: int
    max_submission_concurrency: int
    max_request_queue_size: int
    max_submission_queue_size: int
    max_io_queue_size: int
    io_chunksize: int
    num_download_attempts: int
    max_in_memory_upload_chunks: int
    max_in_memory_download_chunks: int
    max_bandwidth: Optional[int]
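
As a minimal sketch of how this class is typically used, the configuration can be passed to a TransferManager through its `config` argument. The function name `upload_with_config` and the tuning values are illustrative assumptions; `client` is assumed to be a botocore S3 client (for example one created with `boto3.client("s3")`).

```python
def upload_with_config(client, filename, bucket, key):
    """Upload one file with a tuned TransferConfig and wait for it to finish.

    `client` is assumed to be a botocore S3 client; `filename`, `bucket`
    and `key` are placeholders supplied by the caller.
    """
    # Imported lazily so this sketch can be defined without s3transfer installed.
    from s3transfer.manager import TransferConfig, TransferManager

    config = TransferConfig(
        multipart_threshold=16 * 1024 * 1024,  # illustrative values only
        multipart_chunksize=16 * 1024 * 1024,
        max_request_concurrency=20,
    )
    # The manager is used as a context manager so its executors are shut down.
    with TransferManager(client, config=config) as manager:
        future = manager.upload(filename, bucket, key)
        # result() blocks until the transfer completes and re-raises any error.
        return future.result()
```

Any parameter left out of the constructor call keeps its default.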

Legacy TransferConfig

The original configuration class for S3Transfer with basic options for multipart operations and concurrency.

class TransferConfig:
    """
    Legacy configuration for S3Transfer operations.
    
    Args:
        multipart_threshold (int): Size threshold for multipart uploads (default: 8MB)
        max_concurrency (int): Maximum number of concurrent transfers (default: 10)
        multipart_chunksize (int): Size of multipart chunks (default: 8MB)
        num_download_attempts (int): Number of download retry attempts (default: 5)
        max_io_queue (int): Maximum size of IO queue (default: 100)
    """
    def __init__(
        self,
        multipart_threshold=8 * 1024 * 1024,
        max_concurrency=10,
        multipart_chunksize=8 * 1024 * 1024,
        num_download_attempts=5,
        max_io_queue=100
    ): ...
    
    multipart_threshold: int
    max_concurrency: int
    multipart_chunksize: int
    num_download_attempts: int
    max_io_queue: int

Configuration Examples

High-Performance Configuration

Optimized for large files and high-bandwidth connections:

from s3transfer.manager import TransferConfig

# High-performance configuration for large files
high_perf_config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,      # 64MB - larger threshold
    multipart_chunksize=64 * 1024 * 1024,      # 64MB chunks
    max_request_concurrency=50,                 # High concurrency
    max_submission_concurrency=20,              # More task submissions
    max_request_queue_size=2048,                # Larger request queue
    max_submission_queue_size=2048,             # Larger submission queue
    max_io_queue_size=2048,                     # Larger IO queue
    io_chunksize=1024 * 1024,                   # 1MB IO chunks
    num_download_attempts=3,                    # Fewer retries (good connection)
    max_in_memory_upload_chunks=20,             # More memory usage
    max_in_memory_download_chunks=20,           # More memory usage
    max_bandwidth=1000 * 1024 * 1024            # 1GB/s limit
)

Memory-Constrained Configuration

Optimized for environments with limited memory:

# Low-memory configuration
low_memory_config = TransferConfig(
    multipart_threshold=16 * 1024 * 1024,       # 16MB threshold
    multipart_chunksize=8 * 1024 * 1024,        # 8MB chunks (smaller)
    max_request_concurrency=5,                   # Lower concurrency
    max_submission_concurrency=2,                # Fewer submissions
    max_request_queue_size=256,                  # Smaller queues
    max_submission_queue_size=256,
    max_io_queue_size=256,
    io_chunksize=64 * 1024,                      # 64KB IO chunks
    num_download_attempts=10,                    # More retries
    max_in_memory_upload_chunks=2,               # Minimal memory usage
    max_in_memory_download_chunks=2,
    max_bandwidth=10 * 1024 * 1024               # 10MB/s limit
)

Bandwidth-Limited Configuration

Optimized for slow or metered connections:

# Bandwidth-limited configuration
bandwidth_limited_config = TransferConfig(
    multipart_threshold=32 * 1024 * 1024,       # 32MB threshold
    multipart_chunksize=16 * 1024 * 1024,       # 16MB chunks
    max_request_concurrency=3,                   # Low concurrency
    max_submission_concurrency=1,                # Sequential submissions
    max_request_queue_size=128,                  # Small queues
    max_submission_queue_size=64,
    max_io_queue_size=128,
    io_chunksize=128 * 1024,                     # 128KB IO chunks
    num_download_attempts=15,                    # Many retries
    max_in_memory_upload_chunks=3,               # Conservative memory
    max_in_memory_download_chunks=3,
    max_bandwidth=1 * 1024 * 1024                # 1MB/s strict limit
)

Reliability-Focused Configuration

Tuned for unreliable networks that still offer good bandwidth:

# Reliability-focused configuration for unstable networks
reliable_config = TransferConfig(
    multipart_threshold=32 * 1024 * 1024,       # 32MB threshold
    multipart_chunksize=16 * 1024 * 1024,       # 16MB chunks (smaller for retries)
    max_request_concurrency=15,                  # Good concurrency
    max_submission_concurrency=8,                # Balanced submissions
    max_request_queue_size=1024,                 # Standard queues
    max_submission_queue_size=512,
    max_io_queue_size=1024,
    io_chunksize=512 * 1024,                     # 512KB IO chunks
    num_download_attempts=20,                    # Many retries for reliability
    max_in_memory_upload_chunks=8,               # Balanced memory usage
    max_in_memory_download_chunks=8,
    max_bandwidth=None                           # No bandwidth limit
)

Configuration Parameters Explained

Multipart Settings

# Multipart threshold - when to switch to multipart uploads/downloads
multipart_threshold=8 * 1024 * 1024  # Files >= 8MB use multipart

# Chunk size for multipart operations
multipart_chunksize=8 * 1024 * 1024  # Each part is 8MB

Guidelines:

  • Larger thresholds reduce overhead for medium files
  • Smaller chunks enable better progress tracking and retry granularity
  • S3 minimum chunk size is 5MB (except last part)
  • S3 maximum parts per upload is 10,000
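
The limits above can be checked with simple arithmetic before a transfer starts. The following is a rough sketch, not s3transfer's own logic: `plan_multipart` is a hypothetical helper that applies the 5MB minimum part size and the 10,000-part maximum to a given file size.

```python
import math

MB = 1024 * 1024
MIN_PART_SIZE = 5 * MB   # S3 minimum part size (except the last part)
MAX_PARTS = 10_000       # S3 maximum parts per multipart upload


def plan_multipart(file_size, multipart_threshold=8 * MB, multipart_chunksize=8 * MB):
    """Return (uses_multipart, chunksize, num_parts) for a file of file_size bytes."""
    if file_size < multipart_threshold:
        # Below the threshold the object is sent in a single request.
        return False, multipart_chunksize, 1
    # Raise the chunk size to the S3 minimum if it is too small.
    chunksize = max(multipart_chunksize, MIN_PART_SIZE)
    # Grow the chunk size if the part count would exceed the S3 maximum.
    chunksize = max(chunksize, math.ceil(file_size / MAX_PARTS))
    num_parts = math.ceil(file_size / chunksize)
    return True, chunksize, num_parts


print(plan_multipart(100 * MB))  # a 100MB file with 8MB chunks needs 13 parts
```

For very large objects the second adjustment dominates: a 1TB file cannot use 8MB parts, because that would need far more than 10,000 of them.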

Concurrency Settings

# Maximum concurrent S3 API requests (upload_part, get_object, etc.)
max_request_concurrency=10

# Maximum concurrent task submissions to executor
max_submission_concurrency=5

Guidelines:

  • Higher request concurrency improves throughput but uses more resources
  • Balance with S3 request rate limits and local resource constraints
  • Submission concurrency should be lower than request concurrency

Queue Settings

# Maximum pending requests in various queues
max_request_queue_size=1024      # S3 API requests
max_submission_queue_size=1024   # Task submissions
max_io_queue_size=1024          # IO operations

Guidelines:

  • Larger queues provide more buffering but use more memory
  • Size based on expected peak load and available memory
  • IO queue affects download performance most significantly

Memory Management

# Maximum chunks held in memory simultaneously
max_in_memory_upload_chunks=10   # Upload chunks buffered
max_in_memory_download_chunks=10 # Download chunks buffered

# IO chunk size for reading/writing operations
io_chunksize=256 * 1024         # 256KB per IO operation

Guidelines:

  • More in-memory chunks improve performance but increase memory usage
  • Buffered chunk memory ≈ max_in_memory_*_chunks × multipart_chunksize; the limits apply across all ongoing transfers, not per request
  • Balance based on available RAM and performance needs
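
To make the memory trade-off concrete, here is a rough estimator. It is a sketch under stated assumptions rather than a description of s3transfer internals: it treats each in-memory chunk limit as global across transfers, assumes each IO queue entry holds at most `io_chunksize` bytes, and ignores chunks held by in-flight requests. The helper name `estimate_buffer_bytes` is hypothetical.

```python
KB = 1024
MB = 1024 * KB


def estimate_buffer_bytes(
    multipart_chunksize,
    max_in_memory_upload_chunks,
    max_in_memory_download_chunks,
    max_io_queue_size,
    io_chunksize,
):
    """Return rough upper bounds, in bytes, for the buffers these settings control."""
    return {
        # Upload chunks buffered while waiting to be sent.
        "upload_chunks": max_in_memory_upload_chunks * multipart_chunksize,
        # Download chunks buffered outside the IO queue.
        "download_chunks": max_in_memory_download_chunks * multipart_chunksize,
        # Pending writes sitting in the IO queue.
        "io_queue": max_io_queue_size * io_chunksize,
    }


# Values taken from the memory-constrained configuration in this document.
low_memory = estimate_buffer_bytes(
    multipart_chunksize=8 * MB,
    max_in_memory_upload_chunks=2,
    max_in_memory_download_chunks=2,
    max_io_queue_size=256,
    io_chunksize=64 * KB,
)
for name, size in low_memory.items():
    print(f"{name}: {size // MB} MB")
```

Under these assumptions, each of the three buffers in the memory-constrained configuration is bounded at about 16MB.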

Retry and Reliability

# Number of download retry attempts
num_download_attempts=5

Guidelines:

  • More attempts improve reliability but increase latency on failures
  • Consider network reliability and timeout settings
  • Upload retries, as well as throttling and 5xx retries for any request, are handled by botocore's own retry logic

Bandwidth Management

# Maximum bandwidth in bytes per second
max_bandwidth=100 * 1024 * 1024  # 100MB/s limit

Guidelines:

  • Use for rate limiting in shared environments
  • Set to None for no bandwidth limits
  • Applies across all concurrent transfers
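
Because the cap is expressed in bytes per second, a lower bound on transfer time follows directly from the total payload size. The helper below is a hypothetical illustration of that arithmetic; it does not measure real throughput.

```python
MB = 1024 * 1024


def min_transfer_seconds(total_bytes, max_bandwidth):
    """Return the shortest possible transfer time in seconds under a bandwidth cap.

    Returns None when max_bandwidth is None, since no cap is configured.
    """
    if max_bandwidth is None:
        return None
    if max_bandwidth <= 0:
        raise ValueError("max_bandwidth must be None or greater than 0")
    return total_bytes / max_bandwidth


# 1GB of data under the 1MB/s cap from the bandwidth-limited configuration.
print(min_transfer_seconds(1024 * MB, 1 * MB))
```

Since the cap is shared by all concurrent transfers, `total_bytes` should be the combined size of everything in flight, not the size of a single file.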

Configuration Validation

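s3transfer validates configuration values when a TransferConfig is constructed: every parameter that is set (not None) must be greater than 0, otherwise a ValueError is raised. The constructor checks only for positive values; it does not reject a multipart_chunksize below the 5MB S3 minimum.

# Each of these calls raises ValueError on its own
TransferConfig(max_request_concurrency=0)   # Must be > 0
TransferConfig(num_download_attempts=0)     # Must be > 0
TransferConfig(max_bandwidth=-1)            # Must be None or > 0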
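
For settings loaded from user input or a config file, a caller-side pre-flight check can report problems before any transfer object is built. The sketch below is a hypothetical helper, not the library's validation: it assumes a "must be greater than 0 unless None" rule and adds the 5MB S3 part-size minimum as an application-level policy.

```python
MB = 1024 * 1024
MIN_PART_SIZE = 5 * MB  # S3 minimum part size, enforced here as a local policy


def check_transfer_settings(**settings):
    """Validate keyword settings and return them unchanged if they pass."""
    for name, value in settings.items():
        if value is None:
            continue  # e.g. max_bandwidth=None means "no limit"
        if value <= 0:
            raise ValueError(f"{name} must be greater than 0, got {value!r}")
    chunksize = settings.get("multipart_chunksize")
    if chunksize is not None and chunksize < MIN_PART_SIZE:
        raise ValueError(
            f"multipart_chunksize must be at least {MIN_PART_SIZE} bytes, got {chunksize}"
        )
    return settings


checked = check_transfer_settings(
    multipart_chunksize=8 * MB,
    max_request_concurrency=10,
    max_bandwidth=None,
)
```

The returned dictionary can then be unpacked into the configuration constructor, for example `TransferConfig(**checked)`.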

Performance Tuning Guidelines

For Small Files (< 100MB)

config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,  # Higher threshold
    max_request_concurrency=20,              # Higher concurrency
    max_in_memory_upload_chunks=5,           # Lower memory usage
    max_in_memory_download_chunks=5
)

For Large Files (> 1GB)

config = TransferConfig(
    multipart_threshold=16 * 1024 * 1024,   # Lower threshold
    multipart_chunksize=64 * 1024 * 1024,   # Larger chunks
    max_request_concurrency=30,              # High concurrency
    max_in_memory_upload_chunks=15,          # More buffering
    max_in_memory_download_chunks=15
)

For Many Small Files

config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,   # Avoid multipart
    max_request_concurrency=50,              # Very high concurrency
    max_submission_concurrency=25,           # High submissions
    max_request_queue_size=4096,             # Large queues
    max_in_memory_upload_chunks=3,           # Low memory per transfer
    max_in_memory_download_chunks=3
)

Legacy vs Modern Configuration

Migration Mapping

| Legacy Parameter | Modern Equivalent | Notes |
| --- | --- | --- |
| max_concurrency | max_request_concurrency | Similar purpose |
| max_io_queue | max_io_queue_size | Renamed |
| N/A | max_submission_concurrency | New parameter |
| N/A | max_request_queue_size | New parameter |
| N/A | max_submission_queue_size | New parameter |
| N/A | io_chunksize | New parameter |
| N/A | max_in_memory_*_chunks | New parameters |
| N/A | max_bandwidth | New parameter |
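
The renames in the mapping can be applied mechanically. This sketch uses a hypothetical helper, `translate_legacy_kwargs`, that converts a dictionary of legacy keyword arguments into the modern names; parameters that exist only in the modern class are simply left at their defaults.

```python
# Legacy parameter names that were renamed in the modern class.
LEGACY_TO_MODERN = {
    "max_concurrency": "max_request_concurrency",
    "max_io_queue": "max_io_queue_size",
}
# Parameters that keep the same name in both classes.
SHARED = {"multipart_threshold", "multipart_chunksize", "num_download_attempts"}


def translate_legacy_kwargs(legacy_kwargs):
    """Return a new dict of keyword arguments using the modern parameter names."""
    modern = {}
    for name, value in legacy_kwargs.items():
        if name in LEGACY_TO_MODERN:
            modern[LEGACY_TO_MODERN[name]] = value
        elif name in SHARED:
            modern[name] = value
        else:
            raise KeyError(f"unknown legacy parameter: {name}")
    return modern


modern_kwargs = translate_legacy_kwargs(
    {"multipart_threshold": 16 * 1024 * 1024, "max_concurrency": 15, "max_io_queue": 200}
)
print(modern_kwargs)
```

The result is intended to be unpacked into the modern constructor, for example `TransferConfig(**modern_kwargs)` with `TransferConfig` imported from `s3transfer.manager`.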

Migration Example

# Legacy configuration
from s3transfer import TransferConfig as LegacyConfig
legacy_config = LegacyConfig(
    multipart_threshold=16 * 1024 * 1024,
    max_concurrency=15,
    multipart_chunksize=16 * 1024 * 1024,
    num_download_attempts=10,
    max_io_queue=200
)

# Equivalent modern configuration
from s3transfer.manager import TransferConfig
modern_config = TransferConfig(
    multipart_threshold=16 * 1024 * 1024,
    max_request_concurrency=15,           # Was max_concurrency
    multipart_chunksize=16 * 1024 * 1024,
    num_download_attempts=10,
    max_io_queue_size=200,                # Was max_io_queue
    max_submission_concurrency=8,         # New - set appropriately
    max_bandwidth=None                    # New - no limit
)
