tessl/pypi-azure-storage-blob

Microsoft Azure Blob Storage Client Library for Python providing comprehensive APIs for blob storage operations.


Blob Types and Storage Tiers

Name: tessl/pypi-azure-storage-blob
Author: tessl

Azure Blob Storage supports three blob types (block, page, and append), each suited to a different data access pattern, along with access tiers that trade storage cost against access cost and latency.

Capabilities

Blob Types

Azure Blob Storage provides three distinct blob types, each optimized for specific data access patterns and scenarios.

class BlobType:
    """Blob type enumeration."""
    BLOCKBLOB: str    # "BlockBlob": optimized for streaming and storing cloud objects
    PAGEBLOB: str     # "PageBlob": optimized for random read/write operations
    APPENDBLOB: str   # "AppendBlob": optimized for append operations
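
As a minimal sketch of how the blob type is chosen at upload time, the helper below passes the type to `BlobClient.upload_blob` through its `blob_type` argument, using the string values of the enumeration so the snippet needs no SDK import. The name `upload_with_type` is hypothetical, and `blob_client` is assumed to be an already constructed `BlobClient`.

```python
# Sketch only: `upload_with_type` is a hypothetical helper, not part of the SDK.
VALID_BLOB_TYPES = ("BlockBlob", "PageBlob", "AppendBlob")


def upload_with_type(blob_client, data: bytes, blob_type: str = "BlockBlob"):
    """Upload `data` through an existing BlobClient, choosing the blob type explicitly."""
    if blob_type not in VALID_BLOB_TYPES:
        raise ValueError(f"unknown blob type: {blob_type!r}")
    if blob_type == "PageBlob" and len(data) % 512 != 0:
        # Page blobs are written in 512-byte pages, so reject unaligned payloads early.
        raise ValueError("page blob data must be a multiple of 512 bytes")
    # upload_blob accepts blob_type as a BlobType member or its string value.
    return blob_client.upload_blob(data, blob_type=blob_type, overwrite=True)
```

Passing `BlobType.APPENDBLOB` instead of the string `"AppendBlob"` should behave the same way; the strings are used here only to keep the sketch free of imports.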

Block Blobs

Block blobs are optimized for streaming and storing cloud objects. They are ideal for documents, media files, backups, and general-purpose data storage.

Characteristics:

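Up to about 190.7 TiB in size with current service versions (50,000 blocks of up to 4,000 MiB each); older service versions capped blocks at 100 MiB, for about 4.75 TiB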
Composed of blocks that can be managed individually
Support concurrent uploads for large files
Ideal for streaming scenarios and general file storage
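
The size limit follows from simple arithmetic: a block blob holds at most 50,000 committed blocks, so its maximum size is the block count times the largest allowed block. The sketch below works this out for the two block-size limits Azure has published (100 MiB for older service versions, 4,000 MiB for newer ones); the function names are illustrative only.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4
MAX_BLOCKS = 50_000  # maximum committed blocks in one block blob


def max_block_blob_bytes(max_block_mib: int) -> int:
    """Largest block blob possible for a given per-block size limit."""
    return MAX_BLOCKS * max_block_mib * MIB


def blocks_needed(size_bytes: int, block_bytes: int) -> int:
    """Number of blocks required to hold `size_bytes` (ceiling division)."""
    return -(-size_bytes // block_bytes)


legacy = max_block_blob_bytes(100)     # older service versions
current = max_block_blob_bytes(4000)   # newer service versions
print(f"{legacy / TIB:.2f} TiB")       # 4.77 TiB
print(f"{current / TIB:.2f} TiB")      # 190.73 TiB
```

`blocks_needed` is useful when picking a block size for a large upload: the chosen size must keep the block count at or below 50,000.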

Use Cases:

Documents, images, videos, and media files
Application data and backups
Web content and static assets
Log files that don't require append-only semantics

Page Blobs

Page blobs are optimized for random read and write operations. They serve as the backing storage for Azure Virtual Machine disks and support sparse data scenarios.

Characteristics:

Up to 8 TiB in size
Optimized for random access patterns
512-byte page alignment required
Support for sparse data with efficient storage
Built-in sequence numbering for concurrency control
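
The 512-byte alignment rule means that both the offset and the length of every page write must be multiples of 512. A small sketch of helpers that enforce this before calling the service; `pad_to_page` and `is_page_aligned` are hypothetical names, and zero-padding is only one possible policy for short payloads.

```python
PAGE_SIZE = 512  # page blobs are addressed in 512-byte pages


def pad_to_page(data: bytes) -> bytes:
    """Zero-pad `data` so its length is a multiple of the page size."""
    remainder = len(data) % PAGE_SIZE
    if remainder == 0:
        return data
    return data + b"\x00" * (PAGE_SIZE - remainder)


def is_page_aligned(offset: int, length: int) -> bool:
    """True when a write at `offset` of `length` bytes respects page alignment."""
    return length > 0 and offset % PAGE_SIZE == 0 and length % PAGE_SIZE == 0
```

A padded buffer could then be written with `BlobClient.upload_page(page, offset, length)`, for example `blob_client.upload_page(buf, offset=1024, length=len(buf))`, assuming the page blob was already created with a large enough size.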

Use Cases:

Virtual machine disk images (VHD)
Database files requiring random access
Sparse data files
Custom applications requiring random read/write patterns

Append Blobs

Append blobs are optimized for append operations, making them ideal for logging scenarios where data is continuously added.

Characteristics:

Up to about 195 GiB in size (50,000 blocks of up to 4 MiB each)
Append-only operations (no updates to existing data)
Block-based structure optimized for sequential writes
Built-in support for concurrent append operations
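
The append-only behaviour can be sketched with three `BlobClient` methods: `exists`, `create_append_blob`, and `append_block`. The helper name `append_lines` is hypothetical, and `blob_client` is assumed to be an existing `BlobClient` for the log blob; batching several lines into one block is a design choice made here to reduce the number of requests.

```python
def append_lines(blob_client, lines, create_if_missing: bool = True) -> int:
    """Append text lines to an append blob as a single block; returns bytes written."""
    if create_if_missing and not blob_client.exists():
        # An append blob must exist before blocks can be appended to it.
        blob_client.create_append_blob()
    payload = "".join(f"{line}\n" for line in lines).encode("utf-8")
    if payload:
        # Each call adds one block to the end of the blob; existing data is never rewritten.
        blob_client.append_block(payload)
    return len(payload)
```

Because each appended block counts toward the 50,000-block limit, grouping lines per call also helps a long-lived log blob last longer.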

Use Cases:

Application logs and audit trails
Streaming data ingestion
Time-series data
Any scenario requiring append-only semantics

Access Tiers

Access tiers provide cost optimization by aligning storage costs with data access patterns. Different tiers offer trade-offs between storage cost and access cost.

Standard Storage Tiers

Standard storage accounts support multiple access tiers for block blobs.

class StandardBlobTier:
    """Standard storage access tiers."""
    HOT: str      # "Hot": frequently accessed data
    COOL: str     # "Cool": infrequently accessed data (kept at least 30 days)
    COLD: str     # "Cold": rarely accessed data (kept at least 90 days)
    ARCHIVE: str  # "Archive": offline, long-term archived data (kept at least 180 days)

Hot Tier:

Highest storage cost, lowest access cost
Optimized for data accessed frequently
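Default access tier for new storage accounts, so new blobs are Hot unless a tier is set explicitly or the account default is changed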
Immediate access with no rehydration required

Cool Tier:

Lower storage cost than Hot, higher access cost
Optimized for data stored for at least 30 days
Slightly higher latency than Hot tier
Immediate access with no rehydration required

Cold Tier:

Lower storage cost than Cool, higher access cost
Optimized for data stored for at least 90 days
Higher latency than Cool tier
Immediate access with no rehydration required

Archive Tier:

Lowest storage cost, highest access cost
Optimized for data stored for at least 180 days
Requires rehydration to an online tier before data can be read (can take hours to complete)
Blob data is offline while archived; only properties and metadata remain readable
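
The day counts above act as minimum retention periods: moving or deleting a blob sooner may incur an early-deletion charge. The lookup below is a sketch that encodes those figures; the names are hypothetical, and the charge behaviour should be checked against current Azure pricing rules for your account type.

```python
# Minimum days a blob is expected to stay in each tier (figures from the list above).
MIN_RETENTION_DAYS = {"Hot": 0, "Cool": 30, "Cold": 90, "Archive": 180}
ONLINE_TIERS = {"Hot", "Cool", "Cold"}


def remaining_minimum_days(tier: str, days_in_tier: int) -> int:
    """Days left before a blob can leave `tier` without falling short of the minimum."""
    return max(0, MIN_RETENTION_DAYS[tier] - days_in_tier)


def needs_rehydration(tier: str) -> bool:
    """Only the Archive tier is offline and must be rehydrated before reads."""
    return tier not in ONLINE_TIERS


print(remaining_minimum_days("Cool", 12))   # 18
print(needs_rehydration("Archive"))         # True
```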

Premium Storage Tiers

Premium storage accounts support performance tiers for page blobs, optimized for high IOPS and low latency scenarios.

class PremiumPageBlobTier:
    """Premium page blob performance tiers."""
    # Approximate provisioned IOPS per blob (not per GiB); verify against current Azure limits
    P4: str   # ~120 IOPS
    P6: str   # ~240 IOPS
    P10: str  # ~500 IOPS
    P15: str  # ~1,100 IOPS
    P20: str  # ~2,300 IOPS
    P30: str  # ~5,000 IOPS
    P40: str  # ~7,500 IOPS
    P50: str  # ~7,500 IOPS (larger size)
    P60: str  # ~16,000 IOPS
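
Choosing a performance tier usually means finding the smallest tier whose provisioned IOPS covers the workload. The sketch below does that over a lookup table. The IOPS values are approximate per-disk figures published for the premium disk sizes of the same names and are an assumption here, not something the SDK exposes; the table is a parameter so that verified numbers can be substituted.

```python
# Illustrative table only: approximate provisioned IOPS per tier, smallest first.
PREMIUM_TIER_IOPS = [
    ("P4", 120), ("P6", 240), ("P10", 500), ("P15", 1100), ("P20", 2300),
    ("P30", 5000), ("P40", 7500), ("P50", 7500), ("P60", 16000),
]


def smallest_tier_for(required_iops: int, table=PREMIUM_TIER_IOPS) -> str:
    """Return the first (smallest) tier whose IOPS meets the requirement."""
    for name, iops in table:
        if iops >= required_iops:
            return name
    raise ValueError(f"no premium tier provides {required_iops} IOPS")


print(smallest_tier_for(1000))  # P15
```

The returned name matches the string value of the corresponding `PremiumPageBlobTier` member, so it could be passed to `set_premium_page_blob_tier`.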

Tier Management Operations

Set and modify access tiers for cost optimization and performance requirements.

# Set standard blob tier (available on BlobClient)
def set_standard_blob_tier(self, standard_blob_tier, **kwargs) -> None:
    """
    Set the access tier for a standard storage blob.
    
    Args:
        standard_blob_tier (StandardBlobTier): Target access tier
        
    Optional Args:
        rehydrate_priority (RehydratePriority, optional): Priority for archive rehydration
        lease (BlobLeaseClient or str, optional): Required if blob has active lease
        version_id (str, optional): Specific version to modify
    """

# Set premium page blob tier (available on BlobClient)  
def set_premium_page_blob_tier(self, premium_page_blob_tier, **kwargs) -> None:
    """
    Set the performance tier for a premium page blob.
    
    Args:
        premium_page_blob_tier (PremiumPageBlobTier): Target premium tier
        
    Optional Args:
        lease (BlobLeaseClient or str, optional): Required if blob has active lease
    """

# Batch tier operations (available on ContainerClient)
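def set_standard_blob_tier_blobs(self, standard_blob_tier, *blobs, **kwargs) -> Iterator[HttpResponse]:
    """
    Set access tier for multiple standard blobs in batch.
    
    Args:
        standard_blob_tier (StandardBlobTier or str or None): Tier applied to every blob;
            pass None to use the tier carried by each BlobProperties or dict entry
        *blobs: Blob names (str), BlobProperties, or dicts describing each blob
        
    Returns:
        Iterator[HttpResponse]: Response for each tier operation
    """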

def set_premium_page_blob_tier_blobs(self, premium_page_blob_tier, *blobs, **kwargs) -> Iterator[HttpResponse]:
    """
    Set performance tier for multiple premium page blobs in batch.
    
    Args:
        premium_page_blob_tier (PremiumPageBlobTier or str or None): Tier applied to every blob;
            pass None to use the tier carried by each BlobProperties entry
        *blobs: Blob names (str) or BlobProperties
        
    Returns:
        Iterator[HttpResponse]: Response for each tier operation
    """

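A batch call is one HTTP request containing many sub-requests, and the Blob Batch API accepts at most 256 of them per call, so long lists of blobs need to be split. The sketch below assumes the SDK signature in which the tier is the first positional argument of `ContainerClient.set_standard_blob_tier_blobs`; `set_tier_in_batches` and `chunked` are hypothetical helper names.

```python
from itertools import islice

MAX_BATCH = 256  # sub-requests allowed in a single batch call


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


def set_tier_in_batches(container_client, blob_names, tier="Cool"):
    """Apply one access tier to many blobs, 256 blobs per batch request."""
    responses = []
    for chunk in chunked(blob_names, MAX_BATCH):
        # One service round trip per chunk; each response maps to one blob.
        responses.extend(container_client.set_standard_blob_tier_blobs(tier, *chunk))
    return responses
```

The string `"Cool"` is the value of `StandardBlobTier.COOL`; either form should be accepted by the method.
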
Archive Rehydration

Archived blobs are offline: a blob must be rehydrated to an online tier (Hot, Cool, or Cold) before its data can be read.

class RehydratePriority:
    """Archive rehydration priority levels."""
    STANDARD: str  # "Standard": standard rehydration (may take up to 15 hours)
    HIGH: str      # "High": prioritized rehydration (often under 1 hour for blobs smaller than 10 GB)

Rehydration Process:

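from azure.storage.blob import RehydratePriority, StandardBlobTier

# blob_client is an existing BlobClient pointing at a blob in the Archive tier

# Request rehydration to the Hot tier with High priority
blob_client.set_standard_blob_tier(
    StandardBlobTier.HOT,
    rehydrate_priority=RehydratePriority.HIGH
)

# Check rehydration status; archive_status reads e.g. "rehydrate-pending-to-hot"
# while the request is in progress and is None once it has completed
properties = blob_client.get_blob_properties()
print(f"Archive Status: {properties.archive_status}")
print(f"Rehydrate Priority: {properties.rehydrate_priority}")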
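
Because rehydration is asynchronous, callers typically poll the blob's properties until `archive_status` clears. The following is a sketch of such a loop, not an SDK feature: `wait_for_rehydration` is a hypothetical helper, the polling interval and timeout are arbitrary defaults, and the `sleep` and `clock` parameters exist only so the loop can be tested without real delays.

```python
import time


def wait_for_rehydration(blob_client, poll_seconds=300, timeout_seconds=16 * 3600,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until a pending rehydration finishes; returns the blob's new tier."""
    deadline = clock() + timeout_seconds
    while True:
        props = blob_client.get_blob_properties()
        if not props.archive_status:
            if props.blob_tier == "Archive":
                # Still archived with nothing pending: rehydration was never requested.
                raise RuntimeError("blob is archived and no rehydration is pending")
            return props.blob_tier
        if clock() >= deadline:
            raise TimeoutError("rehydration did not finish before the timeout")
        sleep(poll_seconds)
```

For long waits, an event-driven approach (reacting to a tier-change event instead of polling) may be preferable; the loop above is only the simplest option.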

Tier Selection Guidelines

Choose Block Blobs When:

Storing documents, images, videos, or media files
Uploading large files that can benefit from concurrent block uploads
Need to stream data or serve web content
General-purpose cloud object storage scenarios

Choose Page Blobs When:

Storing virtual machine disk images
Need random read/write access patterns
Working with sparse data files
Building custom applications requiring 512-byte aligned access

Choose Append Blobs When:

Logging applications or audit trails
Streaming data ingestion scenarios
Time-series data collection
Any scenario requiring append-only operations

Choose Hot Tier When:

Data is accessed frequently (multiple times per month)
Application performance is critical
Low access cost matters more than low storage cost

Choose Cool Tier When:

Data is accessed infrequently (once per month or less)
Data will be stored for at least 30 days
Balancing storage and access costs

Choose Cold Tier When:

Data is rarely accessed (few times per year)
Data will be stored for at least 90 days
Storage cost optimization is priority

Choose Archive Tier When:

Data is for long-term retention and compliance
Data will be stored for at least 180 days
Rarely or never accessed
Lowest storage cost is critical
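
The tier guidelines above can be condensed into a small decision function. This is a sketch under stated assumptions: the retention thresholds (30, 90, 180 days) come from the guidelines, while the access-frequency cut-offs (12, 4, and 1 reads per year) are illustrative interpretations of "multiple times per month", "few times per year", and "rarely or never". The function name is hypothetical.

```python
def suggest_tier(accesses_per_year: float, retention_days: int) -> str:
    """Suggest a standard access tier from expected access rate and retention."""
    if accesses_per_year > 12 or retention_days < 30:
        return "Hot"      # read more than monthly, or too short-lived for Cool
    if accesses_per_year > 4 or retention_days < 90:
        return "Cool"     # roughly monthly or less, kept at least 30 days
    if accesses_per_year > 1 or retention_days < 180:
        return "Cold"     # a few reads per year, kept at least 90 days
    return "Archive"      # rarely or never read, kept at least 180 days


print(suggest_tier(50, 365))   # Hot
print(suggest_tier(0, 400))    # Archive
```

Real decisions should also weigh retrieval latency: data that must be readable within seconds should not go to Archive regardless of how rarely it is read.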

Lifecycle Management

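Lifecycle management policies automate tier transitions so that costs fall as data ages. The policies are defined on the storage account (for example through the Azure portal or Azure CLI) rather than through this client library; the library lets you inspect the tier that a policy has assigned:

# Example: Blob properties show current tier and tier change time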
properties = blob_client.get_blob_properties()
print(f"Current Tier: {properties.blob_tier}")
print(f"Tier Change Time: {properties.blob_tier_change_time}")
print(f"Tier Inferred: {properties.blob_tier_inferred}")

Typical Lifecycle Pattern:

Hot → New data for active use
Cool → After 30 days of infrequent access
Cold → After 90 days of rare access
Archive → After 180+ days for long-term retention
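
The pattern above maps onto a storage account lifecycle management policy, which is a JSON document of rules. The sketch below builds such a document in Python following the published policy schema (`tierToCool`, `tierToCold`, `tierToArchive` under `baseBlob`). The function name, rule name, and the `logs/` prefix are hypothetical, and the policy is applied to the account with management tooling, not with this client library.

```python
import json


def build_lifecycle_policy(prefix: str, cool_after: int = 30,
                           cold_after: int = 90, archive_after: int = 180) -> dict:
    """Build a lifecycle policy that tiers block blobs down as they age."""
    if not cool_after < cold_after < archive_after:
        raise ValueError("thresholds must increase: cool < cold < archive")
    return {
        "rules": [{
            "enabled": True,
            "name": "age-based-tiering",  # hypothetical rule name
            "type": "Lifecycle",
            "definition": {
                "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
                "actions": {"baseBlob": {
                    "tierToCool": {"daysAfterModificationGreaterThan": cool_after},
                    "tierToCold": {"daysAfterModificationGreaterThan": cold_after},
                    "tierToArchive": {"daysAfterModificationGreaterThan": archive_after},
                }},
            },
        }]
    }


print(json.dumps(build_lifecycle_policy("logs/"), indent=2))
```

The printed JSON can be saved to a file and supplied to the portal or command-line tooling when creating the account's management policy.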

Cost Optimization Strategies

Multi-Tier Strategy:

Use Hot tier for frequently accessed data
Move to Cool tier after 30 days
Move to Cold tier after 90 days
Archive after 180 days for compliance data

Performance Tier Strategy:

Use appropriate premium tier (P4-P60) based on IOPS requirements
Monitor performance metrics to optimize tier selection
Scale tier up/down based on workload demands

Blob Type Strategy:

Use Block blobs for general storage and streaming
Use Page blobs for VM disks and random access scenarios
Use Append blobs for logging and streaming ingestion

Install with Tessl CLI

npx tessl i tessl/pypi-azure-storage-blob
