tessl/pypi-azure-storage-blob

Microsoft Azure Blob Storage Client Library for Python providing comprehensive APIs for blob storage operations.


Blob Types and Storage Tiers

Azure Blob Storage supports three blob types optimized for different scenarios, along with access tiers for cost optimization. Each blob type provides specific capabilities for different data patterns and use cases.

Capabilities

Blob Types

Every blob has exactly one of three types, fixed at creation. The BlobType enumeration identifies which one:

class BlobType:
    """Blob type enumeration."""
    BLOCKBLOB: str    # Optimized for streaming and storing cloud objects
    PAGEBLOB: str     # Optimized for random read/write operations  
    APPENDBLOB: str   # Optimized for append operations

Block Blobs

Block blobs are optimized for streaming and storing cloud objects. They are ideal for documents, media files, backups, and general-purpose data storage.

Characteristics:

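  • Up to about 190.7 TiB with current service versions (50,000 blocks of up to 4,000 MiB each); older service versions capped block blobs at about 4.75 TiB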
  • Composed of blocks that can be managed individually
  • Support concurrent uploads for large files
  • Ideal for streaming scenarios and general file storage

Use Cases:

  • Documents, images, videos, and media files
  • Application data and backups
  • Web content and static assets
  • Log files that don't require append-only semantics
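
Because a block blob is committed from individually staged blocks, a large upload can be planned as a list of block IDs and byte ranges. The following is a minimal sketch of that planning step only; `plan_blocks` is a hypothetical helper, not part of the SDK, and the 50,000-block ceiling is the service limit on committed blocks.

```python
from typing import List, Tuple

MAX_BLOCKS = 50_000  # service limit on committed blocks per block blob


def plan_blocks(total_size: int, block_size: int) -> List[Tuple[str, int, int]]:
    """Return (block_id, offset, length) for each block of a payload."""
    if total_size < 0 or block_size <= 0:
        raise ValueError("total_size must be >= 0 and block_size must be > 0")
    count = -(-total_size // block_size)  # ceiling division
    if count > MAX_BLOCKS:
        raise ValueError(f"{count} blocks exceeds the {MAX_BLOCKS}-block limit")
    plan = []
    for index in range(count):
        offset = index * block_size
        length = min(block_size, total_size - offset)
        # All block IDs within one blob must have the same length, so zero-pad.
        block_id = f"block-{index:06d}"
        plan.append((block_id, offset, length))
    return plan
```

Each tuple could then drive a `BlobClient.stage_block(block_id, data)` call for the matching slice of the payload, possibly from several workers, followed by one `commit_block_list` call that names the same IDs in order.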

Page Blobs

Page blobs are optimized for random read and write operations. They serve as the backing storage for Azure Virtual Machine disks and support sparse data scenarios.

Characteristics:

  • Up to 8 TiB in size
  • Optimized for random access patterns
  • 512-byte page alignment required
  • Support for sparse data with efficient storage
  • Built-in sequence numbering for concurrency control

Use Cases:

  • Virtual machine disk images (VHD)
  • Database files requiring random access
  • Sparse data files
  • Custom applications requiring random read/write patterns
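
The 512-byte alignment rule means that a page blob's size, and the offset and length of every page write, must be multiples of 512. A minimal sketch of the arithmetic follows; the helper names are hypothetical and nothing here calls the service.

```python
PAGE_SIZE = 512  # page blobs are addressed in 512-byte pages


def align_up(size: int, page_size: int = PAGE_SIZE) -> int:
    """Round size up to the next multiple of page_size."""
    if size < 0:
        raise ValueError("size must be non-negative")
    return -(-size // page_size) * page_size


def pad_to_page(data: bytes, page_size: int = PAGE_SIZE) -> bytes:
    """Zero-pad data so its length is a multiple of page_size."""
    return data + b"\x00" * (align_up(len(data), page_size) - len(data))


def is_aligned(offset: int, length: int, page_size: int = PAGE_SIZE) -> bool:
    """True when offset and a non-empty length both fall on page boundaries."""
    return length > 0 and offset % page_size == 0 and length % page_size == 0
```

An aligned size is what `BlobClient.create_page_blob(size)` expects, and a padded buffer with an aligned offset is what `BlobClient.upload_page(page, offset, length)` expects.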

Append Blobs

Append blobs are optimized for append operations, making them ideal for logging scenarios where data is continuously added.

Characteristics:

  • Up to about 195 GiB in size (50,000 blocks of up to 4 MiB each)
  • Append-only operations (no updates to existing data)
  • Block-based structure optimized for sequential writes
  • Built-in support for concurrent append operations

Use Cases:

  • Application logs and audit trails
  • Streaming data ingestion
  • Time-series data
  • Any scenario requiring append-only semantics
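
For logging, lines are usually batched so that each append sends one block rather than one line. The sketch below assumes `blob_client` is any object with an `append_block(data)` method, such as a `BlobClient` whose blob was created with `create_append_blob()`. The `append_lines` helper and the 4 MiB default are illustrative assumptions, the latter chosen to match the block size implied by the 195 GiB figure above.

```python
from typing import Iterable

# Assumed per-block limit, consistent with 195 GiB spread over 50,000 blocks.
MAX_APPEND_BLOCK = 4 * 1024 * 1024


def append_lines(blob_client, lines: Iterable[str],
                 max_block: int = MAX_APPEND_BLOCK) -> int:
    """Append text lines to an append blob, batching them into blocks.

    Returns the number of append_block calls made.
    """
    sent = 0
    buffer = bytearray()
    for line in lines:
        encoded = (line.rstrip("\n") + "\n").encode("utf-8")
        if len(encoded) > max_block:
            raise ValueError("a single line exceeds the block size limit")
        # Flush first if this line would push the buffer past the limit.
        if buffer and len(buffer) + len(encoded) > max_block:
            blob_client.append_block(bytes(buffer))
            sent += 1
            buffer.clear()
        buffer.extend(encoded)
    if buffer:
        blob_client.append_block(bytes(buffer))
        sent += 1
    return sent
```

Batching keeps the block count low, which matters because an append blob holds at most 50,000 blocks.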

Access Tiers

Access tiers provide cost optimization by aligning storage costs with data access patterns. Different tiers offer trade-offs between storage cost and access cost.

Standard Storage Tiers

Standard storage accounts support multiple access tiers for block blobs.

class StandardBlobTier:
    """Standard storage access tiers."""
    HOT: str      # Frequently accessed data
    COOL: str     # Infrequently accessed data (30+ days)
    COLD: str     # Rarely accessed data (90+ days)  
    ARCHIVE: str  # Long-term archived data (180+ days)

Hot Tier:

  • Highest storage cost, lowest access cost
  • Optimized for data accessed frequently
  • Default access tier for new storage accounts, so blobs uploaded without an explicit tier normally land here
  • Immediate access with no rehydration required

Cool Tier:

  • Lower storage cost than Hot, higher access cost
  • Optimized for data stored for at least 30 days
  • Same millisecond latency as Hot tier, with slightly lower availability
  • Immediate access with no rehydration required

Cold Tier:

  • Lower storage cost than Cool, higher access cost
  • Optimized for data stored for at least 90 days
  • Same millisecond latency as Hot and Cool tiers
  • Immediate access with no rehydration required

Archive Tier:

  • Lowest storage cost, highest access cost
  • Optimized for data stored for at least 180 days
  • Requires rehydration before access (hours to complete)
  • Not available for immediate access
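
The 30, 90, and 180 day figures act as minimum retention periods: moving or deleting a blob sooner typically incurs an early deletion charge. A small sketch of that check follows, using the tier names as plain strings. `MIN_RETENTION_DAYS` and `days_remaining_before_move` are hypothetical names for illustration.

```python
# Minimum retention per tier, taken from the tier descriptions above.
MIN_RETENTION_DAYS = {"Hot": 0, "Cool": 30, "Cold": 90, "Archive": 180}


def days_remaining_before_move(current_tier: str, days_in_tier: int) -> int:
    """Days left until the blob has met its tier's minimum retention period."""
    try:
        minimum = MIN_RETENTION_DAYS[current_tier]
    except KeyError:
        raise ValueError(f"unknown tier: {current_tier!r}") from None
    return max(0, minimum - days_in_tier)
```

A result of zero means the minimum period has passed, so a tier change should no longer attract an early deletion charge.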

Premium Storage Tiers

Premium storage accounts support performance tiers for page blobs, optimized for high IOPS and low latency scenarios.

class PremiumPageBlobTier:
    """Premium page blob performance tiers.

    IOPS figures are approximate limits per blob (not per GiB), following the
    Azure premium disk tiers; confirm them against current Azure documentation.
    """
    P4: str   # about 120 IOPS
    P6: str   # about 240 IOPS
    P10: str  # about 500 IOPS
    P15: str  # about 1,100 IOPS
    P20: str  # about 2,300 IOPS
    P30: str  # about 5,000 IOPS
    P40: str  # about 7,500 IOPS
    P50: str  # about 7,500 IOPS (larger size)
    P60: str  # about 16,000 IOPS

Tier Management Operations

Set and modify access tiers for cost optimization and performance requirements.

# Set standard blob tier (available on BlobClient; ContainerClient offers the batch variant below)
def set_standard_blob_tier(self, standard_blob_tier, **kwargs) -> None:
    """
    Set the access tier for a standard storage blob.
    
    Args:
        standard_blob_tier (StandardBlobTier): Target access tier
        
    Optional Args:
        rehydrate_priority (RehydratePriority, optional): Priority for archive rehydration
        lease (BlobLeaseClient or str, optional): Required if blob has active lease
        version_id (str, optional): Specific version to modify
    """

# Set premium page blob tier (available on BlobClient)  
def set_premium_page_blob_tier(self, premium_page_blob_tier, **kwargs) -> None:
    """
    Set the performance tier for a premium page blob.
    
    Args:
        premium_page_blob_tier (PremiumPageBlobTier): Target premium tier
        
    Optional Args:
        lease (BlobLeaseClient or str, optional): Required if blob has active lease
    """

# Batch tier operations (available on ContainerClient)
def set_standard_blob_tier_blobs(self, standard_blob_tier, *blobs, **kwargs) -> Iterator[HttpResponse]:
    """
    Set access tier for multiple standard blobs in batch.
    
    Args:
        standard_blob_tier (StandardBlobTier or str): Tier applied to every blob;
            may be None when each blob supplies its own tier
        *blobs: Blob names, BlobProperties, or dicts carrying a name and optional per-blob tier
        
    Returns:
        Iterator[HttpResponse]: Response for each tier operation
    """

def set_premium_page_blob_tier_blobs(self, premium_page_blob_tier, *blobs, **kwargs) -> Iterator[HttpResponse]:
    """
    Set performance tier for multiple premium page blobs in batch.
    
    Args:
        premium_page_blob_tier (PremiumPageBlobTier or str): Tier applied to every blob;
            may be None when each blob supplies its own tier
        *blobs: Blob names, BlobProperties, or dicts carrying a name and optional per-blob tier
        
    Returns:
        Iterator[HttpResponse]: Response for each tier operation
    """

Archive Rehydration

Archived blobs are offline: their data cannot be read until the blob has been rehydrated to an online tier (Hot, Cool, or Cold).

class RehydratePriority:
    """Archive rehydration priority levels."""
    STANDARD: str  # Standard rehydration (may take up to 15 hours)
    HIGH: str      # High-priority rehydration (often under 1 hour for blobs smaller than 10 GB)

Rehydration Process:

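from azure.storage.blob import RehydratePriority, StandardBlobTier

# blob_client is an existing BlobClient that points at an archived blob.
# Rehydrate it to the Hot tier with high priority.
blob_client.set_standard_blob_tier(
    StandardBlobTier.HOT,
    rehydrate_priority=RehydratePriority.HIGH
)

# Check rehydration status; archive_status reads e.g. "rehydrate-pending-to-hot" while in progress
properties = blob_client.get_blob_properties()
print(f"Archive Status: {properties.archive_status}")
print(f"Rehydrate Priority: {properties.rehydrate_priority}")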
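
Since rehydration can take hours, a caller usually polls the blob's properties rather than blocking on the tier change. The sketch below assumes only that `blob_client.get_blob_properties()` returns an object with `archive_status` and `blob_tier` attributes. `wait_for_rehydration` and its parameters are hypothetical, and the `sleep` argument exists so that the wait can be replaced in tests.

```python
import time
from typing import Callable


def wait_for_rehydration(blob_client, poll_seconds: float = 600.0,
                         max_polls: int = 100,
                         sleep: Callable[[float], None] = time.sleep):
    """Poll until archive_status is empty, then return the blob's tier.

    Raises TimeoutError if the blob is still rehydrating after max_polls checks.
    """
    for attempt in range(max_polls):
        properties = blob_client.get_blob_properties()
        # archive_status is set only while a rehydration is pending.
        if not properties.archive_status:
            return properties.blob_tier
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError("blob is still rehydrating")
```

Note that an archived blob with no rehydration in progress also has an empty `archive_status`, so the returned tier should be checked against the intended target.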

Tier Selection Guidelines

Choose Block Blobs When:

  • Storing documents, images, videos, or media files
  • Uploading large files that can benefit from concurrent block uploads
  • Need to stream data or serve web content
  • General-purpose cloud object storage scenarios

Choose Page Blobs When:

  • Storing virtual machine disk images
  • Need random read/write access patterns
  • Working with sparse data files
  • Building custom applications requiring 512-byte aligned access

Choose Append Blobs When:

  • Logging applications or audit trails
  • Streaming data ingestion scenarios
  • Time-series data collection
  • Any scenario requiring append-only operations

Choose Hot Tier When:

  • Data is accessed frequently (multiple times per month)
  • Application performance is critical
  • Low access cost matters more than low storage cost

Choose Cool Tier When:

  • Data is accessed infrequently (once per month or less)
  • Data will be stored for at least 30 days
  • Balancing storage and access costs

Choose Cold Tier When:

  • Data is rarely accessed (few times per year)
  • Data will be stored for at least 90 days
  • Storage cost optimization is priority

Choose Archive Tier When:

  • Data is for long-term retention and compliance
  • Data will be stored for at least 180 days
  • Rarely or never accessed
  • Lowest storage cost is critical

Lifecycle Management

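Lifecycle management policies, which are configured on the storage account rather than through this client library, can automate tier transitions to optimize costs over time. From the client, blob properties show a blob's current tier and when it last changed: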

# Example: Blob properties show current tier and tier change time
properties = blob_client.get_blob_properties()
print(f"Current Tier: {properties.blob_tier}")
print(f"Tier Change Time: {properties.blob_tier_change_time}")
print(f"Tier Inferred: {properties.blob_tier_inferred}")

Typical Lifecycle Pattern:

  1. Hot → New data for active use
  2. Cool → After 30 days of infrequent access
  3. Cold → After 90 days of rare access
  4. Archive → After 180+ days for long-term retention
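
The pattern above reduces to a threshold lookup. As a client-side sketch only (account-level lifecycle policies would normally apply these rules automatically), the hypothetical `tier_for_age` helper below maps the days since last access to a tier name, using the 30, 90, and 180 day thresholds from the list.

```python
def tier_for_age(days_since_last_access: int) -> str:
    """Map days since last access to the tier suggested by the lifecycle pattern."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    if days_since_last_access >= 180:
        return "Archive"
    if days_since_last_access >= 90:
        return "Cold"
    if days_since_last_access >= 30:
        return "Cool"
    return "Hot"
```

The returned strings match the values that `set_standard_blob_tier` accepts in place of `StandardBlobTier` members.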

Cost Optimization Strategies

Multi-Tier Strategy:

  • Use Hot tier for frequently accessed data
  • Move to Cool tier after 30 days
  • Move to Cold tier after 90 days
  • Archive after 180 days for compliance data
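
Moves like these usually apply to many blobs at once, which is what the batch method on `ContainerClient` is for. The sketch below assumes `container_client` exposes `set_standard_blob_tier_blobs(tier, *blob_names)`, with the tier as the first positional argument as in the SDK, and it splits the work into groups of 256, the sub-request limit for one batch. `chunked` and `set_tier_for_many` are hypothetical helper names.

```python
from typing import Iterable, Iterator, List

MAX_BATCH = 256  # maximum sub-requests in one blob batch request


def chunked(names: Iterable[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield successive lists of at most size names."""
    chunk: List[str] = []
    for name in names:
        chunk.append(name)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def set_tier_for_many(container_client, tier: str, names: Iterable[str]) -> int:
    """Apply one tier to many blobs; returns the number of batch requests sent."""
    batches = 0
    for chunk in chunked(names):
        # The call returns an iterator of per-blob responses; drain it here.
        list(container_client.set_standard_blob_tier_blobs(tier, *chunk))
        batches += 1
    return batches
```

For example, `set_tier_for_many(container_client, "Cool", names)` would demote every named blob to Cool using as few batch requests as the limit allows.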

Performance Tier Strategy:

  • Use appropriate premium tier (P4-P60) based on IOPS requirements
  • Monitor performance metrics to optimize tier selection
  • Scale tier up/down based on workload demands

Blob Type Strategy:

  • Use Block blobs for general storage and streaming
  • Use Page blobs for VM disks and random access scenarios
  • Use Append blobs for logging and streaming ingestion

Install with Tessl CLI

npx tessl i tessl/pypi-azure-storage-blob
