# Blob Types and Storage Tiers

Azure Blob Storage supports three blob types, each suited to a different data access pattern, along with access tiers that trade storage cost against access cost.

## Capabilities

### Blob Types

Azure Blob Storage provides three blob types: block blobs, page blobs, and append blobs.

```python { .api }
class BlobType:
    """Blob type enumeration."""
    BLOCKBLOB: str   # Optimized for streaming and storing cloud objects
    PAGEBLOB: str    # Optimized for random read/write operations
    APPENDBLOB: str  # Optimized for append operations
```

#### Block Blobs

Block blobs are optimized for streaming and storing cloud objects. They are ideal for documents, media files, backups, and general-purpose data storage.

**Characteristics:**
- Up to about 190.7 TiB in size (50,000 blocks of up to 4,000 MiB each); older service versions that cap blocks at 100 MiB limit a blob to about 4.75 TiB
- Composed of blocks that can be uploaded and committed individually
- Support concurrent block uploads for large files
- Ideal for streaming scenarios and general file storage

**Use Cases:**
- Documents, images, videos, and media files
- Application data and backups
- Web content and static assets
- Log files that don't require append-only semantics

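The block blob size limit follows from block arithmetic: the ceiling is the maximum number of committed blocks times the maximum block size. The sketch below assumes a limit of 50,000 blocks per blob; the helper names are hypothetical and not part of the SDK. It shows how a 100 MiB block cap gives roughly 4.75 TiB, while a 4,000 MiB cap gives roughly 190.7 TiB.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4
MAX_BLOCKS = 50_000  # assumed service limit on committed blocks per block blob


def max_blob_bytes(max_block_bytes: int) -> int:
    """Largest blob that fits in MAX_BLOCKS blocks of the given size."""
    return MAX_BLOCKS * max_block_bytes


def min_block_bytes(file_bytes: int) -> int:
    """Smallest whole-MiB block size that fits the file into MAX_BLOCKS blocks."""
    if file_bytes <= 0:
        raise ValueError("file_bytes must be positive")
    per_block = -(-file_bytes // MAX_BLOCKS)  # ceiling division
    return -(-per_block // MIB) * MIB         # round up to a whole MiB


if __name__ == "__main__":
    print(f"{max_blob_bytes(100 * MIB) / TIB:.2f} TiB with 100 MiB blocks")
    print(f"{max_blob_bytes(4000 * MIB) / TIB:.2f} TiB with 4,000 MiB blocks")
```

A helper like `min_block_bytes` can guide the choice of block size for a very large upload, since a block size that is too small would exhaust the block count before the file is complete.
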
#### Page Blobs

Page blobs are optimized for random read and write operations. They serve as the backing storage for Azure Virtual Machine disks and support sparse data scenarios.

**Characteristics:**
- Up to 8 TiB in size
- Optimized for random access patterns
- Reads and writes must align to 512-byte page boundaries
- Efficient storage of sparse data
- Built-in sequence number for concurrency control

**Use Cases:**
- Virtual machine disk images (VHD/VHDX)
- Database files requiring random access
- Sparse data files
- Custom applications requiring random read/write patterns

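The 512-byte alignment requirement means that both the offset and the length of a page write must be multiples of 512. As a minimal sketch (the helper names are hypothetical, not SDK functions), the following code expands an arbitrary byte range to the enclosing aligned range and zero-pads a buffer to a whole number of pages.

```python
from typing import Tuple

PAGE_SIZE = 512  # page blobs are addressed in 512-byte pages


def aligned_range(offset: int, length: int) -> Tuple[int, int]:
    """Expand (offset, length) to the smallest enclosing 512-byte-aligned range."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    start = (offset // PAGE_SIZE) * PAGE_SIZE
    end = -(-(offset + length) // PAGE_SIZE) * PAGE_SIZE  # round up
    return start, end - start


def pad_to_page(data: bytes) -> bytes:
    """Zero-pad data so that its length is a multiple of 512 bytes."""
    remainder = len(data) % PAGE_SIZE
    if remainder == 0:
        return data
    return data + b"\x00" * (PAGE_SIZE - remainder)
```

The aligned offset and padded buffer could then be handed to a page write call such as `BlobClient.upload_page`; that wiring is left out here to keep the sketch independent of a live storage account.
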
#### Append Blobs

Append blobs are optimized for append operations, making them ideal for logging scenarios where data is continuously added.

**Characteristics:**
- Up to about 195 GiB in size
- Append-only operations (existing data cannot be updated)
- Block-based structure optimized for sequential writes
- Built-in support for concurrent append operations

**Use Cases:**
- Application logs and audit trails
- Streaming data ingestion
- Time-series data
- Any scenario requiring append-only semantics

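Because an append blob grows one block at a time, a large payload has to be split into pieces no larger than the per-append limit. The sketch below assumes a 4 MiB limit per appended block (newer service versions may allow more) and uses a hypothetical helper name.

```python
from typing import Iterator

MIB = 1024 ** 2
DEFAULT_MAX_APPEND_BLOCK = 4 * MIB  # assumed per-append limit, adjust as needed


def iter_append_chunks(
    data: bytes, max_block_bytes: int = DEFAULT_MAX_APPEND_BLOCK
) -> Iterator[bytes]:
    """Yield consecutive slices of data, each at most max_block_bytes long."""
    if max_block_bytes <= 0:
        raise ValueError("max_block_bytes must be positive")
    for start in range(0, len(data), max_block_bytes):
        yield data[start:start + max_block_bytes]
```

Each chunk could then be passed to an append call such as `BlobClient.append_block` in order, which preserves the original byte sequence in the blob.
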
### Access Tiers

Access tiers align storage costs with data access patterns: colder tiers cost less to store but more to access.

#### Standard Storage Tiers

Standard storage accounts support multiple access tiers for block blobs.

```python { .api }
class StandardBlobTier:
    """Standard storage access tiers."""
    HOT: str      # Frequently accessed data
    COOL: str     # Infrequently accessed data (stored for 30+ days)
    COLD: str     # Rarely accessed data (stored for 90+ days)
    ARCHIVE: str  # Long-term archived data (stored for 180+ days)
```

**Hot Tier:**
- Highest storage cost, lowest access cost
- Optimized for data accessed frequently
- Default tier for new blobs unless the account's default access tier has been changed
- Immediate access with no rehydration required

**Cool Tier:**
- Lower storage cost than Hot, higher access cost
- Optimized for data stored for at least 30 days
- Slightly higher latency than Hot tier
- Immediate access with no rehydration required

**Cold Tier:**
- Lower storage cost than Cool, higher access cost
- Optimized for data stored for at least 90 days
- Higher latency than Cool tier
- Immediate access with no rehydration required

**Archive Tier:**
- Lowest storage cost, highest access cost
- Optimized for data stored for at least 180 days
- Requires rehydration before access (takes hours to complete)
- Not available for immediate access

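The 30, 90, and 180 day figures act as minimum retention periods: a blob deleted or moved out of a tier earlier is typically billed a prorated early-deletion charge for the remaining days. As a rough sketch with a hypothetical helper name, the remaining chargeable days can be estimated like this (actual billing rules should be confirmed against current Azure pricing documentation).

```python
# Minimum retention periods in days, taken from the tier descriptions above.
MIN_RETENTION_DAYS = {"Hot": 0, "Cool": 30, "Cold": 90, "Archive": 180}


def early_deletion_days(tier: str, days_in_tier: int) -> int:
    """Days of the minimum retention period that are still outstanding."""
    if days_in_tier < 0:
        raise ValueError("days_in_tier must be non-negative")
    if tier not in MIN_RETENTION_DAYS:
        raise KeyError(f"unknown tier: {tier}")
    return max(0, MIN_RETENTION_DAYS[tier] - days_in_tier)
```

For example, a blob that has spent 10 days in Cool has 20 days of its minimum period outstanding, so moving it now may cost more than waiting.
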
#### Premium Storage Tiers

Premium storage accounts support performance tiers for page blobs, optimized for high-IOPS, low-latency scenarios. The IOPS figures in the comments below are approximate provisioned limits per disk, not per GiB; confirm them against current Azure premium disk documentation.

```python { .api }
class PremiumPageBlobTier:
    """Premium page blob performance tiers."""
    P4: str   # 120 IOPS
    P6: str   # 240 IOPS
    P10: str  # 500 IOPS
    P15: str  # 1,100 IOPS
    P20: str  # 2,300 IOPS
    P30: str  # 5,000 IOPS
    P40: str  # 7,500 IOPS
    P50: str  # 7,500 IOPS (larger capacity than P40)
    P60: str  # 16,000 IOPS
```

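Choosing a premium tier usually starts from the IOPS a workload needs. The sketch below picks the smallest tier that meets a required IOPS figure, using the P10 and larger values listed above and treating them as per-disk limits, which is an assumption. It ignores capacity, so P50 (same IOPS as P40) is omitted, and the helper name is hypothetical.

```python
from typing import Optional

# (tier, IOPS) pairs for P10 and larger, in ascending order.
# Treated as per-disk provisioned limits (assumption); verify before relying on them.
TIER_IOPS = [
    ("P10", 500),
    ("P15", 1_100),
    ("P20", 2_300),
    ("P30", 5_000),
    ("P40", 7_500),
    ("P60", 16_000),
]


def smallest_tier_for(required_iops: int) -> Optional[str]:
    """Return the smallest listed tier whose IOPS meets the requirement, or None."""
    if required_iops < 0:
        raise ValueError("required_iops must be non-negative")
    for tier, iops in TIER_IOPS:
        if iops >= required_iops:
            return tier
    return None
```

A `None` result signals that no single listed tier is enough, in which case the workload would need to be spread across several blobs or redesigned.
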
### Tier Management Operations

Set and modify access tiers to meet cost and performance requirements.

```python { .api }
# Set standard blob tier (available on BlobClient)
def set_standard_blob_tier(self, standard_blob_tier, **kwargs) -> None:
    """
    Set the access tier for a standard storage blob.

    Args:
        standard_blob_tier (str or StandardBlobTier): Target access tier

    Optional Args:
        rehydrate_priority (RehydratePriority, optional): Priority for archive rehydration
        lease (BlobLeaseClient or str, optional): Required if blob has active lease
        version_id (str, optional): Specific version to modify
    """

# Set premium page blob tier (available on BlobClient)
def set_premium_page_blob_tier(self, premium_page_blob_tier, **kwargs) -> None:
    """
    Set the performance tier for a premium page blob.

    Args:
        premium_page_blob_tier (str or PremiumPageBlobTier): Target premium tier

    Optional Args:
        lease (BlobLeaseClient or str, optional): Required if blob has active lease
    """

# Batch tier operations (available on ContainerClient)
def set_standard_blob_tier_blobs(self, standard_blob_tier, *blobs, **kwargs) -> Iterator[HttpResponse]:
    """
    Set access tier for multiple standard blobs in batch.

    Args:
        standard_blob_tier (str or StandardBlobTier): Tier applied to each blob that does not specify its own
        *blobs: Blob names (str), BlobProperties, or dicts with a "name" key and optional per-blob settings such as "blob_tier"

    Returns:
        Iterator[HttpResponse]: Response for each tier operation
    """

def set_premium_page_blob_tier_blobs(self, premium_page_blob_tier, *blobs, **kwargs) -> Iterator[HttpResponse]:
    """
    Set performance tier for multiple premium page blobs in batch.

    Args:
        premium_page_blob_tier (str or PremiumPageBlobTier): Tier applied to each blob that does not specify its own
        *blobs: Blob names (str), BlobProperties, or dicts with a "name" key and optional per-blob settings such as "blob_tier"

    Returns:
        Iterator[HttpResponse]: Response for each tier operation
    """
```

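To illustrate how the single-blob call might be used, here is a minimal sketch that applies a target tier to several blobs, one request per blob. It assumes `container_client` behaves like an `azure.storage.blob.ContainerClient` (only `get_blob_client` and `set_standard_blob_tier` are called), passes tier names as plain strings, and uses a hypothetical helper name.

```python
from typing import Iterable, List, Tuple


def apply_tiers(container_client, assignments: Iterable[Tuple[str, str]]) -> List[str]:
    """Set each named blob to its target tier and return the names updated.

    assignments holds (blob_name, tier_name) pairs, e.g. ("logs/2023.txt", "Cool").
    """
    updated = []
    for blob_name, tier in assignments:
        blob_client = container_client.get_blob_client(blob_name)
        blob_client.set_standard_blob_tier(tier)  # one service request per blob
        updated.append(blob_name)
    return updated
```

For many blobs that share one target tier, the batch method on the container client is likely the better fit, since it groups the changes into fewer requests.
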
### Archive Rehydration

An archived blob must be rehydrated to an online tier (Hot, Cool, or Cold) before its data can be read.

```python { .api }
class RehydratePriority:
    """Archive rehydration priority levels."""
    STANDARD: str  # Standard rehydration (up to 15 hours)
    HIGH: str      # High priority rehydration (up to 1 hour, typically for smaller blobs)
```

**Rehydration Process:**
```python
from azure.storage.blob import RehydratePriority, StandardBlobTier

# blob_client is an existing BlobClient for a blob in the Archive tier.
# Rehydrate the blob to the Hot tier with high priority.
blob_client.set_standard_blob_tier(
    StandardBlobTier.HOT,
    rehydrate_priority=RehydratePriority.HIGH
)

# Check rehydration status (archive_status is set while rehydration is pending)
properties = blob_client.get_blob_properties()
print(f"Archive Status: {properties.archive_status}")
print(f"Rehydrate Priority: {properties.rehydrate_priority}")
```

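Since rehydration can take hours, callers usually poll for completion rather than block on the tier change. The sketch below assumes that `archive_status` holds a value such as `rehydrate-pending-to-hot` while rehydration is pending and is empty otherwise; the helper name and its parameters are hypothetical. Note that it also returns immediately for a blob on which no rehydration was ever requested.

```python
import time
from typing import Callable


def wait_for_rehydration(
    blob_client,
    poll_seconds: float = 60.0,
    timeout_seconds: float = 16 * 3600,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll blob properties until no rehydration is pending; return the blob tier."""
    deadline = clock() + timeout_seconds
    while True:
        props = blob_client.get_blob_properties()
        if not props.archive_status:  # assumption: empty means nothing is pending
            return str(props.blob_tier)
        if clock() >= deadline:
            raise TimeoutError("rehydration still pending after timeout")
        sleep(poll_seconds)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays; production code would normally leave them at their defaults.
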
### Tier Selection Guidelines

#### Choose Block Blobs When:
- Storing documents, images, videos, or media files
- Uploading large files that can benefit from concurrent block uploads
- Streaming data or serving web content
- Using general-purpose cloud object storage

#### Choose Page Blobs When:
- Storing virtual machine disk images
- Requiring random read/write access patterns
- Working with sparse data files
- Building custom applications requiring 512-byte aligned access

#### Choose Append Blobs When:
- Writing application logs or audit trails
- Ingesting streaming data
- Collecting time-series data
- Requiring append-only operations

#### Choose Hot Tier When:
- Data is accessed frequently (multiple times per month)
- Application performance is critical
- Low access cost matters more than low storage cost

#### Choose Cool Tier When:
- Data is accessed infrequently (once per month or less)
- Data will be stored for at least 30 days
- Storage and access costs need to be balanced

#### Choose Cold Tier When:
- Data is rarely accessed (a few times per year)
- Data will be stored for at least 90 days
- Storage cost optimization is the priority

#### Choose Archive Tier When:
- Data is kept for long-term retention and compliance
- Data will be stored for at least 180 days
- Data is rarely or never accessed
- Lowest storage cost is critical

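The tier guidelines above can be condensed into a small decision function. The sketch below is one possible reading of them: the numeric thresholds (12, 4, and 1 reads per year) are illustrative interpretations of "once per month or less", "a few times per year", and "rarely or never", and the function name is hypothetical.

```python
def suggest_tier(reads_per_year: int, retention_days: int, can_wait_hours: bool = False) -> str:
    """Suggest an access tier from expected reads, retention, and latency tolerance."""
    if reads_per_year < 0 or retention_days < 0:
        raise ValueError("inputs must be non-negative")
    # Archive only fits when readers can tolerate a rehydration delay of hours.
    if can_wait_hours and retention_days >= 180 and reads_per_year <= 1:
        return "Archive"
    if retention_days >= 90 and reads_per_year <= 4:
        return "Cold"
    if retention_days >= 30 and reads_per_year <= 12:
        return "Cool"
    return "Hot"
```

Data that is rarely read but kept for less than 30 days still lands in Hot here, because the colder tiers carry minimum retention periods.
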
### Lifecycle Management

Lifecycle management policies, configured on the storage account, automate tier transitions to reduce costs over time. The blob client does not define these policies, but blob properties show the current tier and when it last changed:

```python
# blob_client is an existing BlobClient; inspect the tier currently applied to the blob
properties = blob_client.get_blob_properties()
print(f"Current Tier: {properties.blob_tier}")
print(f"Tier Change Time: {properties.blob_tier_change_time}")
# blob_tier_inferred indicates the tier was not set explicitly on the blob
print(f"Tier Inferred: {properties.blob_tier_inferred}")
```

**Typical Lifecycle Pattern:**
1. **Hot** → New data in active use
2. **Cool** → After 30 days of infrequent access
3. **Cold** → After 90 days of rare access
4. **Archive** → After 180+ days for long-term retention

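The pattern above could be expressed as a lifecycle policy document. The sketch below builds such a document as a Python dictionary, following the JSON shape commonly documented for Azure Storage lifecycle management rules (`tierToCool`, `tierToCold`, `tierToArchive` with `daysAfterModificationGreaterThan`). The rule name and the builder function are hypothetical, and the field names should be verified against current Azure documentation before use.

```python
import json


def build_lifecycle_policy(cool_after: int = 30, cold_after: int = 90, archive_after: int = 180) -> dict:
    """Build a policy that tiers block blobs down by days since last modification."""
    if not 0 <= cool_after < cold_after < archive_after:
        raise ValueError("thresholds must increase: cool < cold < archive")
    return {
        "rules": [
            {
                "enabled": True,
                "name": "tier-down-by-age",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {"blobTypes": ["blockBlob"]},
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {"daysAfterModificationGreaterThan": cool_after},
                            "tierToCold": {"daysAfterModificationGreaterThan": cold_after},
                            "tierToArchive": {"daysAfterModificationGreaterThan": archive_after},
                        }
                    },
                },
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_lifecycle_policy(), indent=2))
```

The resulting JSON would be applied through the storage account's management interface (portal, CLI, or a management SDK) rather than through the blob client.
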
### Cost Optimization Strategies

**Multi-Tier Strategy:**
- Use Hot tier for frequently accessed data
- Move to Cool tier after 30 days
- Move to Cold tier after 90 days
- Archive after 180 days for compliance data

**Performance Tier Strategy:**
- Choose a premium tier (P4-P60) that matches IOPS requirements
- Monitor performance metrics to refine tier selection
- Scale the tier up or down as workload demands change

**Blob Type Strategy:**
- Use Block blobs for general storage and streaming
- Use Page blobs for VM disks and random access scenarios
- Use Append blobs for logging and streaming ingestion