# Models and Types

Core data models, properties, permissions, and configuration classes used throughout the Azure Storage File DataLake SDK. These types provide structured representations of resources, metadata, and operational results.

## Capabilities

### Resource Properties

Core property classes that represent the state and metadata of Data Lake Storage resources.

```python { .api }
class FileSystemProperties:
    """
    Properties of a file system.

    Attributes:
        name (str): Name of the file system
        last_modified (datetime): Last modified timestamp
        etag (str): ETag of the file system
        lease_status (str): Current lease status
        lease_state (str): Current lease state
        lease_duration (str): Lease duration type
        public_access (PublicAccess): Public access level
        has_immutability_policy (bool): Whether immutability policy is set
        has_legal_hold (bool): Whether legal hold is active
        metadata (Dict[str, str]): User-defined metadata
        encryption_scope (EncryptionScopeOptions): Default encryption scope
        deleted_time (datetime): Deletion timestamp (for soft-deleted file systems)
        remaining_retention_days (int): Days remaining in retention period
    """

class DirectoryProperties:
    """
    Properties of a directory.

    Attributes:
        name (str): Name/path of the directory
        last_modified (datetime): Last modified timestamp
        etag (str): ETag of the directory
        permissions (str): POSIX permissions in octal format
        owner (str): Owner ID or principal name
        group (str): Group ID or principal name
        acl (str): Access control list in POSIX format
        lease_status (str): Current lease status
        lease_state (str): Current lease state
        lease_duration (str): Lease duration type
        metadata (Dict[str, str]): User-defined metadata
    """

class FileProperties:
    """
    Properties of a file.

    Attributes:
        name (str): Name/path of the file
        size (int): Size of the file in bytes
        last_modified (datetime): Last modified timestamp
        etag (str): ETag of the file
        permissions (str): POSIX permissions in octal format
        owner (str): Owner ID or principal name
        group (str): Group ID or principal name
        acl (str): Access control list in POSIX format
        lease_status (str): Current lease status
        lease_state (str): Current lease state
        lease_duration (str): Lease duration type
        content_settings (ContentSettings): Content-related settings
        metadata (Dict[str, str]): User-defined metadata
        creation_time (datetime): File creation timestamp
        expiry_time (datetime): File expiration timestamp
        encryption_context (str): Encryption context
    """

class PathProperties:
    """
    Properties of a path (file or directory).

    Attributes:
        name (str): Name/path of the item
        last_modified (datetime): Last modified timestamp
        etag (str): ETag of the item
        content_length (int): Size in bytes (0 for directories)
        is_directory (bool): Whether the path is a directory
        owner (str): Owner ID or principal name
        group (str): Group ID or principal name
        permissions (str): POSIX permissions in octal format
        acl (str): Access control list in POSIX format
        metadata (Dict[str, str]): User-defined metadata
        creation_time (datetime): Creation timestamp
        expiry_time (datetime): Expiration timestamp
        encryption_context (str): Encryption context
    """

class DeletedPathProperties:
    """
    Properties of a soft-deleted path.

    Attributes:
        name (str): Name of the deleted path
        deleted_time (datetime): Deletion timestamp
        remaining_retention_days (int): Days remaining in retention period
        deletion_id (str): Unique identifier for the deletion
    """
```
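
The `permissions` attributes above are described as POSIX permissions in octal format, while POSIX tools often show the same bits symbolically (for example `rwxr-x---`). The following is a minimal sketch of converting between the two notations. `octal_to_symbolic` and `symbolic_to_octal` are hypothetical helpers, not SDK functions, and the sketch ignores the sticky bit and any extended-ACL marker.

```python
def octal_to_symbolic(octal: str) -> str:
    """Convert an octal permission string such as '0750' to 'rwxr-x---'."""
    digits = octal[-3:]  # keep owner/group/other; drop an optional leading digit
    parts = []
    for ch in digits:
        value = int(ch, 8)  # raises ValueError for non-octal characters
        parts.append(
            ("r" if value & 4 else "-")
            + ("w" if value & 2 else "-")
            + ("x" if value & 1 else "-")
        )
    return "".join(parts)


def symbolic_to_octal(symbolic: str) -> str:
    """Convert 'rwxr-x---' to a four-digit octal string such as '0750'."""
    if len(symbolic) != 9:
        raise ValueError("expected nine characters, e.g. 'rwxr-x---'")
    digits = []
    for start in range(0, 9, 3):
        triad = symbolic[start:start + 3]
        value = (
            (4 if triad[0] == "r" else 0)
            + (2 if triad[1] == "w" else 0)
            + (1 if triad[2] == "x" else 0)
        )
        digits.append(str(value))
    return "0" + "".join(digits)


print(octal_to_symbolic("0750"))       # rwxr-x---
print(symbolic_to_octal("rw-r-----"))  # 0640
```

A helper like this could be applied to the `permissions` value of a `FileProperties` or `PathProperties` object before displaying it, whichever notation the value arrives in.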

### Content and Configuration

Classes for managing content settings, metadata, and operational configurations.

```python { .api }
class ContentSettings:
    """
    Content settings for files including MIME type and encoding information.

    Attributes:
        content_type (str): MIME type of the content
        content_encoding (str): Content encoding (e.g., 'gzip')
        content_language (str): Content language (e.g., 'en-US')
        content_disposition (str): Content disposition header
        cache_control (str): Cache control directives
        content_md5 (bytes): MD5 hash of the content
    """

    def __init__(
        self,
        content_type: str = None,
        content_encoding: str = None,
        content_language: str = None,
        content_disposition: str = None,
        cache_control: str = None,
        content_md5: bytes = None
    ):
        """Initialize content settings."""

class CustomerProvidedEncryptionKey:
    """
    Customer-provided encryption key. The key is sent with each request and
    the service uses it to encrypt and decrypt the data server-side.

    Attributes:
        key_value (str): Base64-encoded encryption key
        key_hash (str): Base64-encoded SHA256 hash of the key
        algorithm (str): Encryption algorithm (AES256)
    """

    def __init__(
        self,
        key_value: str,
        key_hash: str = None,
        algorithm: str = "AES256"
    ):
        """Initialize customer-provided encryption key."""

class EncryptionScopeOptions:
    """
    Encryption scope configuration for server-side encryption.

    Attributes:
        default_encryption_scope (str): Default encryption scope name
        prevent_encryption_scope_override (bool): Whether to prevent scope override
    """

    def __init__(
        self,
        default_encryption_scope: str,
        prevent_encryption_scope_override: bool = False
    ):
        """Initialize encryption scope options."""
```
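
`CustomerProvidedEncryptionKey` expects both the key and its SHA256 hash as Base64 strings. The following sketch derives the two strings from raw key bytes using only the standard library. `make_cpk_fields` is a hypothetical helper name, and the sketch assumes the hash is taken over the raw 32-byte key rather than over its Base64 text.

```python
import base64
import hashlib
import os


def make_cpk_fields(raw_key: bytes) -> tuple:
    """Return (key_value, key_hash) as Base64 strings for a 32-byte AES-256 key."""
    if len(raw_key) != 32:
        raise ValueError("AES-256 requires a 32-byte key")
    key_value = base64.b64encode(raw_key).decode("ascii")
    # Assumption: the hash covers the raw key bytes, not the Base64 text.
    key_hash = base64.b64encode(hashlib.sha256(raw_key).digest()).decode("ascii")
    return key_value, key_hash


key_value, key_hash = make_cpk_fields(os.urandom(32))
print(len(key_value), len(key_hash))  # 44 44
```

The resulting strings could then be passed as `key_value` and `key_hash` when constructing the class. The raw key must be stored safely by the caller, since data written with it cannot be read back without it.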

### Service Configuration

Classes for configuring account-level service properties including analytics, CORS, and retention policies.

```python { .api }
class AnalyticsLogging:
    """
    Analytics logging configuration for the storage account.

    Attributes:
        version (str): Analytics version
        delete (bool): Log delete operations
        read (bool): Log read operations
        write (bool): Log write operations
        retention_policy (RetentionPolicy): Log retention policy
    """

    def __init__(
        self,
        version: str = "1.0",
        delete: bool = False,
        read: bool = False,
        write: bool = False,
        retention_policy: 'RetentionPolicy' = None
    ):
        """Initialize analytics logging configuration."""

class Metrics:
    """
    Metrics configuration for the storage account.

    Attributes:
        version (str): Metrics version
        enabled (bool): Whether metrics are enabled
        include_apis (bool): Include API-level metrics
        retention_policy (RetentionPolicy): Metrics retention policy
    """

    def __init__(
        self,
        version: str = "1.0",
        enabled: bool = False,
        include_apis: bool = None,
        retention_policy: 'RetentionPolicy' = None
    ):
        """Initialize metrics configuration."""

class CorsRule:
    """
    Cross-Origin Resource Sharing (CORS) rule configuration.

    Attributes:
        allowed_origins (List[str]): Allowed origin domains
        allowed_methods (List[str]): Allowed HTTP methods
        allowed_headers (List[str]): Allowed request headers
        exposed_headers (List[str]): Headers exposed to client
        max_age_in_seconds (int): Preflight request cache duration
    """

    def __init__(
        self,
        allowed_origins: List[str],
        allowed_methods: List[str],
        allowed_headers: List[str] = None,
        exposed_headers: List[str] = None,
        max_age_in_seconds: int = 0
    ):
        """Initialize CORS rule."""

class RetentionPolicy:
    """
    Data retention policy configuration.

    Attributes:
        enabled (bool): Whether retention policy is enabled
        days (int): Number of days to retain data
    """

    def __init__(self, enabled: bool = False, days: int = None):
        """Initialize retention policy."""

class StaticWebsite:
    """
    Static website hosting configuration.

    Attributes:
        enabled (bool): Whether static website hosting is enabled
        index_document (str): Default index document name
        error_document404_path (str): Path to 404 error document
        default_index_document_path (str): Default index document path
    """

    def __init__(
        self,
        enabled: bool = False,
        index_document: str = None,
        error_document404_path: str = None,
        default_index_document_path: str = None
    ):
        """Initialize static website configuration."""
```

### Access Control and Policies

Classes for managing access policies, delegation keys, and permission structures.

```python { .api }
class AccessPolicy:
    """
    Stored access policy for signed identifiers.

    Attributes:
        permission (str): Permissions granted by the policy
        expiry (datetime): Policy expiration time
        start (datetime): Policy start time
    """

    def __init__(
        self,
        permission: str = None,
        expiry: datetime = None,
        start: datetime = None
    ):
        """Initialize access policy."""

class UserDelegationKey:
    """
    User delegation key for generating user delegation SAS tokens.

    Attributes:
        signed_oid (str): Object ID of the user
        signed_tid (str): Tenant ID
        signed_start (datetime): Key validity start time
        signed_expiry (datetime): Key validity end time
        signed_service (str): Storage service
        signed_version (str): Service version
        value (str): Base64-encoded key value
    """

class LeaseProperties:
    """
    Properties of a lease on a resource.

    Attributes:
        status (str): Lease status (locked/unlocked)
        state (str): Lease state (available/leased/expired/breaking/broken)
        duration (str): Lease duration (infinite/fixed)
    """
```

### Query and Serialization

Classes for configuring file querying and data serialization formats.

```python { .api }
class QuickQueryDialect:
    """
    Base class for query dialect configuration.
    """

class DelimitedTextDialect(QuickQueryDialect):
    """
    Configuration for CSV/delimited text querying.

    Attributes:
        delimiter (str): Field delimiter character
        quote_char (str): Quote character for fields
        escape_char (str): Escape character
        line_terminator (str): Line termination character(s)
        has_header (bool): Whether first row contains headers
    """

    def __init__(
        self,
        delimiter: str = ",",
        quote_char: str = '"',
        escape_char: str = "",
        line_terminator: str = "\n",
        has_header: bool = False
    ):
        """Initialize delimited text dialect."""

class DelimitedJsonDialect(QuickQueryDialect):
    """
    Configuration for JSON Lines querying.

    Attributes:
        line_terminator (str): Line termination character(s)
    """

    def __init__(self, line_terminator: str = "\n"):
        """Initialize JSON dialect."""

class ArrowDialect(QuickQueryDialect):
    """
    Configuration for Apache Arrow format querying.
    """

class ArrowType:
    """
    Apache Arrow data type specifications.

    Attributes:
        BOOL: Boolean type
        INT8: 8-bit integer type
        INT16: 16-bit integer type
        INT32: 32-bit integer type
        INT64: 64-bit integer type
        FLOAT: 32-bit float type
        DOUBLE: 64-bit float type
        STRING: String type
        BINARY: Binary type
        TIMESTAMP: Timestamp type
        DATE: Date type
    """
```
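
To see what the `DelimitedTextDialect` settings mean in practice, it can help to apply the same settings to a local sample with Python's standard `csv` module. This is only a local analogy under the assumption that the service interprets delimiter, quote, and escape characters in the usual CSV way. `preview_rows` is a hypothetical helper and does not call the SDK.

```python
import csv
import io


def preview_rows(text, delimiter=",", quote_char='"', escape_char="", has_header=False):
    """Parse sample text locally using dialect-like settings."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=delimiter,
        quotechar=quote_char,
        escapechar=escape_char or None,  # csv expects None when no escape char is set
    )
    rows = list(reader)
    if has_header and rows:
        header, body = rows[0], rows[1:]
        # With a header row, each record can be addressed by column name.
        return [dict(zip(header, row)) for row in body]
    return rows


sample = "col1,col2,col3\nval1,val2,val3"
print(preview_rows(sample, has_header=True))
# [{'col1': 'val1', 'col2': 'val2', 'col3': 'val3'}]
```

With `has_header=False` the first row is returned as data, which mirrors why a query that refers to columns by name needs the header flag set.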

### Enumerations and Constants

Enumeration classes and constants used throughout the SDK for type safety and consistency.

```python { .api }
class PublicAccess:
    """
    Public access levels for file systems.

    Attributes:
        OFF: No public access
        FileSystem: Public read access to file system and paths
        Path: Public read access to paths only
    """
    OFF = "off"
    FileSystem = "container"
    Path = "blob"

class LocationMode:
    """
    Location modes for geo-redundant storage accounts.

    Attributes:
        PRIMARY: Primary location
        SECONDARY: Secondary location
    """
    PRIMARY = "primary"
    SECONDARY = "secondary"

class ResourceTypes:
    """
    Resource types for account SAS permissions.

    Attributes:
        service (bool): Service-level resources
        container (bool): Container-level resources
        object (bool): Object-level resources
    """

    def __init__(
        self,
        service: bool = False,
        container: bool = False,
        object: bool = False
    ):
        """Initialize resource types."""

class Services:
    """
    Storage services for account SAS permissions.

    Attributes:
        blob (bool): Blob service
        queue (bool): Queue service
        table (bool): Table service
        file (bool): File service
    """

    def __init__(
        self,
        blob: bool = False,
        queue: bool = False,
        table: bool = False,
        file: bool = False
    ):
        """Initialize services."""

class StorageErrorCode:
    """
    Standard error codes returned by Azure Storage services.

    Common error codes include:
        ACCOUNT_NOT_FOUND: Storage account not found
        AUTHENTICATION_FAILED: Authentication failure
        AUTHORIZATION_FAILED: Authorization failure
        BLOB_NOT_FOUND: Blob/file not found
        CONTAINER_NOT_FOUND: Container/file system not found
        INVALID_URI: Invalid request URI
        PATH_NOT_FOUND: Path not found
        RESOURCE_NOT_FOUND: Resource not found
        LEASE_ID_MISMATCH: Lease ID mismatch
        LEASE_ALREADY_PRESENT: Lease already exists
    """

VERSION = "12.21.0"
"""The version string of the azure-storage-file-datalake package."""
```

### Retry Policies

Retry policy classes for handling transient failures and implementing resilient operations.

```python { .api }
class ExponentialRetry:
    """
    Exponential backoff retry policy for Azure Storage operations.

    Implements exponential backoff with jitter for handling transient failures.
    The delay between retries increases exponentially with each attempt.
    """

    def __init__(
        self,
        initial_backoff: int = 15,
        increment_base: int = 3,
        retry_total: int = 3,
        retry_to_secondary: bool = False,
        random_jitter_range: int = 3,
        **kwargs
    ):
        """
        Initialize exponential retry policy.

        Args:
            initial_backoff (int): Initial backoff interval in seconds
            increment_base (int): Backoff increment base for exponential calculation
            retry_total (int): Total number of retry attempts
            retry_to_secondary (bool): Whether to retry to secondary location
            random_jitter_range (int): Random jitter range in seconds
            **kwargs: Additional configuration options
        """

class LinearRetry:
    """
    Linear backoff retry policy for Azure Storage operations.

    Waits the same backoff interval, adjusted by random jitter, before
    each retry attempt.
    """

    def __init__(
        self,
        backoff: int = 15,
        retry_total: int = 3,
        retry_to_secondary: bool = False,
        random_jitter_range: int = 3,
        **kwargs
    ):
        """
        Initialize linear retry policy.

        Args:
            backoff (int): Backoff interval in seconds between retries
            retry_total (int): Total number of retry attempts
            retry_to_secondary (bool): Whether to retry to secondary location
            random_jitter_range (int): Random jitter range in seconds
            **kwargs: Additional configuration options
        """
```

### Paging and Results

Classes for handling paginated results and query responses.

```python { .api }
class FileSystemPropertiesPaged:
    """
    Paged result container for file system listings.

    Provides iteration over FileSystemProperties objects with
    automatic handling of result pagination.
    """

    def __iter__(self) -> Iterator[FileSystemProperties]:
        """Iterate over file system properties."""

    def by_page(self) -> Iterator[List[FileSystemProperties]]:
        """Iterate page by page."""

class ItemPaged:
    """
    Generic paged result container for iterable collections.

    Type Parameters:
        T: Type of items in the collection (PathProperties, etc.)
    """

    def __iter__(self) -> Iterator:
        """Iterate over items."""

    def by_page(self) -> Iterator[List]:
        """Iterate page by page."""

class DataLakeFileQueryError:
    """
    Error information from file query operations.

    Attributes:
        error (str): Error description
        is_fatal (bool): Whether the error is fatal
        description (str): Detailed error description
        position (int): Position in the query where error occurred
    """

class DataLakeFileQueryReader:
    """
    Reader for streaming query results from file query operations.

    Provides methods to read query results as streams, similar to StorageStreamDownloader
    but specifically for query operations.
    """

    def readall(self) -> bytes:
        """
        Read all query results.

        Returns:
            bytes: Complete query results
        """

    def readinto(self, stream) -> int:
        """
        Read query results into a stream.

        Args:
            stream: Target stream to write query results

        Returns:
            int: Number of bytes read
        """
```
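
The paged containers above offer two iteration styles: item by item through `__iter__`, or one page at a time through `by_page()`. The sketch below illustrates the difference with an in-memory stand-in. `PagedSketch` is a hypothetical class written for this example, not an SDK type, and it holds its pages up front instead of fetching them from the service.

```python
from typing import Iterator, List


class PagedSketch:
    """Hypothetical stand-in that mimics the two iteration styles."""

    def __init__(self, pages: List[List[str]]):
        self._pages = pages

    def by_page(self) -> Iterator[List[str]]:
        # Yield one page (a list of items) at a time.
        for page in self._pages:
            yield list(page)

    def __iter__(self) -> Iterator[str]:
        # Flatten the pages so callers see a single stream of items.
        for page in self.by_page():
            yield from page


paged = PagedSketch([["a.csv", "b.csv"], ["c.csv"]])
print(list(paged))                             # ['a.csv', 'b.csv', 'c.csv']
print([len(page) for page in paged.by_page()])  # [2, 1]
```

Item-wise iteration suits simple loops, while page-wise iteration is useful when work should be batched or checkpointed per service response.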

**Usage Examples:**

```python
from azure.storage.filedatalake import (
    DataLakeServiceClient,
    ContentSettings,
    PublicAccess,
    DelimitedTextDialect,
    CustomerProvidedEncryptionKey
)

# Create service client
service_client = DataLakeServiceClient(
    account_url="https://mystorageaccount.dfs.core.windows.net",
    credential="<account_key>"
)

# Create file system with custom properties
fs_client = service_client.create_file_system(
    "analytics-data",
    metadata={"department": "data-science", "project": "ml-pipeline"},
    public_access=PublicAccess.OFF
)

# Create the file, then describe its content
file_client = fs_client.create_file("data/results.csv")

# The charset belongs in content_type; content_encoding is for values such as 'gzip'
content_settings = ContentSettings(
    content_type="text/csv; charset=utf-8",
    cache_control="max-age=3600"
)

# Customer-provided encryption
cpk = CustomerProvidedEncryptionKey(
    key_value="<base64_key>",
    key_hash="<base64_hash>"
)

file_client.upload_data(
    "col1,col2,col3\nval1,val2,val3",
    overwrite=True,  # the file already exists after create_file()
    content_settings=content_settings,
    cpk=cpk,
    metadata={"format": "csv", "version": "1.0"}
)

# Query CSV file with custom dialect
csv_dialect = DelimitedTextDialect(
    delimiter=",",
    quote_char='"',
    has_header=True,
    line_terminator="\n"
)

query_result = file_client.query_file(
    "SELECT col1, col2 FROM BlobStorage WHERE col3 = 'val3'",
    file_format=csv_dialect
)

# Process results: readall() returns the complete result as bytes
data = query_result.readall().decode()
print(f"Query results: {data}")

# List file systems with properties
for fs_props in service_client.list_file_systems(include_metadata=True):
    print(f"File System: {fs_props.name}")
    print(f"  Last Modified: {fs_props.last_modified}")
    print(f"  Metadata: {fs_props.metadata}")
    print(f"  Public Access: {fs_props.public_access}")
```