Microsoft Azure File DataLake Storage Client Library for Python
npx @tessl/cli install tessl/pypi-azure-storage-file-datalake@12.21.00
# Azure Storage File DataLake

Azure Storage File DataLake is the Python client library for Azure Data Lake Storage Gen2. It lets developers work with storage accounts that have a hierarchical namespace enabled, offering atomic directory operations (create, rename, delete) and fine-grained access control through ACLs. Supported authentication methods include SAS tokens, storage account shared keys, and Azure Identity credentials.

## Package Information

- **Package Name**: azure-storage-file-datalake
- **Language**: Python
- **Installation**: `pip install azure-storage-file-datalake`

## Core Imports

```python
from azure.storage.filedatalake import (
    DataLakeServiceClient,
    FileSystemClient,
    DataLakeDirectoryClient,
    DataLakeFileClient,
    DataLakeLeaseClient
)
```

For async operations:

```python
from azure.storage.filedatalake.aio import (
    DataLakeServiceClient,
    FileSystemClient,
    DataLakeDirectoryClient,
    DataLakeFileClient,
    DataLakeLeaseClient
)
```
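
With the async clients, calls such as `download_file` and `readall` are coroutines and have to be awaited. The following is a minimal sketch, assuming an async `DataLakeFileClient` has already been constructed elsewhere; `read_text` is a hypothetical helper name, not part of the SDK.

```python
import asyncio


async def read_text(file_client, encoding: str = "utf-8") -> str:
    """Download a whole file through an async file client and decode it.

    Hypothetical helper: `file_client` is assumed to be an instance of
    azure.storage.filedatalake.aio.DataLakeFileClient.
    """
    # In the aio package both calls are coroutines and must be awaited.
    downloader = await file_client.download_file()
    data = await downloader.readall()
    return data.decode(encoding)


# Typical entry point (needs a real async client, so it is left commented out):
# text = asyncio.run(read_text(file_client))
```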

## Basic Usage

```python
from azure.storage.filedatalake import DataLakeServiceClient

# Initialize service client
service_client = DataLakeServiceClient(
    account_url="https://mystorageaccount.dfs.core.windows.net",
    credential="<account_key>"
)

# Create a file system
file_system_client = service_client.create_file_system("myfilesystem")

# Create a directory
directory_client = file_system_client.create_directory("mydirectory")

# Upload a file
file_client = directory_client.create_file("myfile.txt")
file_client.upload_data("Hello, Data Lake!", overwrite=True)

# Download the file
download = file_client.download_file()
content = download.readall()
print(content.decode())
```

## Architecture

The Azure Storage File DataLake SDK follows a hierarchical client architecture:

- **DataLakeServiceClient**: Top-level client for account-wide operations (creating and listing file systems, managing service properties)
- **FileSystemClient**: File system-level operations (creating directories/files, managing access policies, listing paths)
- **DataLakeDirectoryClient**: Directory-specific operations (creating subdirectories/files, managing ACLs, renaming)
- **DataLakeFileClient**: File-specific operations (upload/download data, append operations, managing file properties)
- **DataLakeLeaseClient**: Lease management for exclusive access control
- **StorageStreamDownloader**: Streaming download operations for large files

This design gives fine-grained control over data lake resources while following a logical progression from account → file system → directory → file.
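
The account → file system → directory → file progression can be sketched as a chain of client getters. This example assumes the `get_file_system_client`, `get_directory_client`, and `get_file_client` methods, which hand back child clients for a given name; `get_file_client_for` and its path format are hypothetical and only illustrate the layering.

```python
def get_file_client_for(service_client, path: str):
    """Resolve 'filesystem/dir/.../file' to a file client, one level at a time.

    Hypothetical helper: expects a file system name, at least one directory,
    and a file name, separated by '/'.
    """
    file_system_name, _, remainder = path.strip("/").partition("/")
    directory_path, _, file_name = remainder.rpartition("/")
    if not (file_system_name and directory_path and file_name):
        raise ValueError(f"expected 'filesystem/directory/file', got {path!r}")

    # Each getter returns the client for the next level down the hierarchy.
    file_system_client = service_client.get_file_system_client(file_system_name)
    directory_client = file_system_client.get_directory_client(directory_path)
    return directory_client.get_file_client(file_name)
```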

## Capabilities

### Service Operations

Account-level operations for managing file systems, user delegation keys, and service properties. This client is the entry point for accessing Data Lake Storage Gen2 resources.

```python { .api }
class DataLakeServiceClient:
    def __init__(self, account_url: str, credential=None, **kwargs): ...
    def create_file_system(self, file_system: str, **kwargs) -> FileSystemClient: ...
    def list_file_systems(self, **kwargs) -> ItemPaged[FileSystemProperties]: ...
    def get_file_system_client(self, file_system: str) -> FileSystemClient: ...
```
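
As an illustration of combining these methods, the sketch below returns a client for a file system and creates it only when it is not already listed. `ensure_file_system` is a hypothetical helper, and the check-then-create sequence is not atomic, so a concurrent creator could still cause `create_file_system` to fail.

```python
def ensure_file_system(service_client, name: str):
    """Return a client for `name`, creating the file system only if missing.

    Hypothetical helper built on the methods listed above.
    """
    # list_file_systems() yields FileSystemProperties objects with a `name`.
    existing = {fs.name for fs in service_client.list_file_systems()}
    if name in existing:
        return service_client.get_file_system_client(name)
    return service_client.create_file_system(name)
```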

[Service Operations](./service-operations.md)

### File System Operations

File system-level operations for managing directories, files, and access policies within a specific file system.

```python { .api }
class FileSystemClient:
    def __init__(self, account_url: str, file_system_name: str, credential=None, **kwargs): ...
    def create_directory(self, directory: str, **kwargs) -> DataLakeDirectoryClient: ...
    def create_file(self, file: str, **kwargs) -> DataLakeFileClient: ...
    def get_paths(self, **kwargs) -> ItemPaged[PathProperties]: ...
```
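
To show how `get_paths` results might be consumed, here is a small sketch that sums the size of every file under a directory. It assumes the `path` and `recursive` keyword arguments of `get_paths` and the `is_directory` and `content_length` attributes of `PathProperties`; `total_file_bytes` is a hypothetical helper name.

```python
def total_file_bytes(file_system_client, directory: str) -> int:
    """Sum the sizes of all files below `directory` (hypothetical helper)."""
    paths = file_system_client.get_paths(path=directory, recursive=True)
    # Directories are listed alongside files, so they are skipped here.
    return sum(p.content_length or 0 for p in paths if not p.is_directory)
```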

[File System Operations](./file-system-operations.md)

### Directory Operations

Directory-specific operations for managing subdirectories, files, and access control lists within hierarchical structures.

```python { .api }
class DataLakeDirectoryClient:
    def __init__(self, account_url: str, file_system_name: str, directory_name: str, credential=None, **kwargs): ...
    def create_sub_directory(self, sub_directory: str, **kwargs) -> DataLakeDirectoryClient: ...
    def create_file(self, file: str, **kwargs) -> DataLakeFileClient: ...
    def rename_directory(self, new_name: str, **kwargs) -> DataLakeDirectoryClient: ...
```
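
One detail worth illustrating is the `new_name` argument of `rename_directory`. The sketch below assumes, based on the SDK's documented behavior, that `new_name` must be prefixed with the file system name (`"{filesystem}/{directory}"`) and that the client exposes a `file_system_name` attribute; `rename_within_file_system` is a hypothetical helper.

```python
def rename_within_file_system(directory_client, new_directory_path: str):
    """Rename a directory while staying in the same file system.

    Hypothetical helper: builds the '{filesystem}/{directory}' value that
    rename_directory is assumed to expect for `new_name`.
    """
    new_name = f"{directory_client.file_system_name}/{new_directory_path.strip('/')}"
    return directory_client.rename_directory(new_name=new_name)
```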

[Directory Operations](./directory-operations.md)

### File Operations

File-specific operations for uploading, downloading, appending data, and managing file properties and metadata.

```python { .api }
class DataLakeFileClient:
    def __init__(self, account_url: str, file_system_name: str, file_path: str, credential=None, **kwargs): ...
    def upload_data(self, data, **kwargs) -> Dict[str, Any]: ...
    def download_file(self, **kwargs) -> StorageStreamDownloader: ...
    def append_data(self, data, offset: int, **kwargs) -> Dict[str, Any]: ...
```
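
Unlike `upload_data`, `append_data` only stages bytes at a given offset; they are committed by a later call to `flush_data`, which is part of the SDK but not listed above. The following sketch assumes the file already exists (for example via `create_file`) and that each chunk is a `bytes` object; `append_in_chunks` is a hypothetical helper.

```python
def append_in_chunks(file_client, chunks) -> int:
    """Append byte chunks in order, then commit them with one flush.

    Hypothetical helper; returns the total number of bytes written.
    """
    offset = 0
    for chunk in chunks:
        # Each append starts where the previous one ended.
        file_client.append_data(chunk, offset=offset, length=len(chunk))
        offset += len(chunk)
    # flush_data takes the final length of the file after all appends.
    file_client.flush_data(offset)
    return offset
```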

[File Operations](./file-operations.md)

### Access Control and Security

Access control management, including POSIX-style ACLs, SAS token generation, and lease-based concurrency control.

```python { .api }
def generate_file_system_sas(account_name: str, file_system_name: str, credential, **kwargs) -> str: ...
def generate_directory_sas(account_name: str, file_system_name: str, directory_name: str, credential, **kwargs) -> str: ...
def generate_file_sas(account_name: str, file_system_name: str, directory_name: str, file_name: str, credential, **kwargs) -> str: ...

class DataLakeLeaseClient:
    def acquire(self, **kwargs) -> None: ...
    def renew(self, **kwargs) -> None: ...
    def release(self, **kwargs) -> None: ...
```
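
Lease-based concurrency control usually pairs `acquire` with a guaranteed `release`. The sketch below wraps that pairing in a context manager; it assumes `acquire` accepts a `lease_duration` keyword in seconds, and `held_lease` is a hypothetical helper rather than an SDK function.

```python
from contextlib import contextmanager


@contextmanager
def held_lease(lease_client, duration_seconds: int = 15):
    """Hold a lease for the duration of a `with` block (hypothetical helper)."""
    lease_client.acquire(lease_duration=duration_seconds)
    try:
        yield lease_client
    finally:
        # Release even if the body raised, so the lease is not left held.
        lease_client.release()
```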

[Access Control and Security](./access-control-security.md)

### Models and Types

Core data models, properties, permissions, and configuration classes used throughout the SDK.

```python { .api }
class FileSystemProperties:
    name: str
    last_modified: datetime
    etag: str
    metadata: Dict[str, str]

class DirectoryProperties:
    name: str
    last_modified: datetime
    etag: str
    permissions: str

class FileProperties:
    name: str
    size: int
    last_modified: datetime
    etag: str
    content_settings: ContentSettings
```
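
As a small illustration of reading these models, the sketch below formats a one-line summary from the `FileProperties` fields listed above. It assumes `ContentSettings` exposes a `content_type` attribute that may be unset; `describe_file` is a hypothetical helper.

```python
def describe_file(props) -> str:
    """Build a one-line summary from a FileProperties-like object."""
    # content_type is assumed to be optional, so fall back to a placeholder.
    content_type = props.content_settings.content_type or "unknown type"
    modified = props.last_modified.isoformat()
    return f"{props.name}: {props.size} bytes, {content_type}, modified {modified}"
```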

[Models and Types](./models-types.md)