# Directory Operations

The `DataLakeDirectoryClient` manages a single directory in a hierarchical file system: creating, renaming, and deleting the directory, managing its subdirectories and files, listing paths, and reading or changing access control lists (ACLs).

## Capabilities

### DataLakeDirectoryClient

Client for a specific directory. It provides operations on the directory's contents, its access control settings, and the hierarchy beneath it. Path-based operations are inherited from the underlying PathClient.

```python { .api }
class DataLakeDirectoryClient:
    """
    A client to interact with a specific directory in Azure Data Lake Storage Gen2.

    Attributes:
        url (str): The full endpoint URL to the directory, including SAS token if used
        primary_endpoint (str): The full primary endpoint URL
        primary_hostname (str): The hostname of the primary endpoint
        file_system_name (str): Name of the file system
        path_name (str): Path to the directory
    """

    def __init__(
        self,
        account_url: str,
        file_system_name: str,
        directory_name: str,
        credential=None,
        **kwargs
    ):
        """
        Initialize the DataLakeDirectoryClient.

        Args:
            account_url (str): The URL to the DataLake storage account
            file_system_name (str): Name of the file system
            directory_name (str): Name/path of the directory
            credential: Authentication credential
            **kwargs: Additional client configuration options
        """

    @classmethod
    def from_connection_string(
        cls,
        conn_str: str,
        file_system_name: str,
        directory_name: str,
        credential=None,
        **kwargs
    ) -> 'DataLakeDirectoryClient':
        """
        Create DataLakeDirectoryClient from connection string.

        Args:
            conn_str (str): Connection string for the storage account
            file_system_name (str): Name of the file system
            directory_name (str): Name/path of the directory
            credential: Optional credential to override connection string auth
            **kwargs: Additional client configuration options

        Returns:
            DataLakeDirectoryClient: The directory client instance
        """
```

**Usage Examples:**

```python
from azure.storage.filedatalake import DataLakeDirectoryClient

# Create client directly
directory_client = DataLakeDirectoryClient(
    account_url="https://mystorageaccount.dfs.core.windows.net",
    file_system_name="myfilesystem",
    directory_name="data/analytics",
    credential="<account_key>"
)

# From connection string
directory_client = DataLakeDirectoryClient.from_connection_string(
    "DefaultEndpointsProtocol=https;AccountName=mystorageaccount;AccountKey=<key>",
    file_system_name="myfilesystem",
    directory_name="data/analytics"
)
```

### Directory Management

Core operations for creating, deleting, and managing the directory itself.

```python { .api }
def create_directory(self, **kwargs) -> Dict[str, Any]:
    """
    Create the directory.

    Args:
        content_settings (ContentSettings, optional): Content settings for the directory
        metadata (dict, optional): Metadata key-value pairs
        permissions (str, optional): POSIX permissions in octal format
        umask (str, optional): POSIX umask for permission calculation
        **kwargs: Additional options including conditions and CPK

    Returns:
        dict: Directory creation response headers including etag and last_modified
    """

def delete_directory(self, **kwargs) -> None:
    """
    Delete the directory. The SDK issues the delete recursively, so the
    directory's contents are removed as well.

    Args:
        **kwargs: Additional options including conditions
    """

def exists(self, **kwargs) -> bool:
    """
    Check if the directory exists.

    Args:
        **kwargs: Additional options

    Returns:
        bool: True if directory exists, False otherwise
    """

def get_directory_properties(self, **kwargs) -> DirectoryProperties:
    """
    Get directory properties and metadata.

    Args:
        **kwargs: Additional options including conditions and user principal names

    Returns:
        DirectoryProperties: Properties of the directory including metadata, etag, permissions
    """

def rename_directory(
    self,
    new_name: str,
    **kwargs
) -> DataLakeDirectoryClient:
    """
    Rename the directory.

    Args:
        new_name (str): New name for the directory, in the format
            "{filesystem}/{directory}/{subdirectory}" (the file system name comes first)
        content_settings (ContentSettings, optional): Content settings for renamed directory
        metadata (dict, optional): Metadata for renamed directory
        **kwargs: Additional options including conditions

    Returns:
        DataLakeDirectoryClient: Client for the renamed directory
    """
```
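
The `permissions` and `umask` options follow POSIX conventions. The sketch below is an illustration only, not SDK code: it assumes the effective permissions are computed as `p & ~u` (the requested permission bits with the umask bits cleared), and the helper names `effective_permissions` and `to_symbolic` are hypothetical.

```python
import stat


def effective_permissions(permissions: str, umask: str) -> str:
    """Return the octal permissions that remain after applying a umask.

    Assumption: the result is p & ~u, as with a POSIX umask.
    """
    p = int(permissions, 8)
    u = int(umask, 8)
    # Keep only the permission bits and render as 4-digit octal.
    return format(p & ~u & 0o7777, "04o")


def to_symbolic(permissions: str) -> str:
    """Render octal permissions such as "0755" in rwx notation."""
    # stat.filemode() prefixes a file-type character; drop it.
    return stat.filemode(int(permissions, 8))[1:]


print(effective_permissions("0777", "0027"))  # 0750
print(to_symbolic("0755"))                    # rwxr-xr-x
```

Computing the result locally like this can be a quick sanity check before passing `permissions` and `umask` to `create_directory`.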

### Subdirectory Operations

Operations for creating and managing subdirectories within the current directory.

```python { .api }
def create_sub_directory(
    self,
    sub_directory: Union[DirectoryProperties, str],
    metadata: Dict[str, str] = None,
    **kwargs
) -> DataLakeDirectoryClient:
    """
    Create a subdirectory within the current directory.

    Args:
        sub_directory: Name of the subdirectory or DirectoryProperties object
        metadata (dict, optional): Metadata key-value pairs
        **kwargs: Additional options including permissions, umask, and conditions

    Returns:
        DataLakeDirectoryClient: Client for the created subdirectory
    """

def delete_sub_directory(
    self,
    sub_directory: Union[DirectoryProperties, str],
    **kwargs
) -> DataLakeDirectoryClient:
    """
    Delete a subdirectory, including its contents, from the current directory.

    Args:
        sub_directory: Name of the subdirectory or DirectoryProperties object
        **kwargs: Additional options including conditions

    Returns:
        DataLakeDirectoryClient: Client for the deleted subdirectory
    """

def get_sub_directory_client(
    self,
    sub_directory: Union[DirectoryProperties, str]
) -> DataLakeDirectoryClient:
    """
    Get a DataLakeDirectoryClient for a subdirectory.

    Args:
        sub_directory: Name of the subdirectory or DirectoryProperties object

    Returns:
        DataLakeDirectoryClient: Client for the specified subdirectory
    """
```
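
Subdirectory names are given relative to the current directory. As a rough sketch of how the resulting path can be reasoned about, the hypothetical helper `child_path` below joins a parent path and a subdirectory name with `/`. This is an assumption about the path layout for illustration, not the SDK's internal code.

```python
import posixpath


def child_path(parent: str, name: str) -> str:
    """Join a parent directory path and a subdirectory name with '/'."""
    parent = parent.strip("/")
    name = name.strip("/")
    if not name:
        raise ValueError("subdirectory name must not be empty")
    # An empty parent means the subdirectory sits at the file system root.
    return posixpath.join(parent, name) if parent else name


print(child_path("data/analytics", "raw"))  # data/analytics/raw
```

Under this assumption, the client returned by `create_sub_directory("raw")` on `data/analytics` would report `data/analytics/raw` as its `path_name`.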

### File Operations

Operations for creating and managing files within the directory.

```python { .api }
def create_file(
    self,
    file: Union[FileProperties, str],
    **kwargs
) -> DataLakeFileClient:
    """
    Create a file in the directory.

    Args:
        file: Name of the file or FileProperties object
        content_settings (ContentSettings, optional): Content settings for the file
        metadata (dict, optional): Metadata key-value pairs
        **kwargs: Additional options including permissions, umask, and conditions

    Returns:
        DataLakeFileClient: Client for the created file
    """

def get_file_client(
    self,
    file: Union[FileProperties, str]
) -> DataLakeFileClient:
    """
    Get a DataLakeFileClient for a file within the directory.

    Args:
        file: Name of the file or FileProperties object

    Returns:
        DataLakeFileClient: Client for the specified file
    """
```

### Path Listing

Operations for listing contents within the directory hierarchy.

```python { .api }
def get_paths(
    self,
    recursive: bool = True,
    max_results: int = None,
    **kwargs
) -> ItemPaged[PathProperties]:
    """
    List paths within the directory.

    Args:
        recursive (bool): Whether to list recursively through subdirectories
        max_results (int, optional): Maximum number of results per page
        **kwargs: Additional options including upn (user principal names)

    Returns:
        ItemPaged[PathProperties]: Paged list of path properties within the directory
    """
```
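
Because `get_paths` returns an iterable of path entries, a listing can be reduced to summary figures in a single pass. The sketch below assumes each entry exposes `is_directory` and `content_length` attributes; the helper name `summarize_paths` is hypothetical, and `SimpleNamespace` objects stand in for real entries so the example runs without a storage account.

```python
from types import SimpleNamespace
from typing import Iterable, Tuple


def summarize_paths(paths: Iterable) -> Tuple[int, int, int]:
    """Return (directory_count, file_count, total_file_bytes)."""
    directories = files = total_bytes = 0
    for path in paths:
        if path.is_directory:
            directories += 1
        else:
            files += 1
            # Treat a missing length as zero bytes.
            total_bytes += path.content_length or 0
    return directories, files, total_bytes


# Stand-in entries; a real listing would come from get_paths().
sample = [
    SimpleNamespace(name="data/analytics/raw", is_directory=True, content_length=0),
    SimpleNamespace(name="data/analytics/raw/a.csv", is_directory=False, content_length=1200),
    SimpleNamespace(name="data/analytics/raw/b.csv", is_directory=False, content_length=800),
]
summary = summarize_paths(sample)
print(summary)  # (1, 2, 2000)
```

With a live client, the same helper could be called as `summarize_paths(directory_client.get_paths(recursive=True))`.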

### Access Control Management

Operations for managing POSIX-style access control lists (ACLs) and permissions.

```python { .api }
def get_access_control(self, **kwargs) -> Dict[str, Any]:
    """
    Get access control properties for the directory.

    Args:
        upn (bool, optional): Return user principal names instead of object IDs
        **kwargs: Additional options including conditions

    Returns:
        dict: Access control information including ACL, group, owner, permissions
    """

def set_access_control(
    self,
    owner: str = None,
    group: str = None,
    permissions: str = None,
    acl: str = None,
    **kwargs
) -> Dict[str, Any]:
    """
    Set access control properties for the directory.

    Args:
        owner (str, optional): Owner user ID or principal name
        group (str, optional): Owning group ID or principal name
        permissions (str, optional): POSIX permissions in octal format.
            Mutually exclusive with acl
        acl (str, optional): Access control list in POSIX format.
            Mutually exclusive with permissions
        **kwargs: Additional options including conditions

    Returns:
        dict: Response headers including etag and last_modified
    """

def set_access_control_recursive(
    self,
    acl: str,
    **kwargs
) -> AccessControlChangeResult:
    """
    Set access control recursively on the directory and its contents.

    Args:
        acl (str): Access control list in POSIX format
        batch_size (int, optional): Number of paths to process per batch
        max_batches (int, optional): Maximum number of batches to process
        continue_on_failure (bool, optional): Continue processing on individual failures
        **kwargs: Additional options

    Returns:
        AccessControlChangeResult: Result including counters and failure information
    """

def update_access_control_recursive(
    self,
    acl: str,
    **kwargs
) -> AccessControlChangeResult:
    """
    Modify access control recursively on the directory and its contents.

    Args:
        acl (str): Access control list entries to add or change, in POSIX format
        batch_size (int, optional): Number of paths to process per batch
        max_batches (int, optional): Maximum number of batches to process
        continue_on_failure (bool, optional): Continue processing on individual failures
        **kwargs: Additional options

    Returns:
        AccessControlChangeResult: Result including counters and failure information
    """

def remove_access_control_recursive(
    self,
    acl: str,
    **kwargs
) -> AccessControlChangeResult:
    """
    Remove access control recursively from the directory and its contents.

    Args:
        acl (str): Access control list entries to remove, in the format
            "[scope:][type]:[id]" (entries carry no permissions)
        batch_size (int, optional): Number of paths to process per batch
        max_batches (int, optional): Maximum number of batches to process
        continue_on_failure (bool, optional): Continue processing on individual failures
        **kwargs: Additional options

    Returns:
        AccessControlChangeResult: Result including counters and failure information
    """
```
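
The `acl` arguments above are comma-separated entries of the form `[scope:][type]:[id]:[permissions]`, where the optional scope is `default`. A minimal sketch of parsing and rebuilding such strings follows; `AclEntry`, `parse_acl`, and `format_acl` are hypothetical helpers, not part of the SDK, and the parser only handles entries that include permissions (so not the shorter form used when removing entries).

```python
from typing import Iterable, List, NamedTuple


class AclEntry(NamedTuple):
    scope: str        # "access" or "default"
    kind: str         # "user", "group", "mask", or "other"
    qualifier: str    # object ID or principal name; "" for the owning entry
    permissions: str  # e.g. "r-x"


def parse_acl(acl: str) -> List[AclEntry]:
    """Split an ACL string into structured entries."""
    entries = []
    for raw in acl.split(","):
        parts = raw.strip().split(":")
        scope = "access"
        if parts[0] == "default":
            scope, parts = "default", parts[1:]
        if len(parts) != 3:
            raise ValueError(f"malformed ACL entry: {raw!r}")
        kind, qualifier, permissions = parts
        entries.append(AclEntry(scope, kind, qualifier, permissions))
    return entries


def format_acl(entries: Iterable[AclEntry]) -> str:
    """Rebuild the comma-separated ACL string from entries."""
    parts = []
    for entry in entries:
        prefix = "default:" if entry.scope == "default" else ""
        parts.append(f"{prefix}{entry.kind}:{entry.qualifier}:{entry.permissions}")
    return ",".join(parts)


entries = parse_acl("user::rwx,group::r-x,other::r-x,user:analyst1:rwx")
print(entries[3].qualifier)  # analyst1
```

Working with structured entries makes it easier to add or drop a single principal before passing the rebuilt string to `set_access_control` or one of the recursive variants.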

**Usage Examples:**

```python
from azure.storage.filedatalake import DataLakeDirectoryClient

# Create a directory client
directory_client = DataLakeDirectoryClient(
    account_url="https://mystorageaccount.dfs.core.windows.net",
    file_system_name="myfilesystem",
    directory_name="data/analytics",
    credential="<account_key>"
)

# Create the directory with metadata and permissions
directory_client.create_directory(
    metadata={"purpose": "analytics", "team": "data-science"},
    permissions="0755"  # rwxr-xr-x
)

# Create subdirectories
raw_dir = directory_client.create_sub_directory("raw")
processed_dir = directory_client.create_sub_directory("processed")

# List all paths in the directory
paths = directory_client.get_paths(recursive=True)
for path in paths:
    print(f"Path: {path.name}, Size: {path.content_length if not path.is_directory else 'N/A'}")

# Set access control with POSIX ACLs.
# permissions and acl are mutually exclusive, so only acl is passed here;
# its base entries already encode rwxr-xr-x.
directory_client.set_access_control(
    owner="user1",
    group="datagroup",
    acl="user::rwx,group::r-x,other::r-x,user:analyst1:rwx"
)

# Apply ACLs recursively to all contents
acl_result = directory_client.set_access_control_recursive(
    acl="user::rwx,group::r-x,other::r-x,user:analyst1:rwx",
    continue_on_failure=True
)
print(f"ACL changes: {acl_result.counters.directories_successful} directories, "
      f"{acl_result.counters.files_successful} files")

# Rename the directory. new_name must start with the file system name:
# "{filesystem}/{directory}/{subdirectory}"
new_directory_client = directory_client.rename_directory(
    f"{directory_client.file_system_name}/data/analytics-v2"
)
```
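
The `new_name` argument of `rename_directory` is expected in the form `{filesystem}/{directory}/{subdirectory}`, so passing only the directory path would cause its first segment to be read as the file system. A small sketch of building that value is shown here; the helper name `rename_target` is hypothetical and not part of the SDK.

```python
def rename_target(file_system_name: str, new_path: str) -> str:
    """Build a rename target of the form "{filesystem}/{path}"."""
    file_system_name = file_system_name.strip("/")
    new_path = new_path.strip("/")
    if not file_system_name or not new_path:
        raise ValueError("both file system name and new path are required")
    return f"{file_system_name}/{new_path}"


print(rename_target("myfilesystem", "data/analytics-v2"))
# myfilesystem/data/analytics-v2
```

With a live client, the result could be passed as `directory_client.rename_directory(rename_target(directory_client.file_system_name, "data/analytics-v2"))`.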