# AbstractFileSystem Interface

`AbstractFileSystem` is the base class for all filesystem implementations. It defines a unified interface with consistent methods for file operations, directory management, and metadata access across storage backends, which is what enables storage-agnostic code.

## Capabilities

### File Opening and Access

Core methods for opening files and for reading and writing data, with support for various modes and options.

```python { .api }
def open(self, path, mode='rb', **kwargs):
    """
    Open a file-like object.

    Parameters:
    - path: str, file path
    - mode: str, opening mode ('r', 'w', 'a', 'rb', 'wb', etc.)
    - **kwargs: additional options (block_size, cache_type, etc.)

    Returns:
    File-like object
    """

def cat_file(self, path, start=None, end=None, **kwargs):
    """
    Read file contents as bytes.

    Parameters:
    - path: str, file path
    - start: int, byte offset to start reading
    - end: int, byte offset to stop reading

    Returns:
    bytes, file contents
    """

def pipe_file(self, path, value, **kwargs):
    """
    Write bytes to a file.

    Parameters:
    - path: str, file path
    - value: bytes, data to write
    """

def read_text(self, path, encoding=None, **kwargs):
    """
    Read file contents as text.

    Parameters:
    - path: str, file path
    - encoding: str, text encoding

    Returns:
    str, file contents
    """

def write_text(self, path, value, encoding=None, **kwargs):
    """
    Write text to a file.

    Parameters:
    - path: str, file path
    - value: str, text to write
    - encoding: str, text encoding
    """
```
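A minimal sketch of how these access methods fit together, assuming fsspec's built-in in-memory backend (`memory`) so that no credentials or disk access are needed. The path is hypothetical and chosen only for illustration.

```python
import fsspec

# The in-memory backend keeps the example self-contained.
fs = fsspec.filesystem("memory")
path = "/access-demo/notes.txt"  # hypothetical path

# Write raw bytes, then read them back whole and as a byte range.
fs.pipe_file(path, b"hello world")
whole = fs.cat_file(path)
first_five = fs.cat_file(path, start=0, end=5)

# The text helpers encode and decode on the way through.
fs.write_text(path, "plain text", encoding="utf-8")
text = fs.read_text(path, encoding="utf-8")

# open() returns a file-like object that works as a context manager.
with fs.open(path, "rb") as f:
    raw = f.read()
```

The same calls should work unchanged against another backend once `fs` is created for a different protocol, which is the point of the shared interface.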

### Directory Operations

Methods for creating, listing, and managing directories across different storage backends.

```python { .api }
def ls(self, path, detail=True, **kwargs):
    """
    List directory contents.

    Parameters:
    - path: str, directory path
    - detail: bool, return detailed info or just names

    Returns:
    list of dicts (detail=True) or list of path strings (detail=False)
    """

def mkdir(self, path, create_parents=True, **kwargs):
    """
    Create a directory.

    Parameters:
    - path: str, directory path
    - create_parents: bool, create parent directories if needed
    """

def makedirs(self, path, exist_ok=False):
    """
    Create directories recursively.

    Parameters:
    - path: str, directory path
    - exist_ok: bool, don't raise error if directory exists
    """

def rmdir(self, path):
    """
    Remove an empty directory.

    Parameters:
    - path: str, directory path
    """
```
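The directory methods can be exercised as follows. This is a sketch against the in-memory backend with hypothetical paths; object stores that have no real directories may treat `mkdir` and `rmdir` differently.

```python
import fsspec

fs = fsspec.filesystem("memory")
root = "/dir-demo"  # hypothetical root for this example

# Create nested directories, then place one file inside.
fs.makedirs(root + "/a/b", exist_ok=True)
fs.pipe_file(root + "/a/b/data.bin", b"\x00\x01")

# detail=False yields path strings, detail=True yields info dicts.
names = fs.ls(root + "/a/b", detail=False)
entries = fs.ls(root + "/a/b", detail=True)

# rmdir only removes directories that are empty.
fs.makedirs(root + "/empty", exist_ok=True)
fs.rmdir(root + "/empty")
empty_still_exists = fs.exists(root + "/empty")
```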

### File and Directory Information

Methods for querying metadata, checking existence, and getting file properties.

```python { .api }
def exists(self, path, **kwargs):
    """
    Check if path exists.

    Parameters:
    - path: str, file or directory path

    Returns:
    bool, True if path exists
    """

def isdir(self, path):
    """
    Check if path is a directory.

    Parameters:
    - path: str, path to check

    Returns:
    bool, True if path is directory
    """

def isfile(self, path):
    """
    Check if path is a file.

    Parameters:
    - path: str, path to check

    Returns:
    bool, True if path is file
    """

def info(self, path, **kwargs):
    """
    Get detailed information about a path.

    Parameters:
    - path: str, file or directory path

    Returns:
    dict, file metadata (size, type, mtime, etc.)
    """

def size(self, path):
    """
    Get file size in bytes.

    Parameters:
    - path: str, file path

    Returns:
    int, file size in bytes
    """

def checksum(self, path):
    """
    Get file checksum/hash.

    Parameters:
    - path: str, file path

    Returns:
    int in the base implementation (backends may override this),
    a value that changes when the file changes
    """

def created(self, path):
    """
    Get file creation time.

    Parameters:
    - path: str, file path

    Returns:
    datetime, creation time
    """

def modified(self, path):
    """
    Get file modification time.

    Parameters:
    - path: str, file path

    Returns:
    datetime, modification time
    """
```
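A short sketch of the metadata queries, again assuming the in-memory backend and a hypothetical path. The exact keys in the `info` dict vary by backend; `name`, `size`, and `type` are the ones relied on here.

```python
import fsspec

fs = fsspec.filesystem("memory")
path = "/meta-demo/report.csv"  # hypothetical path
fs.pipe_file(path, b"a,b\n1,2\n")

# Existence and type checks return plain booleans.
file_exists = fs.exists(path)
is_file = fs.isfile(path)
parent_is_dir = fs.isdir("/meta-demo")
missing_exists = fs.exists("/meta-demo/missing.csv")

# info() returns a dict; size() is a shortcut for its "size" entry.
details = fs.info(path)
n_bytes = fs.size(path)
```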

### Pattern Matching and Discovery

Methods for finding files using patterns, globbing, and walking directory trees.

```python { .api }
def find(self, path, maxdepth=None, withdirs=False, detail=False, **kwargs):
    """
    Find files recursively.

    Parameters:
    - path: str, starting path
    - maxdepth: int, maximum recursion depth
    - withdirs: bool, include directories in results
    - detail: bool, return detailed info or just paths

    Returns:
    list of paths, or dict mapping path to info if detail=True
    """

def glob(self, path, maxdepth=None, **kwargs):
    """
    Find files matching glob pattern.

    Parameters:
    - path: str, glob pattern
    - maxdepth: int, maximum recursion depth

    Returns:
    list, matching file paths
    """

def walk(self, path, maxdepth=None, topdown=True, **kwargs):
    """
    Walk directory tree.

    Parameters:
    - path: str, starting directory
    - maxdepth: int, maximum recursion depth
    - topdown: bool, visit directories top-down or bottom-up

    Returns:
    generator, yields (dirpath, dirnames, filenames) tuples
    """

def du(self, path, total=True, maxdepth=None, **kwargs):
    """
    Calculate disk usage.

    Parameters:
    - path: str, directory path
    - total: bool, return total size or per-file breakdown
    - maxdepth: int, maximum recursion depth

    Returns:
    int or dict, total size or size breakdown
    """
```
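The discovery methods can be compared side by side on a small tree. This sketch assumes the in-memory backend and hypothetical file names; it uses a single-level `*` pattern because the handling of `**` has differed between fsspec releases.

```python
import fsspec

fs = fsspec.filesystem("memory")
root = "/tree-demo"  # hypothetical root
fs.pipe_file(root + "/a.txt", b"aaa")
fs.pipe_file(root + "/sub/b.txt", b"bbbb")
fs.pipe_file(root + "/sub/c.log", b"ccccc")

# find() descends recursively and returns file paths.
all_files = fs.find(root)

# glob() matches a pattern within one directory level here.
txt_in_sub = fs.glob(root + "/sub/*.txt")

# walk() yields (dirpath, dirnames, filenames), much like os.walk.
walked_names = sorted(
    name for _, _, filenames in fs.walk(root) for name in filenames
)

# du() sums file sizes, or reports them per path with total=False.
total_bytes = fs.du(root)
per_file = fs.du(root, total=False)
```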

### File Transfer Operations

Methods for moving, copying, uploading, and downloading files between locations.

```python { .api }
def get_file(self, rpath, lpath, **kwargs):
    """
    Download a single file to local filesystem.

    Parameters:
    - rpath: str, remote file path
    - lpath: str, local file path
    """

def get(self, rpath, lpath, recursive=False, **kwargs):
    """
    Download files/directories to local filesystem.

    Parameters:
    - rpath: str, remote path
    - lpath: str, local path
    - recursive: bool, download directories recursively
    """

def put_file(self, lpath, rpath, **kwargs):
    """
    Upload a single file from local filesystem.

    Parameters:
    - lpath: str, local file path
    - rpath: str, remote file path
    """

def put(self, lpath, rpath, recursive=False, **kwargs):
    """
    Upload files/directories from local filesystem.

    Parameters:
    - lpath: str, local path
    - rpath: str, remote path
    - recursive: bool, upload directories recursively
    """

def copy(self, path1, path2, recursive=False, **kwargs):
    """
    Copy files/directories within filesystem.

    Parameters:
    - path1: str, source path
    - path2: str, destination path
    - recursive: bool, copy directories recursively
    """

def mv(self, path1, path2, recursive=False, **kwargs):
    """
    Move/rename files/directories.

    Parameters:
    - path1: str, source path
    - path2: str, destination path
    - recursive: bool, move directories recursively
    """
```
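As a rough illustration of the transfer methods, the sketch below treats the in-memory backend as the "remote" side and a temporary directory as the local side. All paths are hypothetical.

```python
import os
import tempfile

import fsspec

fs = fsspec.filesystem("memory")  # stands in for a remote store
remote_dir = "/transfer-demo"     # hypothetical remote directory

with tempfile.TemporaryDirectory() as tmp:
    # Prepare a local file to upload.
    local_src = os.path.join(tmp, "upload.txt")
    with open(local_src, "wb") as f:
        f.write(b"payload")

    # Upload, copy within the filesystem, then rename the copy.
    fs.put_file(local_src, remote_dir + "/upload.txt")
    fs.copy(remote_dir + "/upload.txt", remote_dir + "/copy.txt")
    fs.mv(remote_dir + "/copy.txt", remote_dir + "/renamed.txt")

    # Download the renamed file and read it back locally.
    local_dst = os.path.join(tmp, "download.txt")
    fs.get_file(remote_dir + "/renamed.txt", local_dst)
    with open(local_dst, "rb") as f:
        downloaded = f.read()

remaining = sorted(
    p.rsplit("/", 1)[-1] for p in fs.ls(remote_dir, detail=False)
)
```

`get` and `put` follow the same argument order as their single-file counterparts and add `recursive=True` for whole directories.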

### File Removal

Methods for deleting files and directories with various options for handling recursive deletion.

```python { .api }
def rm_file(self, path):
    """
    Remove a single file.

    Parameters:
    - path: str, file path
    """

def rm(self, path, recursive=False, maxdepth=None):
    """
    Remove files/directories.

    Parameters:
    - path: str, path to remove
    - recursive: bool, remove directories recursively
    - maxdepth: int, maximum recursion depth
    """
```
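A small sketch contrasting `rm_file` with recursive `rm`, assuming the in-memory backend and hypothetical paths.

```python
import fsspec

fs = fsspec.filesystem("memory")
root = "/rm-demo"  # hypothetical root
fs.pipe_file(root + "/keep.txt", b"k")
fs.pipe_file(root + "/old/a.txt", b"a")
fs.pipe_file(root + "/old/b.txt", b"b")

# Remove exactly one file.
fs.rm_file(root + "/old/a.txt")

# Remove a directory and everything beneath it.
fs.rm(root + "/old", recursive=True)

left = fs.find(root)
```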

### Bulk Operations

Methods for reading or writing several files in one call, plus helpers for partial reads and file creation. Asynchronous backends may issue the underlying requests concurrently.

```python { .api }
def cat(self, path, recursive=False, **kwargs):
    """
    Read multiple files.

    Parameters:
    - path: str or list, file path(s) or pattern
    - recursive: bool, include files in subdirectories

    Returns:
    bytes or dict, file contents (single file) or mapping (multiple files)
    """

def pipe(self, path, value=None, **kwargs):
    """
    Write to multiple files.

    Parameters:
    - path: str or dict, file path(s) or path->data mapping
    - value: bytes, data to write (if path is str)
    """

def head(self, path, size=1024):
    """
    Read beginning of file.

    Parameters:
    - path: str, file path
    - size: int, number of bytes to read

    Returns:
    bytes, file head content
    """

def tail(self, path, size=1024):
    """
    Read end of file.

    Parameters:
    - path: str, file path
    - size: int, number of bytes to read

    Returns:
    bytes, file tail content
    """

def touch(self, path, truncate=True, **kwargs):
    """
    Create empty file or update timestamp.

    Parameters:
    - path: str, file path
    - truncate: bool, truncate file if it exists
    """
```
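The following sketch shows the dict form of `pipe`, the list form of `cat`, and the partial-read helpers. It assumes the in-memory backend; file names and contents are made up for the example.

```python
import fsspec

fs = fsspec.filesystem("memory")
root = "/bulk-demo"  # hypothetical root

# pipe() accepts a mapping of path -> bytes to write several files at once.
fs.pipe({
    root + "/a.txt": b"alpha",
    root + "/b.txt": b"bravo-charlie",
})

# cat() with a list returns a dict keyed by path.
contents = fs.cat([root + "/a.txt", root + "/b.txt"])

# head() and tail() read a fixed number of bytes from either end.
start = fs.head(root + "/b.txt", size=5)
end = fs.tail(root + "/b.txt", size=7)

# touch() creates an empty file when none exists.
fs.touch(root + "/empty.txt")
empty_size = fs.size(root + "/empty.txt")
```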

### Advanced Operations

Advanced filesystem operations including unique key generation, path expansion, and utility methods.

```python { .api }
def ukey(self, path):
    """
    Generate unique key for file.

    Parameters:
    - path: str, file path

    Returns:
    str, hash of the file's properties (changes when the file changes)
    """

def expand_path(self, path, recursive=False, **kwargs):
    """
    Expand path patterns to actual paths.

    Parameters:
    - path: str, path pattern
    - recursive: bool, expand recursively

    Returns:
    list, expanded paths
    """
```
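One way to use `ukey` is as a cheap change detector, and `expand_path` as a way to resolve a pattern before acting on it. The sketch below assumes the in-memory backend and a hypothetical path; the key format itself is backend-specific and should be treated as opaque.

```python
import fsspec

fs = fsspec.filesystem("memory")
path = "/adv-demo/x.txt"  # hypothetical path

# The key is derived from file properties, so it changes with the content size.
fs.pipe_file(path, b"v1")
key_before = fs.ukey(path)
fs.pipe_file(path, b"version two")
key_after = fs.ukey(path)

# expand_path() resolves a glob pattern into concrete paths.
matches = fs.expand_path("/adv-demo/*.txt")
```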

### Transaction Support

Methods for managing filesystem transactions, which defer finalizing written files until the transaction ends so that a group of files is committed or discarded together. Support depends on the backend.

```python { .api }
def start_transaction(self):
    """
    Start a filesystem transaction.

    Returns:
    Transaction, transaction context
    """

def end_transaction(self):
    """End the current transaction."""
```
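A minimal sketch of deferred commits, assuming the in-memory backend (which supports transactions for files written through `open`). It uses the `transaction` property as a context manager; the path is hypothetical.

```python
import fsspec

fs = fsspec.filesystem("memory")
path = "/txn-demo/result.bin"  # hypothetical path

with fs.transaction:
    # Files opened for writing inside the block are held back.
    with fs.open(path, "wb") as f:
        f.write(b"pending")
    visible_inside = fs.exists(path)

# Leaving the block without an error commits the deferred files.
visible_after = fs.exists(path)
committed = fs.cat_file(path)
```

If the block raises, the deferred files are discarded instead of committed. Backends that write through a different code path than `open` may not defer anything.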

### Caching and Performance

Methods for managing cached directory listings and cached filesystem instances.

```python { .api }
def invalidate_cache(self, path=None):
    """
    Discard cached directory listings.

    Parameters:
    - path: str, specific path to invalidate (None for all)
    """

@classmethod
def clear_instance_cache(cls):
    """Clear all cached filesystem instances."""
```
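Instance caching can be observed directly. The sketch assumes the in-memory backend, whose class is cachable, and that both calls run in the same thread and process.

```python
import fsspec

# Equal constructor arguments return the same cached instance.
fs1 = fsspec.filesystem("memory")
fs2 = fsspec.filesystem("memory")
same_instance = fs1 is fs2

# Drop cached directory listings (harmless for backends that keep none).
fs1.invalidate_cache()

# Clearing the instance cache forces the next call to build a new object.
type(fs1).clear_instance_cache()
fs3 = fsspec.filesystem("memory")
fresh_instance = fs3 is not fs1
```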

### Serialization and Mapping

Methods for serializing filesystem instances and creating dictionary-like interfaces.

```python { .api }
def get_mapper(self, root="", check=False, create=False):
    """
    Get dictionary-like interface to filesystem.

    Parameters:
    - root: str, root path for mapping
    - check: bool, check if root exists
    - create: bool, create root if it doesn't exist

    Returns:
    FSMap, dictionary-like interface
    """

def to_json(self, *, include_password=True):
    """
    Serialize filesystem to JSON.

    Parameters:
    - include_password: bool, include sensitive information (keyword-only)

    Returns:
    str, JSON representation
    """

def to_dict(self, *, include_password=True):
    """
    Serialize filesystem to dictionary.

    Parameters:
    - include_password: bool, include sensitive information (keyword-only)

    Returns:
    dict, dictionary representation
    """

@staticmethod
def from_json(blob):
    """
    Deserialize filesystem from JSON.

    Parameters:
    - blob: str, JSON representation

    Returns:
    AbstractFileSystem, deserialized instance
    """

@staticmethod
def from_dict(dct):
    """
    Deserialize filesystem from dictionary.

    Parameters:
    - dct: dict, dictionary representation

    Returns:
    AbstractFileSystem, deserialized instance
    """
```
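A brief sketch of a JSON round trip and of the mapper interface, assuming the in-memory backend. The root path and key are hypothetical, and `to_dict`/`from_dict` are left out because they may not exist in older fsspec releases.

```python
import json

import fsspec

fs = fsspec.filesystem("memory")

# Round-trip the filesystem description through JSON.
blob = fs.to_json()
restored = fsspec.AbstractFileSystem.from_json(blob)
described_protocol = json.loads(blob)["protocol"]

# The mapper exposes files under a root as a mutable mapping of bytes.
store = fs.get_mapper("/map-demo")  # hypothetical root
store["key"] = b"value"
keys = list(store)
value = store["key"]
on_disk = fs.cat_file("/map-demo/key")
```

Serialized output can contain credentials for real backends, so `include_password=False` is worth considering before logging or storing it.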

## Key Properties

```python { .api }
protocol: str or list
"""Protocol name(s) handled by this filesystem"""

sep: str
"""Path separator (default '/')"""

blocksize: int
"""Default block size for reading operations"""

cachable: bool
"""Whether filesystem instances should be cached"""

transaction: Transaction
"""Current transaction context (if any)"""
```
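These attributes can be read from any instance. The sketch below inspects the in-memory backend; the values shown in the checks are specific to that backend and are not defaults for every implementation.

```python
import fsspec

fs = fsspec.filesystem("memory")

# protocol may be a single string or a sequence of aliases.
protocols = [fs.protocol] if isinstance(fs.protocol, str) else list(fs.protocol)

separator = fs.sep
is_cachable = fs.cachable
block_size = fs.blocksize
```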

## Usage Patterns

### Direct Filesystem Usage

```python
import fsspec

# Get filesystem instance
fs = fsspec.filesystem('s3', key='...', secret='...')

# Use filesystem methods directly
files = fs.ls('bucket/path/')
content = fs.cat_file('bucket/path/file.txt')
fs.pipe_file('bucket/path/output.txt', b'data')

# File operations
fs.copy('bucket/source.txt', 'bucket/backup.txt')
fs.rm('bucket/old_file.txt')
```

### Context Manager Pattern

```python
# Defer commits until the block exits (support depends on the backend)
with fs.transaction:
    with fs.open('bucket/file1.txt', 'wb') as f:
        f.write(b'data1')
    with fs.open('bucket/file2.txt', 'wb') as f:
        f.write(b'data2')
# Both files are committed when the block exits without an error
```

### Subclassing for Custom Filesystems

```python
import fsspec

class MyFileSystem(fsspec.AbstractFileSystem):
    protocol = 'myfs'

    def _open(self, path, mode='rb', **kwargs):
        # Implement file opening; open() calls this hook
        pass

    def ls(self, path, detail=True, **kwargs):
        # Implement directory listing
        pass
```
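To make the skeleton concrete, here is a toy read-only filesystem backed by a flat dict. It is a sketch under stated assumptions, not a recommended implementation: the class name `DictFileSystem`, the protocol name `dictfs`, and the `files` argument are all hypothetical, and there is no directory support.

```python
import io

import fsspec


class DictFileSystem(fsspec.AbstractFileSystem):
    """Read-only filesystem over a flat dict of name -> bytes (illustrative)."""

    protocol = "dictfs"  # hypothetical protocol name
    cachable = False     # skip instance caching for this toy class

    def __init__(self, files, **kwargs):
        super().__init__(**kwargs)
        self.files = dict(files)

    def ls(self, path, detail=True, **kwargs):
        path = self._strip_protocol(path).strip("/")
        if path == "":
            names = sorted(self.files)  # list the root
        elif path in self.files:
            names = [path]              # listing a file returns itself
        else:
            raise FileNotFoundError(path)
        entries = [
            {"name": n, "size": len(self.files[n]), "type": "file"}
            for n in names
        ]
        return entries if detail else [e["name"] for e in entries]

    def _open(self, path, mode="rb", **kwargs):
        if mode != "rb":
            raise NotImplementedError("this sketch is read-only")
        key = self._strip_protocol(path).strip("/")
        if key not in self.files:
            raise FileNotFoundError(path)
        return io.BytesIO(self.files[key])


fs = DictFileSystem({"a.txt": b"abc", "b.txt": b"hello"})
listing = fs.ls("", detail=False)
data = fs.cat_file("a.txt")
b_size = fs.info("b.txt")["size"]
```

With only `ls` and `_open` defined, inherited methods such as `cat_file`, `info`, and `exists` already work, because the base class builds them on top of those two hooks.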