Unified pythonic interface for diverse file systems and storage backends
Plugin system for registering, discovering, and instantiating filesystem implementations. The registry enables dynamic loading of storage backend drivers and provides centralized access to available protocols through a consistent interface.
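The general idea of a registry that loads backend drivers on demand can be sketched with the standard library alone. This is only an illustration of the pattern, not fsspec's actual code: the names `known`, `loaded`, and `get_class` are hypothetical, and `json.JSONDecoder` stands in for a real filesystem class so the sketch runs without extra packages.

```python
import importlib

# Hypothetical names throughout: this mirrors the idea, not fsspec's internals.
# Each entry maps a name to an import path plus a message for import failures.
known = {
    "json": {"class": "json.JSONDecoder", "err": "json support is unavailable"},
    "missing": {"class": "not_installed_pkg.Thing", "err": "Install not_installed_pkg"},
}
loaded = {}  # name -> class, filled in on first use


def get_class(name):
    """Return the class registered under `name`, importing it on first use."""
    if name not in loaded:
        if name not in known:
            raise ValueError(f"Unknown protocol: {name}")
        module_path, _, class_name = known[name]["class"].rpartition(".")
        try:
            module = importlib.import_module(module_path)
        except ImportError as exc:
            # Surface the registered hint instead of the raw import error.
            raise ImportError(known[name]["err"]) from exc
        loaded[name] = getattr(module, class_name)
    return loaded[name]


decoder_cls = get_class("json")
print(decoder_cls.__name__)
```

Nothing is imported until a name is first requested, which is why an entry for a package that is not installed only fails when it is actually used.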
Creates filesystem instances by protocol name with storage-specific options. The primary way to get a filesystem object for direct manipulation.
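For a variant that runs without credentials or network access, the sketch below uses the built-in `'memory'` protocol. The path `/demo/hello.txt` is an arbitrary name chosen for illustration.

```python
import fsspec

# 'memory' is fsspec's built-in in-process filesystem; nothing touches disk.
fs = fsspec.filesystem("memory")

# Write a small file, then read it back through the same interface.
fs.pipe("/demo/hello.txt", b"hello world")
content = fs.cat("/demo/hello.txt")
names = fs.ls("/demo", detail=False)

print(content)
print(names)
```

The same calls (`pipe`, `cat`, `ls`) apply to other backends, which is the point of going through `filesystem()` rather than a backend-specific client.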
def filesystem(protocol, **storage_options):
    """
    Create a filesystem instance for the given protocol.

    Parameters:
    - protocol: str, protocol name ('s3', 'gcs', 'local', 'http', etc.)
    - **storage_options: keyword arguments passed to filesystem constructor

    Returns:
    AbstractFileSystem instance
    """

Usage example:

# Create S3 filesystem
s3 = fsspec.filesystem('s3', key='ACCESS_KEY', secret='SECRET_KEY')
files = s3.ls('bucket-name/')

# Create local filesystem
local = fsspec.filesystem('file')
local.mkdir('/tmp/new_directory')

# Create HTTP filesystem
http = fsspec.filesystem('http')
content = http.cat('https://example.com/data.json')

Retrieves the filesystem class without instantiating it. Useful for inspection, subclassing, or custom instantiation patterns.

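As a small sketch of the inspection use case, the snippet below looks up the class behind the built-in `'memory'` protocol, which needs no optional packages, and only creates an instance afterwards.

```python
import fsspec
from fsspec.spec import AbstractFileSystem

# Look up the class only; no instance is created at this point.
MemFS = fsspec.get_filesystem_class("memory")

print(MemFS.__name__)
print(issubclass(MemFS, AbstractFileSystem))

# Instantiate later, when (and if) an instance is actually needed.
fs = MemFS()
```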
def get_filesystem_class(protocol):
    """
    Get the filesystem class for a protocol.

    Parameters:
    - protocol: str, protocol name

    Returns:
    type, AbstractFileSystem subclass
    """

Usage example:

# Get S3 filesystem class
S3FileSystem = fsspec.get_filesystem_class('s3')

# Check available methods
print(dir(S3FileSystem))

# Custom instantiation
s3 = S3FileSystem(key='...', secret='...', client_kwargs={'region_name': 'us-west-2'})

Registers new filesystem implementations, enabling plugin-style extensions. Allows third-party packages to integrate with fsspec's unified interface.

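To make the plugin idea concrete, here is a minimal sketch of a working custom backend. Everything specific to it is an assumption made for illustration: the class `DictFS`, the protocol name `'dictdemo'`, and the `files` option are hypothetical, and the filesystem is read-only and backed by a plain dict.

```python
import io

import fsspec
from fsspec.spec import AbstractFileSystem


class DictFS(AbstractFileSystem):
    """Hypothetical read-only filesystem backed by an in-memory dict."""

    protocol = "dictdemo"  # hypothetical protocol name, chosen for this sketch

    def __init__(self, files=None, **kwargs):
        super().__init__(**kwargs)
        self.files = dict(files or {})  # maps path -> bytes

    def ls(self, path, detail=True, **kwargs):
        # Treat every stored key under the requested prefix as a file.
        prefix = path.strip("/")
        entries = [
            {"name": name, "size": len(data), "type": "file"}
            for name, data in sorted(self.files.items())
            if name.startswith(prefix)
        ]
        return entries if detail else [entry["name"] for entry in entries]

    def _open(self, path, mode="rb", **kwargs):
        if mode != "rb":
            raise NotImplementedError("this sketch only supports reading")
        return io.BytesIO(self.files[path.strip("/")])


# clobber=True keeps the snippet re-runnable in the same Python session.
fsspec.register_implementation("dictdemo", DictFS, clobber=True)

fs = fsspec.filesystem("dictdemo", files={"greeting.txt": b"hello"})
with fs.open("greeting.txt", "rb") as f:
    data = f.read()
names = fs.ls("", detail=False)
```

Once registered, the class is reachable through `fsspec.filesystem()` like any built-in protocol, and keyword arguments such as `files` are forwarded to its constructor.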
def register_implementation(name, cls, clobber=False, errtxt=None):
    """
    Register a filesystem implementation.

    Parameters:
    - name: str, protocol name
    - cls: str or type, filesystem class or import path
    - clobber: bool, whether to overwrite existing registrations
    - errtxt: str, error message for import failures
    """

Usage example:

# Register a custom filesystem
class MyCustomFS(fsspec.AbstractFileSystem):
    protocol = 'custom'

    def _open(self, path, mode='rb', **kwargs):
        # Custom implementation
        pass

fsspec.register_implementation('custom', MyCustomFS)

# Register by import path
fsspec.register_implementation(
    'myprotocol',
    'mypackage.MyFileSystem',
    errtxt='Please install mypackage for myprotocol support'
)

Lists all registered protocols, including both built-in and third-party implementations. Useful for discovering available storage backends.

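A listed protocol is not necessarily usable, because its backing package may not be installed. The sketch below separates the two questions; the helper `can_load` is a hypothetical name introduced here, and it relies on `get_filesystem_class` raising `ImportError` or `ValueError` when a backend cannot be resolved.

```python
import fsspec

protocols = fsspec.available_protocols()

# 'file' and 'memory' ship with fsspec itself, so both should be listed.
builtin_present = all(p in protocols for p in ("file", "memory"))
print(builtin_present)


def can_load(protocol):
    """Return True if the backend class for `protocol` can actually be imported."""
    try:
        fsspec.get_filesystem_class(protocol)
        return True
    except (ImportError, ValueError):
        return False


print(can_load("memory"))
print(can_load("no-such-protocol"))
```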
def available_protocols():
    """
    List all available protocol names.

    Returns:
    list of str, protocol names
    """

Usage example:

# See all available protocols
protocols = fsspec.available_protocols()
print(protocols)
# ['file', 'local', 's3', 'gcs', 'http', 'https', 'ftp', 'sftp', ...]

# Check if a protocol is known to fsspec
# (its backing package, such as s3fs, may still need to be installed)
if 's3' in fsspec.available_protocols():
    s3 = fsspec.filesystem('s3')

registry: dict
"""Read-only mapping of protocol names to filesystem classes"""

known_implementations: dict
"""Mapping of protocol names to import specifications"""

default: str
"""Default protocol name ('file')"""

Usage example:

import fsspec
# known_implementations is defined in the fsspec.registry module
from fsspec.registry import known_implementations

# Inspect registry
print(fsspec.registry.keys())

# Check if protocol is known but not loaded
if 's3' in known_implementations and 's3' not in fsspec.registry:
    # Will trigger import and registration
    s3_fs = fsspec.filesystem('s3')

fsspec uses lazy loading for optional dependencies:

# These will only import the required package when first used
s3 = fsspec.filesystem('s3')    # Imports s3fs
gcs = fsspec.filesystem('gcs')  # Imports gcsfs

Creating and registering custom filesystems:

import fsspec
from fsspec.spec import AbstractFileSystem

class DatabaseFS(AbstractFileSystem):
    protocol = 'db'

    def __init__(self, connection_string, **kwargs):
        super().__init__(**kwargs)
        self.connection_string = connection_string

    def _open(self, path, mode='rb', **kwargs):
        # Implement database table/query access
        pass

    def ls(self, path, detail=True, **kwargs):
        # List tables/views
        pass

# Register the custom filesystem
fsspec.register_implementation('db', DatabaseFS)

# Use it like any other filesystem
db = fsspec.filesystem('db', connection_string='postgresql://...')

Checking available protocols at runtime:

def get_cloud_protocols():
    """Get all available cloud storage protocols"""
    all_protocols = fsspec.available_protocols()
    cloud_protocols = [p for p in all_protocols
                       if p in ['s3', 'gcs', 'gs', 'az', 'abfs', 'adl', 'oci']]
    return cloud_protocols

Install with Tessl CLI

npx tessl i tessl/pypi-fsspec