
# Filesystem Registry


Plugin system for registering, discovering, and instantiating filesystem implementations. The registry enables dynamic loading of storage backend drivers and provides centralized access to available protocols through a consistent interface.

## Capabilities

### Filesystem Instantiation

Creates filesystem instances by protocol name with storage-specific options. The primary way to get a filesystem object for direct manipulation.

```python { .api }
def filesystem(protocol, **storage_options):
    """
    Create a filesystem instance for the given protocol.

    Parameters:
    - protocol: str, protocol name ('s3', 'gcs', 'local', 'http', etc.)
    - **storage_options: keyword arguments passed to filesystem constructor

    Returns:
    AbstractFileSystem instance
    """
```

Usage example:

```python
import fsspec

# Create S3 filesystem
s3 = fsspec.filesystem('s3', key='ACCESS_KEY', secret='SECRET_KEY')
files = s3.ls('bucket-name/')

# Create local filesystem
local = fsspec.filesystem('file')
local.mkdir('/tmp/new_directory')

# Create HTTP filesystem
http = fsspec.filesystem('http')
content = http.cat('https://example.com/data.json')
```
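For a quick check that needs no credentials or network access, the same call can be exercised against the built-in `memory` protocol. This is a minimal sketch; the path `/registry-demo/hello.txt` is an arbitrary name chosen for illustration.

```python
import fsspec

# The in-memory backend ships with fsspec itself, so no extra package is needed.
mem = fsspec.filesystem('memory')

# Write a small payload, then read it back through the same instance.
mem.pipe('/registry-demo/hello.txt', b'hello registry')
content = mem.cat('/registry-demo/hello.txt')
print(content)

# Remove the demo file again.
mem.rm('/registry-demo/hello.txt')
```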

### Filesystem Class Resolution

Retrieves the filesystem class without instantiating it. Useful for inspection, subclassing, or custom instantiation patterns.

```python { .api }
def get_filesystem_class(protocol):
    """
    Get the filesystem class for a protocol.

    Parameters:
    - protocol: str, protocol name

    Returns:
    type, AbstractFileSystem subclass
    """
```

Usage example:

```python
import fsspec

# Get S3 filesystem class
S3FileSystem = fsspec.get_filesystem_class('s3')

# Check available methods
print(dir(S3FileSystem))

# Custom instantiation
s3 = S3FileSystem(key='...', secret='...', client_kwargs={'region_name': 'us-west-2'})
```
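The relationship between `get_filesystem_class` and `filesystem` can be checked without any optional dependency by resolving the built-in `memory` protocol. A small sketch of that inspection pattern (the variable names are illustrative):

```python
import fsspec

# Resolve the class for the built-in in-memory backend without instantiating it.
MemoryFS = fsspec.get_filesystem_class('memory')

# The returned object is a class, and a subclass of AbstractFileSystem.
is_fs_class = issubclass(MemoryFS, fsspec.AbstractFileSystem)

# Instances created through fsspec.filesystem() are instances of that class.
same_type = isinstance(fsspec.filesystem('memory'), MemoryFS)
print(MemoryFS.__name__, is_fs_class, same_type)
```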

### Protocol Registration

Registers new filesystem implementations, enabling plugin-style extensions. Allows third-party packages to integrate with fsspec's unified interface.

```python { .api }
def register_implementation(name, cls, clobber=False, errtxt=None):
    """
    Register a filesystem implementation.

    Parameters:
    - name: str, protocol name
    - cls: str or type, filesystem class or import path
    - clobber: bool, whether to overwrite existing registrations
    - errtxt: str, error message for import failures
    """
```

Usage example:

```python
import fsspec

# Register a custom filesystem
class MyCustomFS(fsspec.AbstractFileSystem):
    protocol = 'custom'

    def _open(self, path, mode='rb', **kwargs):
        # Custom implementation
        pass

fsspec.register_implementation('custom', MyCustomFS)

# Register by import path
fsspec.register_implementation(
    'myprotocol',
    'mypackage.MyFileSystem',
    errtxt='Please install mypackage for myprotocol support'
)
```
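The effect of `clobber` is easiest to see with two classes competing for one name. The sketch below assumes the behavior of recent fsspec releases, where a conflicting registration without `clobber=True` raises `ValueError`; the protocol name `clobber-demo` and both classes are hypothetical.

```python
import fsspec
from fsspec.spec import AbstractFileSystem

class FirstFS(AbstractFileSystem):
    protocol = 'clobber-demo'  # hypothetical protocol name

class SecondFS(AbstractFileSystem):
    protocol = 'clobber-demo'

# clobber=True here only makes the example safe to re-run.
fsspec.register_implementation('clobber-demo', FirstFS, clobber=True)

# A different class under the same name is rejected by default...
try:
    fsspec.register_implementation('clobber-demo', SecondFS)
    rejected = False
except ValueError:
    rejected = True

# ...and accepted once clobber=True is passed.
fsspec.register_implementation('clobber-demo', SecondFS, clobber=True)
resolved = fsspec.get_filesystem_class('clobber-demo')
print(rejected, resolved.__name__)
```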

### Available Protocols

Lists all known protocols, including both built-in and third-party implementations. Useful for discovering available storage backends. A listed protocol may still require an optional package to be installed before it can be instantiated.

```python { .api }
def available_protocols():
    """
    List all available protocol names.

    Returns:
    list of str, protocol names
    """
```

Usage example:

```python
import fsspec

# See all available protocols
protocols = fsspec.available_protocols()
print(protocols)
# ['file', 'local', 's3', 'gcs', 'http', 'https', 'ftp', 'sftp', ...]

# Check if a protocol is known (its backing package may still need installing)
if 's3' in fsspec.available_protocols():
    s3 = fsspec.filesystem('s3')
```
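Because a protocol name can be listed while its backing package is absent, a stricter check is to try resolving the class. The helper below is a sketch, not part of fsspec: `protocol_is_usable` is a hypothetical name, and it assumes that `get_filesystem_class` raises `ImportError` for a missing dependency and `ValueError` for an unknown protocol, as recent releases do.

```python
import fsspec

def protocol_is_usable(protocol):
    """Return True if the class for `protocol` can actually be imported."""
    try:
        fsspec.get_filesystem_class(protocol)
    except (ImportError, ValueError):
        # ImportError: known protocol, backing package missing.
        # ValueError: protocol name not known to fsspec at all.
        return False
    return True

candidates = ('memory', 'file', 'no-such-protocol')
usable = [p for p in candidates if protocol_is_usable(p)]
print(usable)
```

Resolving the class imports the backend module, so this check costs more than a membership test on `available_protocols()`.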

## Built-in Protocol Implementations

### Local Filesystems

- **file**, **local**: Local filesystem access
- **memory**: In-memory filesystem for testing

### Cloud Storage

- **s3**: Amazon S3 (requires s3fs)
- **gcs**, **gs**: Google Cloud Storage (requires gcsfs)
- **az**, **abfs**: Azure Blob Storage (requires adlfs)
- **adl**: Azure Data Lake Gen1 (requires adlfs)
- **oci**: Oracle Cloud Infrastructure (requires ocifs)

### Network Protocols

- **http**, **https**: HTTP/HTTPS access
- **ftp**: FTP protocol
- **sftp**, **ssh**: SSH/SFTP access (requires paramiko)
- **smb**: SMB/CIFS network shares (requires smbprotocol)
- **webdav**: WebDAV protocol

### Archive Formats

- **zip**: ZIP archive access
- **tar**: TAR archive access
- **libarchive**: Multiple archive formats (requires libarchive-c)

### Specialized

- **cached**: Caching wrapper for other filesystems
- **reference**: Reference filesystem for Zarr/Kerchunk
- **dask**: Integration with Dask distributed computing
- **git**: Git repository access (requires pygit2)
- **github**: GitHub repository access (requires requests)
- **jupyter**: Jupyter server filesystem access

### Database/Analytics

- **hdfs**: Hadoop Distributed File System (requires pyarrow)
- **webhdfs**: HDFS access over the WebHDFS REST API (requires requests)
- **arrow_hdfs**: Arrow-based HDFS access (requires pyarrow)

## Registry Constants and Variables

```python { .api }
registry: dict
"""Read-only mapping of protocol names to filesystem classes"""

known_implementations: dict
"""Mapping of protocol names to import specifications"""

default: str
"""Default protocol name ('file')"""
```

Usage example:

```python
import fsspec
# known_implementations is defined in the fsspec.registry module
from fsspec.registry import known_implementations

# Inspect registry
print(fsspec.registry.keys())

# Check if protocol is known but not loaded
if 's3' in known_implementations:
    # Will trigger import and registration
    s3_fs = fsspec.filesystem('s3')
```
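The three names can be inspected together using only the built-in `memory` protocol. This sketch assumes the layout used by recent fsspec releases, where each `known_implementations` entry is a dict with a `class` import path, and it imports the names from the `fsspec.registry` module directly.

```python
import fsspec
from fsspec.registry import default, known_implementations

# Each known_implementations entry records where the class lives.
spec = known_implementations['memory']
print(default, spec['class'])

# Loading a class adds it to the registry view...
fsspec.get_filesystem_class('memory')
loaded = 'memory' in fsspec.registry

# ...but the view itself rejects direct assignment.
try:
    fsspec.registry['demo'] = object
    writable = True
except TypeError:
    writable = False
print(loaded, writable)
```

New entries should go through `register_implementation` rather than direct mutation.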

## Usage Patterns

### Dynamic Protocol Loading

fsspec uses lazy loading for optional dependencies:

```python
import fsspec

# These will only import the required package when first used
s3 = fsspec.filesystem('s3')    # Imports s3fs
gcs = fsspec.filesystem('gcs')  # Imports gcsfs
```
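Lazy loading also means that a registration by import path is not validated until the protocol is first used. The sketch below illustrates this with a deliberately missing package; `missing-demo` and `package_that_does_not_exist.DemoFileSystem` are hypothetical names, and the example assumes the import failure surfaces as an `ImportError` carrying `errtxt`.

```python
import fsspec

# Registering by import path does not import anything yet, so this succeeds
# even though the (hypothetical) package is not installed.
fsspec.register_implementation(
    'missing-demo',
    'package_that_does_not_exist.DemoFileSystem',
    clobber=True,  # keeps the example safe to re-run
    errtxt='Install package_that_does_not_exist for missing-demo support',
)

# The import is attempted only on first use, and the failure carries errtxt.
try:
    fsspec.filesystem('missing-demo')
    message = None
except ImportError as exc:
    message = str(exc)
print(message)
```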

### Custom Filesystem Integration

Creating and registering custom filesystems:

```python
import fsspec
from fsspec.spec import AbstractFileSystem

class DatabaseFS(AbstractFileSystem):
    protocol = 'db'

    def __init__(self, connection_string, **kwargs):
        super().__init__(**kwargs)
        self.connection_string = connection_string

    def _open(self, path, mode='rb', **kwargs):
        # Implement database table/query access
        pass

    def ls(self, path, detail=True, **kwargs):
        # List tables/views
        pass

# Register the custom filesystem
fsspec.register_implementation('db', DatabaseFS)

# Use it like any other filesystem
db = fsspec.filesystem('db', connection_string='postgresql://...')
```
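To see the same pattern with the method bodies filled in, here is a deliberately small, read-only filesystem backed by a flat dict. It is a sketch under stated assumptions rather than a recommended backend: `DictFS`, the `dictdemo` protocol, and the `files` argument are hypothetical, and `ls` ignores its `path` argument because the namespace is flat.

```python
import io

import fsspec
from fsspec.spec import AbstractFileSystem

class DictFS(AbstractFileSystem):
    """Read-only filesystem over a flat {name: bytes} dict (illustrative)."""
    protocol = 'dictdemo'  # hypothetical protocol name

    def __init__(self, files, **kwargs):
        super().__init__(**kwargs)
        self.files = dict(files)

    def ls(self, path, detail=True, **kwargs):
        # Flat namespace: every file is listed regardless of `path`.
        entries = [
            {'name': name, 'size': len(data), 'type': 'file'}
            for name, data in sorted(self.files.items())
        ]
        return entries if detail else [e['name'] for e in entries]

    def _open(self, path, mode='rb', **kwargs):
        if mode != 'rb':
            raise NotImplementedError('DictFS is read-only')
        try:
            return io.BytesIO(self.files[path])
        except KeyError:
            raise FileNotFoundError(path) from None

# clobber=True only keeps the example safe to re-run.
fsspec.register_implementation('dictdemo', DictFS, clobber=True)

fs = fsspec.filesystem('dictdemo', files={'a.txt': b'alpha', 'b.txt': b'beta'})
names = fs.ls('', detail=False)
with fs.open('a.txt') as f:
    data = f.read()
print(names, data)
```

Only `ls` and `_open` are implemented here; a real backend would typically also provide `info` and the write-side methods it supports.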

### Protocol Discovery

Checking available protocols at runtime:

```python
import fsspec

def get_cloud_protocols():
    """Get all available cloud storage protocols"""
    all_protocols = fsspec.available_protocols()
    cloud_protocols = [p for p in all_protocols
                       if p in ['s3', 'gcs', 'gs', 'az', 'abfs', 'adl', 'oci']]
    return cloud_protocols
```
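Discovery can also work from `known_implementations` rather than a hard-coded list, for example to see which package backs each protocol. The helper below is a sketch: `protocols_by_package` is a hypothetical name, and it assumes each entry stores its import path under a `class` key, as recent fsspec releases do.

```python
from collections import defaultdict

from fsspec.registry import known_implementations

def protocols_by_package():
    """Group known protocol names by the top-level package that provides them."""
    groups = defaultdict(list)
    for protocol, spec in known_implementations.items():
        # 'class' is an import path such as 's3fs.S3FileSystem'.
        package = spec['class'].replace(':', '.').split('.', 1)[0]
        groups[package].append(protocol)
    return dict(groups)

groups = protocols_by_package()
print(sorted(groups))
```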