# Core File Operations

Essential file and directory operations that provide the primary interface for interacting with files across all supported storage backends. These functions handle URL parsing, protocol resolution, and file opening with support for compression, encoding, and various access patterns.

## Capabilities

### File Opening

Opens a single file with automatic protocol detection, compression handling, and encoding support. Returns an OpenFile object that, when used as a context manager, yields a file-like object.

```python { .api }
def open(urlpath, mode='rb', compression=None, encoding='utf8', errors=None, protocol=None, newline=None, expand=None, **kwargs):
    """
    Open a file for reading or writing.

    Parameters:
    - urlpath: str, URL or path to file (supports all registered protocols)
    - mode: str, file opening mode ('r', 'w', 'a', 'rb', 'wb', etc.)
    - compression: str or None, compression format ('gzip', 'bz2', 'lzma', etc.)
    - encoding: str, text encoding for text mode (default 'utf8')
    - errors: str or None, error handling mode for text encoding
    - protocol: str or None, force specific protocol
    - newline: str or None, newline handling for text mode
    - expand: bool or None, expand glob patterns in paths
    - **kwargs: additional options passed to filesystem

    Returns:
    OpenFile object (context manager)
    """
```

Usage example:

```python
import json

import fsspec

# Open remote file with compression
with fsspec.open('s3://bucket/data.txt.gz', 'rt', compression='gzip') as f:
    content = f.read()

# Open local file
with fsspec.open('/path/to/file.json', 'r') as f:
    data = json.load(f)
```
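
Because the return value is an OpenFile rather than a raw handle, it is lightweight and picklable, and the real open happens only on entry. A minimal sketch of the explicit alternative to the with-block, assuming the `OpenFile.open()` behavior found in fsspec, where the caller must close the handle:

```python
import fsspec

# OpenFile is lazy and serializable; no connection is made at this point
of = fsspec.open('s3://bucket/data.txt.gz', 'rt', compression='gzip')

f = of.open()  # explicit alternative to entering the context manager
try:
    content = f.read()
finally:
    f.close()  # with .open(), closing is the caller's responsibility
```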

### Multiple File Opening

Opens multiple files in a single call, supporting glob patterns and serializable handles suitable for parallel access. Useful for batch processing of file collections.

```python { .api }
def open_files(urlpath, mode='rb', compression=None, encoding='utf8', errors=None, name_function=None, num=1, protocol=None, newline=None, auto_mkdir=True, expand=True, **kwargs):
    """
    Open multiple files for reading or writing.

    Parameters:
    - urlpath: str or list, URL pattern or list of URLs
    - mode: str, file opening mode
    - compression: str or None, compression format
    - encoding: str, text encoding for text mode
    - errors: str or None, error handling mode for text encoding
    - name_function: callable, function to generate filenames for num > 1
    - num: int, number of files to create for write operations
    - protocol: str or None, force specific protocol
    - newline: str or None, newline handling for text mode
    - auto_mkdir: bool, automatically create parent directories
    - expand: bool, expand glob patterns in paths
    - **kwargs: additional options passed to filesystem

    Returns:
    List of OpenFile objects
    """
```

Usage example:

```python
import fsspec
import pandas as pd

# Open multiple files matching a glob pattern
files = fsspec.open_files('s3://bucket/data/*.csv', 'rt')
for f in files:
    with f as file:
        df = pd.read_csv(file)

# Create multiple output files
outputs = fsspec.open_files('output-*.json', 'w', num=4)
```
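
When writing with `num > 1`, the `*` in the pattern is replaced per file, and `name_function` controls the substitution (fsspec's default is a zero-padded integer index). A sketch with a hypothetical naming scheme:

```python
import fsspec

# Hypothetical scheme: replace the '*' with 'part-<i>' for each output
outputs = fsspec.open_files(
    'output-*.json', 'w', num=3,
    name_function=lambda i: f'part-{i}',
)
# Expected targets: output-part-0.json, output-part-1.json, output-part-2.json
```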

### Local File Access

Ensures files are available locally, downloading remote files to temporary locations if necessary. Returns the local path for direct access.

```python { .api }
def open_local(url, mode='rb', **kwargs):
    """
    Open a file ensuring it's available locally.

    Parameters:
    - url: str, URL or path to file
    - mode: str, file opening mode
    - **kwargs: additional options passed to filesystem

    Returns:
    str, local file path
    """
```

Usage example:

```python
import pickle

import fsspec

# Ensure the remote file is available locally; fsspec requires a
# locally-materializing protocol such as simplecache:: for remote URLs
local_path = fsspec.open_local('simplecache::s3://bucket/model.pkl')
with open(local_path, 'rb') as f:
    model = pickle.load(f)
```
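
A caveat: in the fsspec releases we are aware of, `open_local` only works for filesystems that expose real local paths, which is why remote URLs are wrapped in a caching protocol such as `simplecache::` above. Globs are also accepted, in which case a list of local paths comes back; a sketch under that assumption:

```python
import fsspec

# Assumption: simplecache:: downloads each matching remote file to a
# local temporary directory and returns the local paths
local_paths = fsspec.open_local('simplecache::s3://bucket/data/*.csv')
for p in local_paths:
    print(p)
```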

### URL to Filesystem Resolution

Parses URLs to extract the appropriate filesystem instance and normalized path. Core function for protocol resolution and filesystem instantiation.

```python { .api }
def url_to_fs(url, **kwargs):
    """
    Parse URL and return filesystem instance and path.

    Parameters:
    - url: str, URL to parse
    - **kwargs: storage options passed to filesystem constructor

    Returns:
    tuple: (AbstractFileSystem instance, str path)
    """
```

Usage example:

```python
import fsspec

# Parse S3 URL
fs, path = fsspec.url_to_fs('s3://bucket/path/file.txt', key='...', secret='...')
files = fs.ls(path.rsplit('/', 1)[0])  # list the containing directory

# Parse HTTP URL
fs, path = fsspec.url_to_fs('https://example.com/data.csv')
content = fs.cat_file(path)
```
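
`url_to_fs` also understands fsspec's chained-URL syntax, in which protocols compose from left to right. A minimal sketch, assuming `simplecache::` wrapping as in fsspec:

```python
import fsspec

# The outermost protocol wins: this resolves to a caching filesystem
# that proxies reads through to S3
fs, path = fsspec.url_to_fs('simplecache::s3://bucket/path/file.txt')
print(type(fs).__name__, path)
```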

### Multiple URL Processing

Processes multiple URLs and paths, returning a single filesystem instance, a deterministic token, and a list of paths. All inputs must share the same storage backend, which lets one filesystem instance serve every path.

```python { .api }
def get_fs_token_paths(urls, mode='rb', num=1, name_function=None, **kwargs):
    """
    Parse multiple URLs and return filesystem with paths.

    Parameters:
    - urls: str or list, URLs or paths to process
    - mode: str, file opening mode
    - num: int, number of files to create for write operations
    - name_function: callable, function to generate filenames
    - **kwargs: storage options passed to filesystem constructor

    Returns:
    tuple: (AbstractFileSystem instance, str token, list of paths)
    """
```

Usage example:

```python
import fsspec

# Process multiple S3 files
fs, token, paths = fsspec.get_fs_token_paths([
    's3://bucket/file1.txt',
    's3://bucket/file2.txt',
], key='...', secret='...')

# Read all files
contents = [fs.cat_file(path) for path in paths]
```
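
The middle element of the returned tuple is a deterministic token, a hash of the filesystem and target paths; libraries such as Dask use it as a task/cache key. A sketch of that idea:

```python
import fsspec

fs, token, paths = fsspec.get_fs_token_paths('data/*.csv')

# Reuse the token as a memoization key: the same URLs and storage
# options produce the same token across runs
cache = {}
if token not in cache:
    cache[token] = [fs.cat_file(p) for p in paths]
```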

## Usage Patterns

### Context Manager Pattern

The preferred way to work with fsspec files:

```python
import fsspec

with fsspec.open('protocol://path/file.ext', 'r') as f:
    data = f.read()
```

### Batch Processing

Processing multiple files efficiently:

```python
import fsspec
import pandas as pd

files = fsspec.open_files('s3://bucket/data/*.parquet')
datasets = []
for f in files:
    with f as file:
        datasets.append(pd.read_parquet(file))
```

### Protocol Auto-Detection

fsspec automatically detects protocols from URLs:

```python
import fsspec

# These all work with the same interface
fsspec.open('file:///local/path.txt')   # local filesystem
fsspec.open('/local/path.txt')          # local filesystem (implicit)
fsspec.open('s3://bucket/file.txt')     # Amazon S3
fsspec.open('gcs://bucket/file.txt')    # Google Cloud Storage
fsspec.open('https://example.com/api')  # HTTP
```
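
The protocol table itself is introspectable (see registry.md). A sketch using fsspec's `known_implementations` mapping; note that an entry in the table does not guarantee the backing driver is installed:

```python
from fsspec.registry import known_implementations

# Map of protocol name -> implementation metadata; drivers such as
# s3fs or gcsfs may still need to be installed separately
print(sorted(known_implementations))
```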

### Compression Handling

Compression can be inferred from the file extension or specified explicitly:

```python
import fsspec

# Infer compression from the file extension
with fsspec.open('data.csv.gz', 'rt', compression='infer') as f:
    content = f.read()

# Explicit compression
with fsspec.open('data.csv', 'rt', compression='gzip') as f:
    content = f.read()
```
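
Which compression names are accepted depends on what is installed. A sketch that inspects fsspec's internal `compr` registry; the assumption is that this internal layout matches recent fsspec versions:

```python
from fsspec.compression import compr

# Registered compression codecs; the None key means "no compression"
print(sorted(name for name in compr if name is not None))
```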