
# Loaders and Dumpers

PyYAML provides a set of loader and dumper classes with different security levels and performance characteristics. Choose a loader or dumper based on how much you trust the input and how much throughput you need.

## Capabilities

### Loader Classes

Loader classes differ in which YAML tags they are willing to construct, and therefore in how safe they are for a given input.

```python { .api }
class BaseLoader(Reader, Scanner, Parser, Composer, BaseConstructor, BaseResolver):
    """
    Base loader with minimal functionality.

    Parses YAML without implicit type resolution.
    Every scalar is loaded as a str; only str, list, and dict are constructed.
    """

class SafeLoader(Reader, Scanner, Parser, Composer, SafeConstructor, Resolver):
    """
    Safe loader for untrusted input.

    Constructs only standard YAML types (scalars, sequences, mappings, timestamps).
    Cannot execute arbitrary Python code or access dangerous functionality.
    Recommended for processing YAML from untrusted sources.
    """

class FullLoader(Reader, Scanner, Parser, Composer, FullConstructor, Resolver):
    """
    Full loader with security restrictions.

    Constructs most YAML types but prevents known dangerous operations.
    Good balance between functionality and security.
    Intended for trusted or semi-trusted input; prefer SafeLoader for untrusted input.
    """

class Loader(Reader, Scanner, Parser, Composer, Constructor, Resolver):
    """
    Full-featured loader without security restrictions.

    Can construct arbitrary Python objects and execute Python code.
    Provides complete YAML functionality but is unsafe for untrusted input.
    Identical to UnsafeLoader.
    """

class UnsafeLoader(Reader, Scanner, Parser, Composer, Constructor, Resolver):
    """
    Explicitly unsafe loader.

    Identical to Loader but with a name that clearly indicates the security risk.
    Can execute arbitrary Python code during loading.
    Only use with completely trusted input.
    """
```
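As a minimal sketch of how the loader choice changes the result, the snippet below loads the same document with `BaseLoader` and `SafeLoader`. It assumes PyYAML is importable as `yaml`; the document string and variable names are illustrative only.

```python
import yaml

doc = "count: 3\nenabled: true\nratio: 0.5\n"

# BaseLoader performs no implicit type resolution, so every scalar stays a string.
base = yaml.load(doc, Loader=yaml.BaseLoader)

# SafeLoader resolves the standard YAML scalar types (int, bool, float, ...).
safe = yaml.load(doc, Loader=yaml.SafeLoader)

print(base)
print(safe)
```

`BaseLoader` can be handy when the application wants to do its own type conversion, since nothing is coerced on the way in.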

### C Extension Loaders

High-performance C-based loaders, available when PyYAML is built with LibYAML support:

```python { .api }
class CBaseLoader:
    """C-based BaseLoader implementation."""

class CSafeLoader:
    """C-based SafeLoader implementation."""

class CFullLoader:
    """C-based FullLoader implementation."""

class CLoader:
    """C-based Loader implementation."""

class CUnsafeLoader:
    """C-based UnsafeLoader implementation."""
```
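Because the C classes exist only when the LibYAML bindings were built, code that wants them usually needs a fallback. One common pattern, sketched here with the hypothetical alias `PreferredSafeLoader`, is to try the import and fall back to the pure-Python class:

```python
import yaml

# Prefer the C implementation, but fall back to pure Python if it is missing.
try:
    from yaml import CSafeLoader as PreferredSafeLoader
except ImportError:
    from yaml import SafeLoader as PreferredSafeLoader

config = yaml.load("retries: 3\nhosts: [a, b]\n", Loader=PreferredSafeLoader)
print(PreferredSafeLoader.__name__, config)
```

The rest of the program can then pass `PreferredSafeLoader` everywhere without caring which implementation was picked.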

### Dumper Classes

Dumper classes differ in which Python types they can represent and in how portable their output is.

```python { .api }
class BaseDumper(Emitter, Serializer, BaseRepresenter, BaseResolver):
    """
    Base dumper with minimal functionality.

    Registers no type-specific representers of its own.
    Mainly useful as a base class for custom dumpers.
    """

class SafeDumper(Emitter, Serializer, SafeRepresenter, Resolver):
    """
    Safe dumper producing basic YAML output.

    Represents only basic Python types and standard scalars.
    Output uses only standard YAML tags, so other YAML parsers can read it.
    Recommended for configuration files and data exchange.
    """

class Dumper(Emitter, Serializer, Representer, Resolver):
    """
    Full-featured dumper with Python object support.

    Can represent arbitrary Python objects using Python-specific YAML tags.
    Output may not be readable by non-Python YAML parsers.
    Use when preserving exact Python object types is important.
    """
```
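To make the difference concrete, here is a small sketch that dumps a tuple with both `Dumper` and `SafeDumper`, then tries to dump an arbitrary object with `SafeDumper`. The `Opaque` class and the variable names are hypothetical and exist only for this illustration.

```python
import yaml

record = {"point": (1, 2)}

# Dumper keeps the exact Python type by emitting a Python-specific tag.
full_text = yaml.dump(record, Dumper=yaml.Dumper)

# SafeDumper writes the tuple as a plain YAML sequence.
safe_text = yaml.dump(record, Dumper=yaml.SafeDumper)


class Opaque:
    """Hypothetical application class with no registered representer."""


# SafeDumper refuses objects it has no representer for.
try:
    yaml.dump(Opaque(), Dumper=yaml.SafeDumper)
    safe_rejected = False
except yaml.representer.RepresenterError:
    safe_rejected = True

print(full_text)
print(safe_text)
print(safe_rejected)
```

Note the trade-off: the `SafeDumper` output loads back as a list rather than a tuple, so type fidelity is exchanged for portability.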

### C Extension Dumpers

High-performance C-based dumpers, available when PyYAML is built with LibYAML support:

```python { .api }
class CBaseDumper:
    """C-based BaseDumper implementation."""

class CSafeDumper:
    """C-based SafeDumper implementation."""

class CDumper:
    """C-based Dumper implementation."""
```
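An alternative to a guarded import is to look the C classes up with `getattr`, since they are only present on the `yaml` module when the bindings are available. The sketch below does a safe round trip this way; `SafeDumperClass` and `SafeLoaderClass` are illustrative names, not part of the library.

```python
import yaml

# Fall back to the pure-Python classes when the C classes are absent.
SafeDumperClass = getattr(yaml, "CSafeDumper", yaml.SafeDumper)
SafeLoaderClass = getattr(yaml, "CSafeLoader", yaml.SafeLoader)

data = {"name": "demo", "ports": [80, 443]}
text = yaml.dump(data, Dumper=SafeDumperClass)
restored = yaml.load(text, Loader=SafeLoaderClass)

print(text)
print(restored == data)
```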

## Usage Examples

### Choosing the Right Loader

```python
import yaml

yaml_content = """
name: John Doe
birth_date: 1990-01-15
scores: [85, 92, 78]
metadata:
  created: 2023-01-01T10:00:00Z
  tags: !!python/list [tag1, tag2]
"""

# SafeLoader - standard YAML types only; Python-specific tags raise ConstructorError
try:
    data_safe = yaml.load(yaml_content, Loader=yaml.SafeLoader)
except yaml.constructor.ConstructorError as e:
    print(f"SafeLoader error: {e}")

# FullLoader - more types but still restricted; accepts simple tags like !!python/list
data_full = yaml.load(yaml_content, Loader=yaml.FullLoader)
print(f"birth_date type: {type(data_full['birth_date'])}")  # datetime.date
print(f"created type: {type(data_full['metadata']['created'])}")  # datetime.datetime

# UnsafeLoader - handles every Python-specific tag (dangerous!)
data_unsafe = yaml.load(yaml_content, Loader=yaml.UnsafeLoader)
print(f"tags type: {type(data_unsafe['metadata']['tags'])}")  # list
```
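PyYAML also exposes shortcut functions that fix the loader for you. As a brief sketch, assuming a PyYAML version that provides `safe_load` and `full_load`, the calls below are equivalent to passing the corresponding loader class explicitly:

```python
import datetime
import yaml

doc = "when: 2023-01-01\nitems: [1, 2]\n"

via_class = yaml.load(doc, Loader=yaml.SafeLoader)
via_shortcut = yaml.safe_load(doc)   # same as Loader=yaml.SafeLoader
full = yaml.full_load(doc)           # same as Loader=yaml.FullLoader

print(via_class == via_shortcut)
print(isinstance(via_shortcut["when"], datetime.date))
```

Using the shortcut makes the trust decision visible at the call site, which helps in code review.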

### Performance with C Extensions

```python
import time

import yaml

large_data = {'items': [{'id': i, 'value': f'item_{i}'} for i in range(10000)]}

# Check if C extensions are available
if yaml.__with_libyaml__:
    print("LibYAML C extensions available")

    # Benchmark Python vs C dumping (perf_counter is a monotonic, high-resolution clock)
    start = time.perf_counter()
    yaml_py = yaml.dump(large_data, Dumper=yaml.Dumper)
    py_time = time.perf_counter() - start

    start = time.perf_counter()
    yaml_c = yaml.dump(large_data, Dumper=yaml.CDumper)
    c_time = time.perf_counter() - start

    print(f"Python dumper: {py_time:.3f}s")
    print(f"C dumper: {c_time:.3f}s")
    print(f"Speedup: {py_time/c_time:.1f}x")

    # Benchmark loading (the input was generated above, so it is trusted)
    start = time.perf_counter()
    data_py = yaml.load(yaml_c, Loader=yaml.Loader)
    py_load_time = time.perf_counter() - start

    start = time.perf_counter()
    data_c = yaml.load(yaml_c, Loader=yaml.CLoader)
    c_load_time = time.perf_counter() - start

    print(f"Python loader: {py_load_time:.3f}s")
    print(f"C loader: {c_load_time:.3f}s")
    print(f"Load speedup: {py_load_time/c_load_time:.1f}x")
else:
    print("LibYAML C extensions not available")
```

### Creating Custom Loaders and Dumpers

```python
import yaml
from datetime import datetime, timezone

# Custom loader with additional constructor
class CustomLoader(yaml.SafeLoader):
    pass

def timestamp_constructor(loader, node):
    """Custom constructor for timestamp format."""
    value = loader.construct_scalar(node)
    return datetime.fromisoformat(value.replace('Z', '+00:00'))

# Register custom constructor
CustomLoader.add_constructor('!timestamp', timestamp_constructor)

# Custom dumper with additional representer
class CustomDumper(yaml.SafeDumper):
    pass

def timestamp_representer(dumper, data):
    """Custom representer for datetime objects (naive values are treated as UTC)."""
    if data.tzinfo is not None:
        # Normalize to UTC and drop the offset so the 'Z' suffix is not duplicated
        data = data.astimezone(timezone.utc).replace(tzinfo=None)
    return dumper.represent_scalar('!timestamp', data.isoformat() + 'Z')

# Register custom representer
CustomDumper.add_representer(datetime, timestamp_representer)

# Usage
yaml_with_custom = """
created: !timestamp 2023-01-01T10:00:00Z
updated: !timestamp 2023-12-15T14:30:00Z
"""

data = yaml.load(yaml_with_custom, Loader=CustomLoader)
print(f"Created: {data['created']} ({type(data['created'])})")

# Dump back with custom format (Dumper must be passed by keyword;
# the second positional argument of yaml.dump is the output stream)
output = yaml.dump(data, Dumper=CustomDumper)
print(output)
```
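Subclassing before registering matters because constructors are stored per class. The sketch below, using a hypothetical `!upper` tag and `UpperLoader` class, shows that a constructor added to a subclass does not leak into `SafeLoader` itself:

```python
import yaml


class UpperLoader(yaml.SafeLoader):
    """Hypothetical loader that understands a custom !upper tag."""


def construct_upper(loader, node):
    # Read the scalar as text and upper-case it.
    return loader.construct_scalar(node).upper()


UpperLoader.add_constructor("!upper", construct_upper)

doc = "greeting: !upper hello\n"
custom = yaml.load(doc, Loader=UpperLoader)

# The stock SafeLoader still does not know the tag.
try:
    yaml.load(doc, Loader=yaml.SafeLoader)
    base_unaffected = False
except yaml.constructor.ConstructorError:
    base_unaffected = True

print(custom, base_unaffected)
```

Registering directly on `yaml.SafeLoader` would instead change behavior for every caller in the process, which is rarely what a library should do.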

## Security Comparison

| Loader | Security | Features | Use Cases |
|--------|----------|----------|-----------|
| SafeLoader | Highest | Standard YAML types only | Untrusted input, config files |
| FullLoader | Moderate | Most types, restricted Python tags | Trusted or semi-trusted input, data exchange |
| Loader/UnsafeLoader | None | All features | Fully trusted input, object persistence |
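
The following sketch illustrates the top row of the table. The payload is a harmless stand-in for a malicious document: it asks the loader to call `time.time()`. `SafeLoader` refuses the tag instead of making the call; the unsafe loaders are deliberately not exercised here.

```python
import yaml

# Harmless stand-in for an attack payload: requests a call to time.time().
payload = "result: !!python/object/apply:time.time []\n"

try:
    yaml.load(payload, Loader=yaml.SafeLoader)
    rejected = False
except yaml.constructor.ConstructorError as exc:
    rejected = True
    print(f"SafeLoader refused the tag: {exc.problem}")

print(rejected)
```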

### Type Support by Loader

| Python Type | SafeLoader | FullLoader | Loader/UnsafeLoader |
|-------------|------------|------------|---------------------|
| str, int, float, bool, None | Yes | Yes | Yes |
| list, dict | Yes | Yes | Yes |
| datetime.date | Yes | Yes | Yes |
| datetime.datetime | Yes | Yes | Yes |
| set, tuple | set only (`!!set`) | Yes | Yes |
| Arbitrary Python objects | No | No (recent PyYAML releases) | Yes |
| Function calls | No | No | Yes |
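
A few of these rows can be checked directly. The sketch below assumes a PyYAML version that includes `FullLoader`; the two sample documents are made up for the illustration.

```python
import datetime
import yaml

standard_doc = "ids: !!set {a, b}\nwhen: 2023-01-01\n"
python_doc = "point: !!python/tuple [1, 2]\n"

# Standard YAML tags (!!set, timestamps) are handled by SafeLoader.
safe = yaml.load(standard_doc, Loader=yaml.SafeLoader)

# The Python-specific tuple tag needs FullLoader or an unsafe loader.
full = yaml.load(python_doc, Loader=yaml.FullLoader)

try:
    yaml.load(python_doc, Loader=yaml.SafeLoader)
    safe_accepts_tuple = True
except yaml.constructor.ConstructorError:
    safe_accepts_tuple = False

print(safe)
print(full)
print(safe_accepts_tuple)
```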

## Component Architecture

Loaders and dumpers are composed of multiple processing components:

### Loader Components

- **Reader**: Input stream handling and encoding detection
- **Scanner**: Tokenization (character stream → tokens)
- **Parser**: Syntax analysis (tokens → events)
- **Composer**: Tree building (events → representation nodes)
- **Constructor**: Object construction (nodes → Python objects)
- **Resolver**: Tag resolution and type detection
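
Each loading stage can be observed separately through PyYAML's lower-level functions `yaml.scan`, `yaml.parse`, and `yaml.compose`. This is a small sketch on a made-up document, not something most applications need to do:

```python
import yaml

doc = "name: demo\nitems: [1, 2]\n"

# Scanner output: tokens
tokens = [type(t).__name__ for t in yaml.scan(doc, Loader=yaml.SafeLoader)]

# Parser output: events
events = [type(e).__name__ for e in yaml.parse(doc, Loader=yaml.SafeLoader)]

# Composer output: a tree of representation nodes
root = yaml.compose(doc, Loader=yaml.SafeLoader)

# Constructor output: Python objects
data = yaml.load(doc, Loader=yaml.SafeLoader)

print(tokens[:3])
print(events[:3])
print(type(root).__name__, root.tag)
print(data)
```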

### Dumper Components

- **Representer**: Object representation (Python objects → nodes)
- **Serializer**: Tree serialization (nodes → events)
- **Emitter**: Text generation (events → YAML text)
- **Resolver**: Tag resolution for output
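
The later dumping stages can likewise be driven by hand with `yaml.serialize` (nodes to text) and `yaml.emit` (events to text). The sketch below obtains nodes and events by re-reading a reference dump, which is simply a convenient way to get valid inputs for the illustration:

```python
import yaml

data = {"name": "demo", "items": [1, 2]}
reference = yaml.dump(data, Dumper=yaml.SafeDumper)

# Serializer + Emitter: representation nodes -> YAML text
root = yaml.compose(reference, Loader=yaml.SafeLoader)
from_nodes = yaml.serialize(root, Dumper=yaml.SafeDumper)

# Emitter only: events -> YAML text
event_stream = yaml.parse(reference, Loader=yaml.SafeLoader)
from_events = yaml.emit(event_stream, Dumper=yaml.SafeDumper)

print(from_nodes)
print(from_events)
```

Both strings load back to the original data, although their exact formatting may differ from `reference`.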

### Inheritance Hierarchy

```python
# Illustrative sketch of how loaders combine components.
# The real class also defines __init__, which initializes each component in turn.
class SafeLoader(
    Reader,           # Input handling
    Scanner,          # Tokenization
    Parser,           # Parsing
    Composer,         # Tree composition
    SafeConstructor,  # Safe object construction
    Resolver          # Tag resolution
):
    pass
```

This modular design allows for:
- Easy customization by inheriting and overriding specific components
- Mix-and-match functionality from different security levels
- Adding custom constructors and representers
- Fine-grained control over the processing pipeline
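
As a sketch of the mix-and-match idea, the hypothetical `StringScalarLoader` below pairs `SafeConstructor` with `BaseResolver`. Untagged scalars therefore stay strings, while explicitly tagged values are still converted safely. The class name is an assumption made for this example; the `__init__` mirrors the way PyYAML's own loaders initialize each component.

```python
import yaml
from yaml.composer import Composer
from yaml.constructor import SafeConstructor
from yaml.parser import Parser
from yaml.reader import Reader
from yaml.resolver import BaseResolver
from yaml.scanner import Scanner


class StringScalarLoader(Reader, Scanner, Parser, Composer,
                         SafeConstructor, BaseResolver):
    """Hypothetical loader: safe construction, no implicit typing."""

    def __init__(self, stream):
        # Each component is initialized explicitly, in pipeline order.
        Reader.__init__(self, stream)
        Scanner.__init__(self)
        Parser.__init__(self)
        Composer.__init__(self)
        SafeConstructor.__init__(self)
        BaseResolver.__init__(self)


doc = 'a: 1\nb: [true, x]\nc: !!int "3"\n'
data = yaml.load(doc, Loader=StringScalarLoader)
print(data)
```

For most customizations, subclassing an existing loader and calling `add_constructor` is simpler; assembling components by hand is only worth it when a stage itself has to be swapped.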

## Best Practices

### Security Guidelines

1. **Default to SafeLoader** for any external input
2. **Use FullLoader** only for internal configuration with a known structure
3. **Use Loader/UnsafeLoader** only with completely trusted input
4. **Never use unsafe loaders** with user-provided data

### Performance Guidelines

1. **Use C extensions** when available for large documents
2. **Choose an appropriate loader**: don't use more features than needed
3. **Process documents as a stream** for very large inputs
4. **Define custom loader classes once** and reuse them across calls; a loader instance is bound to a single input stream
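
For the stream-processing guideline, `yaml.safe_load_all` yields documents from a multi-document stream one at a time instead of building them all up front. A minimal sketch, using an in-memory `io.StringIO` as a stand-in for a large file:

```python
import io
import yaml

# Stand-in for a large file containing several YAML documents.
stream = io.StringIO("id: 1\n---\nid: 2\n---\nid: 3\n")

ids = []
for document in yaml.safe_load_all(stream):
    # Each document is constructed only when the loop reaches it.
    ids.append(document["id"])

print(ids)
```

This helps when a file holds many separate documents; a single very large document is still parsed as one unit.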

### Compatibility Guidelines

1. **Use SafeDumper output** for maximum compatibility
2. **Avoid Python-specific tags** in exchanged data
3. **Test with different parsers** if targeting non-Python consumers
4. **Document loader requirements** when distributing YAML files