# Customization and Extension

PyYAML can be extended with custom constructors, representers, and resolvers. Use them to handle custom data types and to implement domain-specific YAML formats.

## Capabilities

### Constructor Management

Add custom constructors to handle specific YAML tags and convert them to Python objects during loading.

```python { .api }
def add_constructor(tag, constructor, Loader=None):
    """
    Add a constructor for the given tag.

    Args:
        tag (str): YAML tag to handle (e.g., '!custom', 'tag:example.com,2000:app/custom')
        constructor (Callable): Function that accepts (loader, node) and returns Python object
        Loader (type, optional): Specific loader class to add to. If None, adds to multiple loaders.

    Constructor Function Signature:
        def constructor(loader: BaseLoader, node: Node) -> Any
    """

def add_multi_constructor(tag_prefix, multi_constructor, Loader=None):
    """
    Add a multi-constructor for the given tag prefix.

    Multi-constructor is called for any tag that starts with the specified prefix.

    Args:
        tag_prefix (str): Tag prefix to match (e.g., '!custom:', 'tag:example.com,2000:app/')
        multi_constructor (Callable): Function that accepts (loader, tag_suffix, node)
        Loader (type, optional): Specific loader class to add to

    Multi-Constructor Function Signature:
        def multi_constructor(loader: BaseLoader, tag_suffix: str, node: Node) -> Any
    """
```
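
As a minimal sketch of the `Loader` argument, the following registers a constructor on a loader subclass only, so the tag is not added to PyYAML's shared loader classes. The `!point` tag and the `PointLoader` and `point_constructor` names are hypothetical and chosen purely for illustration.

```python
import yaml


class PointLoader(yaml.SafeLoader):
    """Hypothetical loader subclass so the registration stays local."""


def point_constructor(loader, node):
    # construct_sequence turns the YAML sequence node into a Python list
    x, y = loader.construct_sequence(node)
    return (x, y)


# Passing Loader= restricts the registration to that class
yaml.add_constructor('!point', point_constructor, Loader=PointLoader)

doc = "origin: !point [0, 0]\ncorner: !point [3, 4]\n"
points = yaml.load(doc, Loader=PointLoader)
print(points)  # {'origin': (0, 0), 'corner': (3, 4)}
```

Loading the same document with plain `yaml.safe_load` would fail on the unknown `!point` tag, because the constructor was registered on the subclass only.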

### Representer Management

Add custom representers to control how Python objects are converted to YAML during dumping.

```python { .api }
def add_representer(data_type, representer, Dumper=Dumper):
    """
    Add a representer for the given type.

    Args:
        data_type (type): Python type to represent
        representer (Callable): Function that accepts (dumper, data) and returns Node
        Dumper (type, optional): Dumper class to add to (default: Dumper)

    Representer Function Signature:
        def representer(dumper: BaseDumper, data: Any) -> Node
    """

def add_multi_representer(data_type, multi_representer, Dumper=Dumper):
    """
    Add a representer for the given type and its subclasses.

    Multi-representer handles the specified type and all its subclasses.

    Args:
        data_type (type): Base Python type to represent
        multi_representer (Callable): Function that accepts (dumper, data) and returns Node
        Dumper (type, optional): Dumper class to add to

    Multi-Representer Function Signature:
        def multi_representer(dumper: BaseDumper, data: Any) -> Node
    """
```
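
To illustrate how a multi-representer covers a whole class hierarchy, here is a small sketch that dumps every `enum.Enum` subclass as its plain value. The `EnumDumper`, `Color`, and `Size` names are assumptions made for this example, not part of PyYAML.

```python
import enum

import yaml


class EnumDumper(yaml.SafeDumper):
    """Hypothetical dumper subclass so the registration stays local."""


class Color(enum.Enum):
    RED = 'red'
    GREEN = 'green'


class Size(enum.Enum):
    SMALL = 's'
    LARGE = 'l'


def enum_representer(dumper, data):
    # Emit the member's value as an ordinary string scalar
    return dumper.represent_scalar('tag:yaml.org,2002:str', data.value)


# One registration on the base class handles Color, Size, and any other Enum
yaml.add_multi_representer(enum.Enum, enum_representer, Dumper=EnumDumper)

text = yaml.dump({'color': Color.RED, 'size': Size.LARGE},
                 Dumper=EnumDumper, default_flow_style=False)
print(text)
# color: red
# size: l
```

With `add_representer` instead, each enum class would need its own registration, since exact-type representers do not apply to subclasses.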

### Resolver Management

Add custom resolvers to automatically detect and tag scalar values based on patterns.

```python { .api }
def add_implicit_resolver(tag, regexp, first=None, Loader=None, Dumper=Dumper):
    """
    Add an implicit scalar detector.

    If a scalar value matches the given regexp, the corresponding tag is assigned.

    Args:
        tag (str): YAML tag to assign when pattern matches
        regexp (re.Pattern): Regular expression to match scalar values
        first (str, optional): Sequence of possible first characters for optimization
        Loader (type, optional): Loader class to add to
        Dumper (type, optional): Dumper class to add to
    """

def add_path_resolver(tag, path, kind=None, Loader=None, Dumper=Dumper):
    """
    Add a path-based resolver for the given tag.

    A path is a list of keys that forms a path to a node in the representation tree.

    Args:
        tag (str): YAML tag to assign when path matches
        path (list): List of keys forming path to node (strings, integers, or None)
        kind (type, optional): Node type to match (ScalarNode, SequenceNode, MappingNode)
        Loader (type, optional): Loader class to add to
        Dumper (type, optional): Dumper class to add to
    """
```
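
Because `add_implicit_resolver` takes both a `Loader` and a `Dumper`, the same pattern can drive a round trip: untagged scalars are recognized on load and written back without an explicit tag on dump. The sketch below assumes a hypothetical `!percent` tag and `Percent` class, and registers everything on subclasses so the shared loader and dumper classes are left untouched.

```python
import re

import yaml


class PercentLoader(yaml.SafeLoader):
    """Hypothetical loader subclass; keeps the resolver off shared classes."""


class PercentDumper(yaml.SafeDumper):
    """Hypothetical dumper subclass paired with PercentLoader."""


class Percent:
    def __init__(self, value):
        self.value = value  # whole-number percentage, e.g. 50 for "50%"


# Compiled once and reused for both the loader and the dumper
PERCENT_RE = re.compile(r'^\d+%$')


def percent_constructor(loader, node):
    # Strip the trailing '%' and keep the integer part
    return Percent(int(loader.construct_scalar(node)[:-1]))


def percent_representer(dumper, data):
    return dumper.represent_scalar('!percent', f'{data.value}%')


# 'first' limits the regex check to scalars that start with an ASCII digit
yaml.add_implicit_resolver('!percent', PERCENT_RE, first=list('0123456789'),
                           Loader=PercentLoader, Dumper=PercentDumper)
yaml.add_constructor('!percent', percent_constructor, Loader=PercentLoader)
yaml.add_representer(Percent, percent_representer, Dumper=PercentDumper)

loaded = yaml.load("discount: 50%\n", Loader=PercentLoader)
dumped = yaml.dump({'discount': Percent(50)}, Dumper=PercentDumper,
                   default_flow_style=False)
print(loaded['discount'].value)  # 50
print(dumped)                    # discount: 50%
```

The dumper omits the `!percent` tag because its own resolver would assign that tag to `50%` anyway, which keeps the output readable.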

## Usage Examples

### Custom Data Types

```python
import yaml
from decimal import Decimal

# Custom constructor for Decimal type
def decimal_constructor(loader, node):
    """Convert YAML scalar to Decimal."""
    value = loader.construct_scalar(node)
    return Decimal(value)

# Custom representer for Decimal type
def decimal_representer(dumper, data):
    """Convert Decimal to YAML scalar."""
    return dumper.represent_scalar('!decimal', str(data))

# Register custom handlers
yaml.add_constructor('!decimal', decimal_constructor)
yaml.add_representer(Decimal, decimal_representer)

# Usage
yaml_content = """
price: !decimal 19.99
tax_rate: !decimal 0.08
"""

data = yaml.load(yaml_content, yaml.Loader)
print(f"Price: {data['price']} ({type(data['price'])})")  # Decimal

# Dump back to YAML (explicitly tagged scalars are emitted quoted)
output_data = {'total': Decimal('27.50'), 'discount': Decimal('5.00')}
yaml_output = yaml.dump(output_data)
print(yaml_output)
# discount: !decimal '5.00'
# total: !decimal '27.50'
```

### Multi-Constructor Example

```python
import yaml

def env_constructor(loader, tag_suffix, node):
    """Constructor for environment variables with different types."""
    value = loader.construct_scalar(node)

    if tag_suffix == 'str':
        return str(value)
    elif tag_suffix == 'int':
        return int(value)
    elif tag_suffix == 'bool':
        return value.lower() in ('true', '1', 'yes', 'on')
    elif tag_suffix == 'list':
        return value.split(',')
    else:
        return value

# Register multi-constructor for !env: prefix
yaml.add_multi_constructor('!env:', env_constructor)

yaml_content = """
database_host: !env:str localhost
database_port: !env:int 5432
debug_mode: !env:bool true
allowed_hosts: !env:list host1,host2,host3
"""

data = yaml.load(yaml_content, yaml.Loader)
print(f"Port: {data['database_port']} ({type(data['database_port'])})")  # int
print(f"Debug: {data['debug_mode']} ({type(data['debug_mode'])})")  # bool
print(f"Hosts: {data['allowed_hosts']}")  # ['host1', 'host2', 'host3']
```

### Implicit Resolvers

```python
import yaml
import re

# Add resolver for email addresses
email_pattern = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')
# Only scalars whose first character is a lowercase letter are tested
yaml.add_implicit_resolver('!email', email_pattern, list('abcdefghijklmnopqrstuvwxyz'))

# Constructor for email addresses
def email_constructor(loader, node):
    value = loader.construct_scalar(node)
    return {'email': value, 'domain': value.split('@')[1]}

yaml.add_constructor('!email', email_constructor)

yaml_content = """
admin: admin@example.com
support: support@company.org
"""

data = yaml.load(yaml_content, yaml.Loader)
print(f"Admin: {data['admin']}")  # {'email': 'admin@example.com', 'domain': 'example.com'}
```

### Path Resolvers

```python
import yaml
from yaml.nodes import ScalarNode

# Add path resolver for configuration values
yaml.add_path_resolver('!config', ['config', None], ScalarNode)

def config_constructor(loader, node):
    """Special handling for config values."""
    value = loader.construct_scalar(node)
    return f"CONFIG:{value}"

yaml.add_constructor('!config', config_constructor)

yaml_content = """
config:
  database_url: postgresql://localhost/myapp
  api_key: secret123
  timeout: 30
"""

data = yaml.load(yaml_content, yaml.Loader)
print(data['config']['database_url'])  # CONFIG:postgresql://localhost/myapp
```

### Custom Loader and Dumper Classes

```python
import yaml
from datetime import datetime
import json

class ApplicationLoader(yaml.SafeLoader):
    """Custom loader for application-specific YAML."""
    pass

class ApplicationDumper(yaml.SafeDumper):
    """Custom dumper for application-specific YAML."""
    pass

# JSON constructor
def json_constructor(loader, node):
    """Parse JSON embedded in YAML."""
    value = loader.construct_scalar(node)
    return json.loads(value)

# JSON representer
def json_representer(dumper, data):
    """Represent dict as embedded JSON."""
    return dumper.represent_scalar('!json', json.dumps(data))

# Register with custom classes
ApplicationLoader.add_constructor('!json', json_constructor)
ApplicationDumper.add_representer(dict, json_representer)

# Timestamp constructor
def timestamp_constructor(loader, node):
    value = loader.construct_scalar(node)
    return datetime.fromisoformat(value)

ApplicationLoader.add_constructor('!timestamp', timestamp_constructor)

# The JSON payload is quoted so that YAML parses it as a scalar,
# not as a flow mapping (construct_scalar requires a scalar node)
yaml_content = """
metadata: !json '{"version": "1.0", "author": "Developer"}'
created: !timestamp 2023-01-01T10:00:00
"""

data = yaml.load(yaml_content, ApplicationLoader)
print(f"Metadata: {data['metadata']}")  # {'version': '1.0', 'author': 'Developer'}
print(f"Created: {data['created']}")  # datetime object
```

## Advanced Customization Patterns

### YAMLObject Base Class

Create self-serializing objects using the YAMLObject base class:

```python
import yaml

class Person(yaml.YAMLObject):
    yaml_tag = '!Person'
    yaml_loader = yaml.Loader
    yaml_dumper = yaml.Dumper

    def __init__(self, name, age, email):
        self.name = name
        self.age = age
        self.email = email

    def __repr__(self):
        return f"Person(name={self.name!r}, age={self.age!r}, email={self.email!r})"

# Usage - automatic registration
yaml_content = """
person: !Person
  name: John Doe
  age: 30
  email: john@example.com
"""

data = yaml.load(yaml_content, yaml.Loader)
print(data['person'])  # Person(name='John Doe', age=30, email='john@example.com')

# Automatic dumping
person = Person("Jane Smith", 25, "jane@example.com")
yaml_output = yaml.dump({'employee': person})
print(yaml_output)
```

### State-Aware Constructors

```python
import yaml

class DatabaseConfig:
    def __init__(self, host, port, database):
        self.host = host
        self.port = port
        self.database = database
        self.connection_string = f"postgresql://{host}:{port}/{database}"

def database_constructor(loader, node):
    """Build a DatabaseConfig from a mapping node, validating required fields."""
    # Get the mapping as a dictionary
    config = loader.construct_mapping(node, deep=True)

    # Validate required fields
    required = ['host', 'port', 'database']
    missing = [field for field in required if field not in config]
    if missing:
        # ConstructorError lives in the yaml.constructor submodule
        raise yaml.constructor.ConstructorError(
            None, None,
            f"Missing required fields: {missing}",
            node.start_mark
        )

    return DatabaseConfig(
        host=config['host'],
        port=config['port'],
        database=config['database']
    )

yaml.add_constructor('!database', database_constructor)

yaml_content = """
prod_db: !database
  host: prod.example.com
  port: 5432
  database: production
"""

data = yaml.load(yaml_content, yaml.Loader)
print(data['prod_db'].connection_string)
```

### Dynamic Tag Generation

```python
import yaml

class VersionedData:
    def __init__(self, version, data):
        self.version = version
        self.data = data

def versioned_multi_constructor(loader, tag_suffix, node):
    """Handle versioned data tags like !v1.0, !v2.0, etc."""
    version = tag_suffix
    data = loader.construct_mapping(node, deep=True)
    return VersionedData(version, data)

def versioned_representer(dumper, data):
    """Represent versioned data with appropriate tag."""
    tag = f'!v{data.version}'
    return dumper.represent_mapping(tag, data.data)

yaml.add_multi_constructor('!v', versioned_multi_constructor)
yaml.add_representer(VersionedData, versioned_representer)

yaml_content = """
config: !v1.2
  api_endpoint: /api/v1
  features: [auth, logging]
"""

data = yaml.load(yaml_content, yaml.Loader)
print(f"Version: {data['config'].version}")  # 1.2
print(f"Features: {data['config'].data['features']}")  # ['auth', 'logging']
```

## Best Practices

### Security Considerations

1. **Validate input** in custom constructors
2. **Use SafeLoader as base** for custom loaders when possible
3. **Avoid dangerous operations** in constructors (file I/O, subprocess, etc.)
4. **Sanitize tag names** to prevent injection attacks
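
The first two points can be sketched together: a loader derived from `SafeLoader` that adds a single custom tag and validates its input before returning a value. The `StrictLoader` class, the `!port` tag, and the `is_rejected` helper are hypothetical names used only for this illustration.

```python
import yaml
from yaml.constructor import ConstructorError


class StrictLoader(yaml.SafeLoader):
    """Hypothetical loader: SafeLoader plus one validated custom tag."""


def port_constructor(loader, node):
    raw = loader.construct_scalar(node)
    # Accept only short decimal strings in the valid TCP/UDP port range
    if not (raw.isdecimal() and len(raw) <= 5 and 0 < int(raw) < 65536):
        raise ConstructorError(None, None,
                               f"invalid port: {raw!r}", node.start_mark)
    return int(raw)


StrictLoader.add_constructor('!port', port_constructor)


def is_rejected(text):
    """Return True if StrictLoader refuses to construct the document."""
    try:
        yaml.load(text, Loader=StrictLoader)
    except ConstructorError:
        return True
    return False


good = yaml.load("listen: !port 8080", Loader=StrictLoader)
print(good)                                               # {'listen': 8080}
print(is_rejected("listen: !port 70000"))                 # True
print(is_rejected("x: !!python/object/apply:os.getcwd []"))  # True
```

Because the base class is `SafeLoader`, Python-specific tags such as `!!python/object/apply` remain unsupported, and only the explicitly registered `!port` tag extends what the loader accepts.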

### Performance Tips

1. **Use first parameter** in implicit resolvers for optimization
2. **Compile regexes once** (for example at module level) and reuse them when registering resolvers
3. **Minimize object creation** in frequently-used constructors
4. **Prefer multi-constructors** over many individual constructors

### Maintainability

1. **Document custom tags** and their expected format
2. **Provide validation** in constructors with clear error messages
3. **Use descriptive tag names** that indicate purpose
4. **Group related customizations** in custom loader/dumper classes

### Compatibility

1. **Test with different PyYAML versions** when using advanced features
2. **Provide fallbacks** for missing custom tags
3. **Document dependencies** when sharing customized YAML files
4. **Consider standard YAML tags** before creating custom ones
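
One possible way to provide a fallback for missing custom tags is a catch-all multi-constructor on the `!` prefix, which receives any local tag that has no exact constructor. This is a sketch under that assumption. `TolerantLoader` and `unknown_tag_fallback` are hypothetical names, and the fallback simply drops the tag and returns the underlying data.

```python
import yaml
from yaml.nodes import ScalarNode, SequenceNode


class TolerantLoader(yaml.SafeLoader):
    """Hypothetical loader that degrades unknown local tags to plain data."""


def unknown_tag_fallback(loader, tag_suffix, node):
    # Build the untagged equivalent based on the node kind.
    # Tagged scalars come back as strings, since no implicit typing applies.
    if isinstance(node, ScalarNode):
        return loader.construct_scalar(node)
    if isinstance(node, SequenceNode):
        return loader.construct_sequence(node, deep=True)
    return loader.construct_mapping(node, deep=True)


# Constructors registered for an exact tag are still preferred over this
TolerantLoader.add_multi_constructor('!', unknown_tag_fallback)

doc = """
name: !legacy app
hosts: !legacy [a, b]
limits: !legacy {cpu: 2}
"""
data = yaml.load(doc, Loader=TolerantLoader)
print(data)  # {'name': 'app', 'hosts': ['a', 'b'], 'limits': {'cpu': 2}}
```

Whether silently dropping unknown tags is acceptable depends on the application. Logging the `tag_suffix` inside the fallback is a reasonable middle ground when files are shared between deployments with different customizations.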