# HTML Serialization

parse5 converts parsed AST nodes back into HTML strings through two functions, `serialize` and `serializeOuter`. The serializer takes care of HTML formatting, void elements, and namespace-aware output.

## Capabilities

### Inner Content Serialization

Serializes the inner content of a node (children only) to an HTML string.

```typescript { .api }
/**
 * Serializes an AST node's inner content to an HTML string
 * @param node - Parent node whose children will be serialized
 * @param options - Optional serialization configuration
 * @returns HTML string representing the node's inner content
 */
function serialize<T extends TreeAdapterTypeMap = DefaultTreeAdapterMap>(
  node: T['parentNode'],
  options?: SerializerOptions<T>
): string;
```

**Usage Examples:**

```typescript
import { parse, serialize, type DefaultTreeAdapterMap } from "parse5";
import { adapter as htmlparser2Adapter } from "parse5-htmlparser2-tree-adapter";

type Element = DefaultTreeAdapterMap["element"];

// Parse and serialize document content (the doctype is a child of the document, so it is included)
const document = parse('<!DOCTYPE html><html><head></head><body>Hi there!</body></html>');
const html = serialize(document);
console.log(html); // '<!DOCTYPE html><html><head></head><body>Hi there!</body></html>'

// Serialize element's inner content
const htmlElement = document.childNodes[1] as Element; // childNodes[0] is the doctype
const bodyElement = htmlElement.childNodes[1] as Element; // html > body
const bodyContent = serialize(bodyElement);
console.log(bodyContent); // 'Hi there!'

// Serialize with a custom tree adapter; the tree must have been built by the same adapter
const htmlparser2Document = parse('<p>Hi there!</p>', { treeAdapter: htmlparser2Adapter });
const customHtml = serialize(htmlparser2Document, {
  treeAdapter: htmlparser2Adapter
});
```

### Outer Element Serialization

Serializes an element including the element tag itself (outerHTML equivalent).

```typescript { .api }
/**
 * Serializes an element including its opening and closing tags
 * @param node - Element node to serialize completely
 * @param options - Optional serialization configuration
 * @returns HTML string including the element's outer tags
 */
function serializeOuter<T extends TreeAdapterTypeMap = DefaultTreeAdapterMap>(
  node: T['node'],
  options?: SerializerOptions<T>
): string;
```

**Usage Examples:**

```typescript
import { parseFragment, serializeOuter } from "parse5";

// Parse fragment and serialize complete element
const fragment = parseFragment('<div class="container"><span>Hello</span></div>');
const divElement = fragment.childNodes[0];
const outerHTML = serializeOuter(divElement);
console.log(outerHTML); // '<div class="container"><span>Hello</span></div>'

// Serialize nested elements
const complexFragment = parseFragment(`
  <article data-id="123">
    <header><h1>Title</h1></header>
    <section><p>Content paragraph</p></section>
  </article>
`);
// childNodes[0] is the whitespace text node before <article>, so look the element up by name
const articleElement = complexFragment.childNodes.find((node) => node.nodeName === "article");
const fullArticle = articleElement ? serializeOuter(articleElement) : "";
```
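
The relationship between inner and outer serialization can be pictured with a toy tree. The sketch below is not parse5's implementation: `SketchElement`, `outerSketch`, and `innerSketch` are hypothetical names, and attributes and escaping are left out to keep the focus on which tags end up in the output.

```typescript
// Hypothetical, simplified node shape (not parse5's node types)
interface SketchElement {
  tagName: string;
  children: Array<SketchElement | string>; // strings stand in for text nodes
}

// Outer serialization: the element's own tags wrapped around its inner content
function outerSketch(node: SketchElement | string): string {
  if (typeof node === "string") {
    return node; // escaping omitted in this sketch
  }
  return "<" + node.tagName + ">" + innerSketch(node) + "</" + node.tagName + ">";
}

// Inner serialization: the concatenated outer serialization of every child
function innerSketch(parent: SketchElement): string {
  return parent.children.map((child) => outerSketch(child)).join("");
}

const div: SketchElement = {
  tagName: "div",
  children: [{ tagName: "span", children: ["Hello"] }],
};

console.log(innerSketch(div)); // '<span>Hello</span>'
console.log(outerSketch(div)); // '<div><span>Hello</span></div>'
```

In other words, the outer form of a node is its inner form plus its own start and end tags, which is why `serializeOuter` accepts any node while `serialize` needs a node that can have children.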

### Serialization Options

Control serialization behavior through configuration options.

```typescript { .api }
interface SerializerOptions<T extends TreeAdapterTypeMap> {
  /**
   * Specifies input tree format. Defaults to the default tree adapter.
   */
  treeAdapter?: TreeAdapter<T>;

  /**
   * The scripting flag. If set to true, noscript element content
   * will not be escaped. Defaults to true.
   */
  scriptingEnabled?: boolean;
}
```

**Usage Examples:**

```typescript
import { parse, serialize } from "parse5";
// Hypothetical local module that exports a TreeAdapter implementation
import { customTreeAdapter } from "./my-tree-adapter";

// Serialize with scripting disabled: text inside <noscript> is escaped like any other text
const docWithNoscript = parse('<html><body><noscript>No JS content</noscript></body></html>');
const htmlWithoutScripting = serialize(docWithNoscript, {
  scriptingEnabled: false
});

// Use a custom tree adapter; the tree must have been built by the same adapter
const customDocument = parse('<p>Custom tree</p>', { treeAdapter: customTreeAdapter });
const customSerialized = serialize(customDocument, {
  treeAdapter: customTreeAdapter
});
```
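
How omitted options fall back to their documented defaults can be sketched as follows. This is an illustrative stand-in rather than parse5's code: `SketchOptions`, `resolveOptions`, and `escapesNoscriptText` are hypothetical names, and the tree adapter is reduced to a string label.

```typescript
// Hypothetical stand-in for SerializerOptions; the adapter is reduced to a label
interface SketchOptions {
  treeAdapter?: string;
  scriptingEnabled?: boolean;
}

// Fill in the documented defaults for any option the caller left out
function resolveOptions(options: SketchOptions = {}): Required<SketchOptions> {
  return {
    treeAdapter: options.treeAdapter ?? "default",
    scriptingEnabled: options.scriptingEnabled ?? true,
  };
}

// <noscript> text is left unescaped only while the scripting flag is on
function escapesNoscriptText(options?: SketchOptions): boolean {
  return !resolveOptions(options).scriptingEnabled;
}

console.log(escapesNoscriptText()); // false
console.log(escapesNoscriptText({ scriptingEnabled: false })); // true
```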

## Serialization Behavior

### Void Elements

Void elements such as `<br>`, `<img>`, and `<input>` are serialized as a start tag only, with no end tag and no child content:

```typescript
import { parseFragment, serialize, serializeOuter } from "parse5";

// Void elements are serialized without closing tags
const fragment = parseFragment('<img src="image.jpg" alt="Image"><br><input type="text">');
const serialized = serialize(fragment);
console.log(serialized); // '<img src="image.jpg" alt="Image"><br><input type="text">'

// Child nodes attached to a void element programmatically are skipped during serialization,
// so a <br> element always serializes to just its start tag
const brElement = parseFragment('<br>').childNodes[0];
const brSerialized = serializeOuter(brElement);
console.log(brSerialized); // '<br>'
```
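
The rule can be sketched with a small stand-alone serializer. The list below contains the void elements defined by the HTML standard and is illustrative only, since parse5's own list may include additional legacy elements. `SketchNode` and `serializeSketch` are hypothetical names, and namespaces and attributes are ignored here.

```typescript
// Void elements as listed in the HTML standard (illustrative; not imported from parse5)
const VOID_ELEMENTS: readonly string[] = [
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr",
];

// Hypothetical minimal element shape for this sketch
interface SketchNode {
  tagName: string;
  children: SketchNode[];
}

function serializeSketch(node: SketchNode): string {
  const startTag = "<" + node.tagName + ">";
  if (VOID_ELEMENTS.indexOf(node.tagName) !== -1) {
    // Void element: no end tag, and any children are skipped
    return startTag;
  }
  const inner = node.children.map((child) => serializeSketch(child)).join("");
  return startTag + inner + "</" + node.tagName + ">";
}

// A <br> that was (incorrectly) given a child still serializes to '<br>'
const br: SketchNode = { tagName: "br", children: [{ tagName: "span", children: [] }] };
const paragraph: SketchNode = { tagName: "p", children: [br] };

console.log(serializeSketch(paragraph)); // '<p><br></p>'
```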

### Namespace Handling

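The serializer is namespace-aware for foreign content (SVG and MathML). Arbitrary XML namespace prefixes are not interpreted by the HTML parser and round-trip as plain tag and attribute names: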

```typescript
import { parseFragment, serialize } from "parse5";

// Elements inside <svg> and <math> are kept in the SVG and MathML namespaces
const svgFragment = parseFragment(`
  <svg xmlns="http://www.w3.org/2000/svg">
    <circle cx="50" cy="50" r="40"/>
  </svg>
`);
// The self-closing <circle/> is written back with an explicit end tag: <circle ...></circle>
const svgSerialized = serialize(svgFragment);

// Outside foreign content, HTML parsing has no XML namespace support: "custom:element" is
// an ordinary tag name, xmlns:custom an ordinary attribute, and the trailing "/" is ignored
const xmlFragment = parseFragment('<root xmlns:custom="http://example.com/ns"><custom:element/></root>');
const xmlSerialized = serialize(xmlFragment);
```

### Attribute Serialization

Attribute values are escaped and always written in double quotes; attributes without a value are serialized with an empty value:

```typescript
import { parseFragment, serialize, serializeOuter } from "parse5";

// Special characters in attribute values are escaped; values are always double-quoted
const fragment = parseFragment('<div title="Quote: &quot;Hello&quot;" data-value=\'Single "quotes"\'></div>');
const element = fragment.childNodes[0];
const serialized = serializeOuter(element);
console.log(serialized); // '<div title="Quote: &quot;Hello&quot;" data-value="Single &quot;quotes&quot;"></div>'

// Boolean attributes are written with an explicit empty value
const inputFragment = parseFragment('<input type="checkbox" checked disabled>');
const inputSerialized = serialize(inputFragment);
console.log(inputSerialized); // '<input type="checkbox" checked="" disabled="">'
```
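
A minimal sketch of the escaping involved, assuming the classic attribute rules of the HTML fragment serialization algorithm (`&`, non-breaking space, and `"` are replaced). `SketchAttribute`, `escapeAttributeSketch`, and `serializeAttributesSketch` are hypothetical names, not parse5 exports, and namespaced attributes are not covered.

```typescript
// Hypothetical attribute shape: a name/value pair
interface SketchAttribute {
  name: string;
  value: string;
}

// Escape the characters that would otherwise end or corrupt a double-quoted value
function escapeAttributeSketch(value: string): string {
  return value
    .replace(/&/g, "&amp;") // must run first so the entities added below stay intact
    .replace(/\u00a0/g, "&nbsp;")
    .replace(/"/g, "&quot;");
}

// Every attribute is written as name="value", including valueless (boolean) ones
function serializeAttributesSketch(attrs: SketchAttribute[]): string {
  return attrs
    .map((attr) => " " + attr.name + '="' + escapeAttributeSketch(attr.value) + '"')
    .join("");
}

const sampleAttrs: SketchAttribute[] = [
  { name: "title", value: 'Quote: "Hello" & more' },
  { name: "checked", value: "" },
];

console.log("<input" + serializeAttributesSketch(sampleAttrs) + ">");
// '<input title="Quote: &quot;Hello&quot; &amp; more" checked="">'
```

Because the output always uses double quotes, single quotes in a value need no escaping.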

### Text Content Escaping

Text content is escaped automatically, except inside raw-text elements such as `<script>` and `<style>`, whose content is written verbatim:

```typescript
import { parseFragment, serialize } from "parse5";

// Special HTML characters are escaped in text content
const fragment = parseFragment('<p>Text with &lt;script&gt; and &amp; entities</p>');
const serialized = serialize(fragment);
console.log(serialized); // '<p>Text with &lt;script&gt; and &amp; entities</p>'

// Script and style elements preserve their content verbatim, without escaping
const scriptFragment = parseFragment('<script>if (x < y && y > z) { /* code */ }</script>');
const scriptSerialized = serialize(scriptFragment);
console.log(scriptSerialized); // '<script>if (x < y && y > z) { /* code */ }</script>'
```
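
The decision between escaping and verbatim output can be sketched as below. The element list follows the HTML fragment serialization rules; `escapeTextSketch` and `serializeTextSketch` are hypothetical helpers written for illustration, not part of parse5's API, and the namespace check a real serializer performs is omitted.

```typescript
// Elements whose text children are written verbatim, per the HTML serialization
// rules; <noscript> joins the list only when scripting is enabled
const RAW_TEXT_ELEMENTS: readonly string[] = [
  "style", "script", "xmp", "iframe", "noembed", "noframes", "plaintext",
];

function escapeTextSketch(text: string): string {
  return text
    .replace(/&/g, "&amp;") // first, so the entities added below stay intact
    .replace(/\u00a0/g, "&nbsp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Serialize one text node given the tag name of its parent element
function serializeTextSketch(parentTag: string, text: string, scriptingEnabled = true): string {
  const isRaw =
    RAW_TEXT_ELEMENTS.indexOf(parentTag) !== -1 ||
    (scriptingEnabled && parentTag === "noscript");
  return isRaw ? text : escapeTextSketch(text);
}

console.log(serializeTextSketch("p", "a < b && c > d")); // 'a &lt; b &amp;&amp; c &gt; d'
console.log(serializeTextSketch("script", "a < b && c > d")); // 'a < b && c > d'
console.log(serializeTextSketch("noscript", "<b>", false)); // '&lt;b&gt;'
```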

### Template Element Handling

For `<template>` elements, the serializer reads the child nodes from the template's content fragment:

```typescript
import { parseFragment, serialize, serializeOuter, type DefaultTreeAdapterMap } from "parse5";

// Template content is serialized as inner content
const templateFragment = parseFragment('<template><div>Template content</div></template>');
const templateElement = templateFragment.childNodes[0] as DefaultTreeAdapterMap["template"];

// serialize() on template element returns the template's inner content
const templateInner = serialize(templateElement);
console.log(templateInner); // '<div>Template content</div>'

// serializeOuter() includes the template tags
const templateOuter = serializeOuter(templateElement);
console.log(templateOuter); // '<template><div>Template content</div></template>'
```
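
The special case can be sketched as follows: a template element keeps its parsed children in a separate content fragment, and inner serialization reads from that fragment instead of the element's own child list. The node shape and the function names are hypothetical simplifications, not parse5 types; attributes and escaping are omitted.

```typescript
// Hypothetical simplified nodes: a template stores its children in `content`
interface SketchTreeNode {
  tagName: string;
  childNodes: Array<SketchTreeNode | string>; // strings stand in for text nodes
  content?: { childNodes: Array<SketchTreeNode | string> };
}

function outerTemplateSketch(node: SketchTreeNode | string): string {
  if (typeof node === "string") {
    return node; // escaping omitted in this sketch
  }
  return "<" + node.tagName + ">" + innerTemplateSketch(node) + "</" + node.tagName + ">";
}

function innerTemplateSketch(node: SketchTreeNode): string {
  // For <template>, read the children from the content fragment
  const container = node.tagName === "template" && node.content ? node.content : node;
  return container.childNodes.map((child) => outerTemplateSketch(child)).join("");
}

const templateNode: SketchTreeNode = {
  tagName: "template",
  childNodes: [], // stays empty; parsed children live in `content`
  content: { childNodes: [{ tagName: "div", childNodes: ["Template content"] }] },
};

console.log(innerTemplateSketch(templateNode)); // '<div>Template content</div>'
console.log(outerTemplateSketch(templateNode)); // '<template><div>Template content</div></template>'
```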

## Common Serialization Patterns

### Round-trip Parsing and Serialization

```typescript
import { parse, serialize } from "parse5";

// Parse HTML and serialize it back
const originalHtml = '<!DOCTYPE html><html><head><title>Test</title></head><body><div class="content">Hello World</div></body></html>';
const document = parse(originalHtml);
const serializedHtml = serialize(document);

// Well-formed input like this round-trips unchanged
console.log(serializedHtml === originalHtml); // true

// Other input keeps the same structure but comes back normalized
// (implied tags added, attribute values double-quoted)
```

### Selective Content Serialization

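```typescript
import { parseFragment, serialize, type DefaultTreeAdapterMap } from "parse5";

type Element = DefaultTreeAdapterMap["element"];
type ChildNode = DefaultTreeAdapterMap["childNode"];

// Whitespace between tags is parsed into text nodes, so look elements up by name, not by index
const findElement = (nodes: ChildNode[], tagName: string) =>
  nodes.find((node): node is Element => node.nodeName === tagName);

// Parse complex structure and serialize specific parts
const complexFragment = parseFragment(`
  <article>
    <header><h1>Article Title</h1></header>
    <section class="content">
      <p>First paragraph</p>
      <p>Second paragraph</p>
    </section>
    <footer>Article footer</footer>
  </article>
`);

const article = findElement(complexFragment.childNodes, "article");
const contentSection = article && findElement(article.childNodes, "section"); // section.content
const contentHtml = contentSection ? serialize(contentSection) : "";
// Returns only the content section's inner HTML
```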

### HTML Cleaning and Transformation

```typescript
import { parse, serialize } from "parse5";

// Parse potentially malformed HTML and serialize clean output
const messyHtml = '<div><p>Unclosed paragraph<span>Nested content<div>Misplaced div</div>';
const document = parse(messyHtml);
const cleanHtml = serialize(document);
// The parser applies the HTML standard's error recovery, so the output is a complete
// document in which every element is explicitly closed.
// This normalizes structure only; it does not sanitize untrusted content.
```