# HTML Serialization

Converts parsed AST nodes back to HTML strings. The serializer emits well-formed markup, including correct handling of void elements and of elements in foreign (SVG and MathML) namespaces.

## Capabilities

### Inner Content Serialization

Serializes the inner content of a node (its children only) to an HTML string, the equivalent of `innerHTML`.

```typescript { .api }
/**
 * Serializes an AST node's inner content to an HTML string
 * @param node - Parent node whose children will be serialized
 * @param options - Optional serialization configuration
 * @returns HTML string representing the node's inner content
 */
function serialize<T extends TreeAdapterTypeMap = DefaultTreeAdapterMap>(
  node: T['parentNode'],
  options?: SerializerOptions<T>
): string;
```

**Usage Examples:**

```typescript
import { parse, serialize } from "parse5";
// Named export `adapter` as published by parse5-htmlparser2-tree-adapter v7+
import { adapter as htmlparser2Adapter } from "parse5-htmlparser2-tree-adapter";

// Parse and serialize document content (the doctype is serialized too)
const document = parse('<!DOCTYPE html><html><head></head><body>Hi there!</body></html>');
const html = serialize(document);
console.log(html); // '<!DOCTYPE html><html><head></head><body>Hi there!</body></html>'

// Serialize element's inner content
const bodyElement = document.childNodes[1].childNodes[1]; // html > body
const bodyContent = serialize(bodyElement);
console.log(bodyContent); // 'Hi there!'

// Serialize with a custom tree adapter. The tree must have been built by
// the same adapter, so parse with it first.
const htmlparser2Document = parse('<p>Hi there!</p>', { treeAdapter: htmlparser2Adapter });
const customHtml = serialize(htmlparser2Document, {
  treeAdapter: htmlparser2Adapter
});
```

### Outer Element Serialization

Serializes an element including the element tag itself (outerHTML equivalent).

```typescript { .api }
/**
 * Serializes an element including its opening and closing tags
 * @param node - Element node to serialize completely
 * @param options - Optional serialization configuration
 * @returns HTML string including the element's outer tags
 */
function serializeOuter<T extends TreeAdapterTypeMap = DefaultTreeAdapterMap>(
  node: T['node'],
  options?: SerializerOptions<T>
): string;
```

**Usage Examples:**

```typescript
import { parseFragment, serializeOuter } from "parse5";

// Parse fragment and serialize complete element
const fragment = parseFragment('<div class="container"><span>Hello</span></div>');
const divElement = fragment.childNodes[0];
const outerHTML = serializeOuter(divElement);
console.log(outerHTML); // '<div class="container"><span>Hello</span></div>'

// Serialize nested elements
const complexFragment = parseFragment(`
  <article data-id="123">
    <header><h1>Title</h1></header>
    <section><p>Content paragraph</p></section>
  </article>
`);
// childNodes[0] is the whitespace text node that precedes <article>,
// so look the element up by name instead of by index
const articleElement = complexFragment.childNodes.find(
  (node) => node.nodeName === "article"
)!;
const fullArticle = serializeOuter(articleElement);
```

### Serialization Options

Control serialization behavior through configuration options.

```typescript { .api }
interface SerializerOptions<T extends TreeAdapterTypeMap> {
  /**
   * Specifies input tree format. Defaults to the default tree adapter.
   */
  treeAdapter?: TreeAdapter<T>;

  /**
   * The scripting flag. If set to true, noscript element content
   * will not be escaped. Defaults to true.
   */
  scriptingEnabled?: boolean;
}
```

**Usage Examples:**

```typescript
import { parse, serialize } from "parse5";
// Hypothetical module: substitute your own TreeAdapter implementation
import { customTreeAdapter } from "./my-tree-adapter";

// Serialize with scripting disabled: <noscript> text content is escaped
const docWithNoscript = parse('<html><body><noscript>No JS content</noscript></body></html>');
const htmlWithoutScripting = serialize(docWithNoscript, {
  scriptingEnabled: false
});

// Use a custom tree adapter; the tree must have been built by the same adapter
const customDocument = parse('<p>Hello</p>', { treeAdapter: customTreeAdapter });
const customSerialized = serialize(customDocument, {
  treeAdapter: customTreeAdapter
});
```

## Serialization Behavior

### Void Elements

Void elements such as `<br>`, `<img>`, and `<input>` are serialized as a start tag only, with no closing tag and no children:

```typescript
import { parseFragment, serialize, serializeOuter } from "parse5";

// Void elements are serialized without closing tags
const fragment = parseFragment('<img src="image.jpg" alt="Image"><br><input type="text">');
const serialized = serialize(fragment);
console.log(serialized); // '<img src="image.jpg" alt="Image"><br><input type="text">'

// Even if child nodes are added programmatically, void elements ignore them
const brElement = parseFragment('<br>').childNodes[0];
// Adding children to void elements has no effect during serialization
const brSerialized = serializeOuter(brElement);
console.log(brSerialized); // '<br>' (children are ignored)
```
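
The void-element rule can be sketched with a toy serializer. This is a minimal illustration, not parse5's implementation: the `ToyNode` types and the `toyInner`/`toyOuter` helpers are hypothetical names introduced here, and text escaping is deliberately left out.

```typescript
// Toy node types for illustration only; parse5's real AST is richer.
type ToyText = { kind: "text"; value: string };
type ToyElement = {
  kind: "element";
  tagName: string;
  attrs: Array<{ name: string; value: string }>;
  children: ToyNode[];
};
type ToyNode = ToyText | ToyElement;

// Void elements as listed in the HTML Standard.
const VOID_ELEMENTS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr",
]);

// Inner serialization: children only.
function toyInner(node: ToyElement): string {
  return node.children.map((child) => toyOuter(child)).join("");
}

// Outer serialization: start tag, then children and end tag unless void.
function toyOuter(node: ToyNode): string {
  if (node.kind === "text") return node.value; // escaping omitted for brevity
  const attrs = node.attrs.map((a) => ` ${a.name}="${a.value}"`).join("");
  const startTag = `<${node.tagName}${attrs}>`;
  if (VOID_ELEMENTS.has(node.tagName)) return startTag; // children are ignored
  return `${startTag}${toyInner(node)}</${node.tagName}>`;
}

// A <br> that was (incorrectly) given a child, nested inside a <p>.
const br: ToyElement = {
  kind: "element",
  tagName: "br",
  attrs: [],
  children: [{ kind: "text", value: "ignored" }],
};
const p: ToyElement = {
  kind: "element",
  tagName: "p",
  attrs: [],
  children: [{ kind: "text", value: "Hi" }, br],
};

console.log(toyOuter(br)); // '<br>'
console.log(toyOuter(p)); // '<p>Hi<br></p>'
console.log(toyInner(p)); // 'Hi<br>'
```

The inner/outer split mirrors the difference between `serialize` and `serializeOuter`: the outer form wraps the inner form in the element's own tags.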
141
142
### Namespace Handling
143
144
The serializer handles XML namespaces correctly:
145
146
```typescript
147
import { parseFragment, serialize } from "parse5";
148
149
// SVG and MathML namespaces are preserved
150
const svgFragment = parseFragment(`
151
<svg xmlns="http://www.w3.org/2000/svg">
152
<circle cx="50" cy="50" r="40"/>
153
</svg>
154
`);
155
const svgSerialized = serialize(svgFragment);
156
157
// Namespace declarations and prefixes are maintained
158
const xmlFragment = parseFragment('<root xmlns:custom="http://example.com/ns"><custom:element/></root>');
159
const xmlSerialized = serialize(xmlFragment);
160
```

### Attribute Serialization

Attribute values are always written in double quotes, with special characters such as `&` and `"` escaped:

```typescript
import { parseFragment, serialize, serializeOuter } from "parse5";

// Special characters in attribute values are escaped
const fragment = parseFragment('<div title="Quote: &quot;Hello&quot;" data-value=\'Single "quotes"\'></div>');
const element = fragment.childNodes[0];
const serialized = serializeOuter(element);
console.log(serialized);
// '<div title="Quote: &quot;Hello&quot;" data-value="Single &quot;quotes&quot;"></div>'

// Boolean attributes are written with an empty value
const inputFragment = parseFragment('<input type="checkbox" checked disabled>');
const inputSerialized = serialize(inputFragment);
console.log(inputSerialized); // '<input type="checkbox" checked="" disabled="">'
```
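
The escaping applied to attribute values can be approximated as follows. This is a simplified sketch based on the HTML Standard's serialization rules, not parse5's actual code; `escapeAttributeValue` and `serializeAttribute` are hypothetical names, and the exact set of escaped characters may differ between parse5 versions.

```typescript
// Simplified sketch of attribute-value escaping; not parse5's actual code.
function escapeAttributeValue(value: string): string {
  return value
    .replace(/&/g, "&amp;") // runs first so the entities added below are not re-escaped
    .replace(/\u00a0/g, "&nbsp;")
    .replace(/"/g, "&quot;");
}

// Every attribute is written as name="value", including empty values.
function serializeAttribute(name: string, value: string): string {
  return `${name}="${escapeAttributeValue(value)}"`;
}

console.log(serializeAttribute("title", 'Quote: "Hello"')); // title="Quote: &quot;Hello&quot;"
console.log(serializeAttribute("checked", "")); // checked=""
console.log(serializeAttribute("href", "?a=1&b=2")); // href="?a=1&amp;b=2"
```

Always quoting with double quotes means single quotes in a value never need escaping, which is why only `"` appears in the replacement list.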

### Text Content Escaping

Text content is escaped on output, except inside raw-text elements such as `<script>` and `<style>`:

```typescript
import { parseFragment, serialize } from "parse5";

// Special HTML characters are escaped in text content
const fragment = parseFragment('<p>Text with &lt;script&gt; and &amp; entities</p>');
const serialized = serialize(fragment);
console.log(serialized); // '<p>Text with &lt;script&gt; and &amp; entities</p>'

// Script and style elements preserve their content
const scriptFragment = parseFragment('<script>if (x < y && y > z) { /* code */ }</script>');
const scriptSerialized = serialize(scriptFragment);
console.log(scriptSerialized); // '<script>if (x < y && y > z) { /* code */ }</script>'
```
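
The decision of when to escape a text node can be sketched as below. This is a minimal illustration following the HTML Standard's fragment serialization rules, not parse5's source; `serializeTextNode` is a hypothetical helper, and its `scriptingEnabled` parameter stands in for the option of the same name.

```typescript
// Elements whose text children are written verbatim. <noscript> joins them
// only when scripting is enabled.
const RAW_TEXT_PARENTS = new Set([
  "style", "script", "xmp", "iframe", "noembed", "noframes", "plaintext",
]);

function escapeText(text: string): string {
  return text
    .replace(/&/g, "&amp;") // runs first so the entities added below are not re-escaped
    .replace(/\u00a0/g, "&nbsp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: serializes one text node given its parent's tag name.
function serializeTextNode(
  text: string,
  parentTagName: string,
  scriptingEnabled = true
): string {
  const isRaw =
    RAW_TEXT_PARENTS.has(parentTagName) ||
    (parentTagName === "noscript" && scriptingEnabled);
  return isRaw ? text : escapeText(text);
}

console.log(serializeTextNode("1 < 2 & 3", "p")); // '1 &lt; 2 &amp; 3'
console.log(serializeTextNode("if (x < y) {}", "script")); // 'if (x < y) {}'
console.log(serializeTextNode("<b>No JS</b>", "noscript", false)); // '&lt;b&gt;No JS&lt;/b&gt;'
```

With scripting enabled, a parser treats `<noscript>` content as raw text, so escaping it on output would change its meaning; with scripting disabled the content is ordinary markup and text is escaped like anywhere else.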

### Template Element Handling

For `<template>` elements, the serializer reads children from the template's content fragment rather than from the element itself:

```typescript
import { parseFragment, serialize, serializeOuter } from "parse5";

// Template content is serialized as inner content
const templateFragment = parseFragment('<template><div>Template content</div></template>');
const templateElement = templateFragment.childNodes[0];

// serialize() on template element returns the template's inner content
const templateInner = serialize(templateElement);
console.log(templateInner); // '<div>Template content</div>'

// serializeOuter() includes the template tags
const templateOuter = serializeOuter(templateElement);
console.log(templateOuter); // '<template><div>Template content</div></template>'
```

## Common Serialization Patterns

### Round-trip Parsing and Serialization

```typescript
import { parse, serialize } from "parse5";

// Parse HTML and serialize it back; for well-formed input the result is equivalent
const originalHtml = '<!DOCTYPE html><html><head><title>Test</title></head><body><div class="content">Hello World</div></body></html>';
const document = parse(originalHtml);
const serializedHtml = serialize(document);

// The serialized HTML maintains the same structure
// (though details such as attribute quoting and implied tags are normalized)
```

### Selective Content Serialization

```typescript
import { parseFragment, serialize, defaultTreeAdapter } from "parse5";

// Parse complex structure and serialize specific parts
const complexFragment = parseFragment(`
  <article>
    <header><h1>Article Title</h1></header>
    <section class="content">
      <p>First paragraph</p>
      <p>Second paragraph</p>
    </section>
    <footer>Article footer</footer>
  </article>
`);

// Whitespace between tags becomes text nodes, so keep only element nodes
// before picking children by position
const [article] = complexFragment.childNodes.filter(defaultTreeAdapter.isElementNode);
const [, contentSection] = article.childNodes.filter(defaultTreeAdapter.isElementNode); // section.content
const contentHtml = serialize(contentSection);
// Returns only the content section's inner HTML
```

### HTML Cleaning and Transformation

```typescript
import { parse, serialize } from "parse5";

// Parse potentially malformed HTML and serialize clean output
const messyHtml = '<div><p>Unclosed paragraph<span>Nested content<div>Misplaced div</div>';
const document = parse(messyHtml);
const cleanHtml = serialize(document);
// The parser applies the HTML error-recovery rules, so the output is a complete
// document (<html>, <head>, <body>) with every element explicitly closed
```