# Turndown

Turndown is a JavaScript library that converts HTML to Markdown. It exposes a configurable `TurndownService` with options for heading style, list markers, code block formatting, link style, and emphasis delimiters. Conversion is driven by a rule-based system that plugins can extend.

## Package Information

- **Package Name**: turndown
- **Package Type**: npm
- **Language**: JavaScript
- **Installation**: `npm install turndown`

## Core Imports

```javascript
// CommonJS (Node.js) - Primary import method
const TurndownService = require('turndown');

// ES Modules (if using a bundler that supports it)
import TurndownService from 'turndown';
```

Browser usage:

```html
<script src="https://unpkg.com/turndown/dist/turndown.js"></script>
<!-- TurndownService is available as a global -->
```

UMD usage:

```javascript
// RequireJS
define(['turndown'], function(TurndownService) {
  // Use TurndownService
});
```

## Basic Usage

```javascript
const TurndownService = require('turndown');

// Default options (setext headings)
const turndownService = new TurndownService();
const markdown = turndownService.turndown('<h1>Hello world!</h1>');
console.log(markdown); // "Hello world!\n============"

// With options (separate names so the snippet runs as one script)
const atxService = new TurndownService({
  headingStyle: 'atx',
  codeBlockStyle: 'fenced'
});

const html = '<h2>Example</h2><p>Convert <strong>HTML</strong> to <em>Markdown</em></p>';
const atxMarkdown = atxService.turndown(html);
```

## Architecture

Turndown is built around several key components:

- **TurndownService**: Main class providing the conversion API and configuration management
- **Rule System**: Rule-based conversion engine that determines how each HTML element is converted
- **Plugin System**: Extension mechanism for adding custom functionality through plugins
- **Options System**: Configuration for customizing output format and style
- **Cross-platform Parsing**: HTML parsing that works in both browser and Node.js environments

## Capabilities

### TurndownService Constructor

Creates a new TurndownService instance with optional configuration.

```javascript { .api }
/**
 * TurndownService constructor
 * @param {TurndownOptions} options - Optional configuration object
 * @returns {TurndownService} New TurndownService instance
 */
function TurndownService(options)

// Can also be called without 'new'
const turndownService = TurndownService(options);
```
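
Support for calling a constructor without `new` is commonly implemented with an `instanceof` guard. The sketch below illustrates that pattern on a hypothetical `MiniService` function with made-up defaults; it shows the general technique and is not Turndown's source.

```javascript
// Hypothetical stand-in used only to illustrate the pattern; not part of Turndown.
const DEFAULTS = { headingStyle: 'setext', codeBlockStyle: 'indented' };

function MiniService(options) {
  // When called as a plain function, `this` is not a MiniService instance,
  // so re-enter through `new` and return the result.
  if (!(this instanceof MiniService)) return new MiniService(options);

  // Caller-supplied options override the defaults.
  this.options = Object.assign({}, DEFAULTS, options);
}

const withNew = new MiniService({ headingStyle: 'atx' });
const withoutNew = MiniService({ headingStyle: 'atx' });

console.log(withoutNew instanceof MiniService); // true
console.log(withoutNew.options.headingStyle);   // "atx"
console.log(withoutNew.options.codeBlockStyle); // "indented"
```

Both call styles produce equivalent instances, which is why the two forms can be used interchangeably.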

### HTML Conversion

Converts an HTML string or a DOM node to Markdown.

```javascript { .api }
/**
 * Convert HTML string or DOM node to Markdown
 * @param {string|HTMLElement|Document|DocumentFragment} input - HTML to convert
 * @returns {string} Markdown representation of the input
 */
turndown(input)
```

### Configuration Options

Options passed to the constructor control the format and style of the Markdown output.

```javascript { .api }
/**
 * TurndownService constructor options
 */
interface TurndownOptions {
  headingStyle?: 'setext' | 'atx';                         // Default: 'setext'
  hr?: string;                                             // Default: '* * *'
  bulletListMarker?: '*' | '-' | '+';                      // Default: '*'
  codeBlockStyle?: 'indented' | 'fenced';                  // Default: 'indented'
  fence?: string;                                          // Default: '```'
  emDelimiter?: '_' | '*';                                 // Default: '_'
  strongDelimiter?: '**' | '__';                           // Default: '**'
  linkStyle?: 'inlined' | 'referenced';                    // Default: 'inlined'
  linkReferenceStyle?: 'full' | 'collapsed' | 'shortcut';  // Default: 'full'
  br?: string;                                             // Default: '  ' (two spaces)
  preformattedCode?: boolean;                              // Default: false
  blankReplacement?: ReplacementFunction;                  // Custom replacement for blank elements
  keepReplacement?: ReplacementFunction;                   // Custom replacement for kept elements
  defaultReplacement?: ReplacementFunction;                // Custom replacement for unrecognized elements
}

/**
 * Replacement function signature for custom rules
 */
type ReplacementFunction = (content: string, node: HTMLElement, options: TurndownOptions) => string;
```
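
A replacement function receives the already-converted `content`, the source `node`, and the active `options`. As a minimal sketch, the hypothetical `citeReplacement` below wraps its content in whichever `emDelimiter` is configured. The plain objects standing in for `options` are assumptions made so the function can be exercised without a DOM or a Turndown instance.

```javascript
// Hypothetical replacement function: renders content as emphasis using the
// configured delimiter. `node` is unused here but is part of the signature.
function citeReplacement(content, node, options) {
  const text = content.trim();
  if (!text) return ''; // avoid emitting empty delimiters such as "__"
  return options.emDelimiter + text + options.emDelimiter;
}

// Exercise it with stand-in option objects (no Turndown instance required).
console.log(citeReplacement('Moby Dick', null, { emDelimiter: '_' })); // "_Moby Dick_"
console.log(citeReplacement('Moby Dick', null, { emDelimiter: '*' })); // "*Moby Dick*"
console.log(citeReplacement('   ', null, { emDelimiter: '_' }) === ''); // true
```

A function of this shape could serve as the `replacement` of a custom rule, for example one whose `filter` is `'cite'`.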

### Plugin System

Plugins extend conversion by registering rules or adjusting the service they are given.

```javascript { .api }
/**
 * Add one or more plugins to extend functionality
 * @param {Function|Function[]} plugin - Plugin function or array of plugin functions
 * @returns {TurndownService} TurndownService instance for chaining
 */
use(plugin)
```
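
A plugin is a function that receives the service instance. The sketch below shows how a single plugin or an array of plugins could be applied under the contract documented for `use()`. Both `applyPlugins` and the `fakeService` stand-in are hypothetical helpers written for this illustration, and the two plugins are invented examples; only the error message is taken from this document.

```javascript
// Hypothetical helper mirroring the documented contract of use():
// accept a function or an array of functions, return the service for chaining.
function applyPlugins(service, plugin) {
  if (Array.isArray(plugin)) {
    plugin.forEach((p) => applyPlugins(service, p));
  } else if (typeof plugin === 'function') {
    plugin(service);
  } else {
    throw new TypeError('plugin must be a Function or an Array of Functions');
  }
  return service;
}

// Stand-in object that only records rule keys; not a real TurndownService.
const fakeService = {
  ruleKeys: [],
  addRule(key, rule) {
    this.ruleKeys.push(key);
    return this;
  },
};

// Two invented plugins, each registering one rule.
const strikethrough = (service) =>
  service.addRule('strikethrough', {
    filter: ['del', 's'],
    replacement: (content) => '~~' + content + '~~',
  });
const highlight = (service) =>
  service.addRule('highlight', {
    filter: 'mark',
    replacement: (content) => '==' + content + '==',
  });

applyPlugins(fakeService, [strikethrough, highlight]);
console.log(fakeService.ruleKeys); // [ 'strikethrough', 'highlight' ]
```

Because each plugin only touches the instance it is handed, plugins can be grouped in an array and applied in order.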

### Element Control

Control which HTML elements are kept as HTML in the output or removed entirely.

```javascript { .api }
/**
 * Keep specified elements as HTML in the output
 * @param {string|string[]|Function} filter - Filter to match elements
 * @returns {TurndownService} TurndownService instance for chaining
 */
keep(filter)

/**
 * Remove specified elements entirely from output
 * @param {string|string[]|Function} filter - Filter to match elements
 * @returns {TurndownService} TurndownService instance for chaining
 */
remove(filter)
```
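
When a tag name is too coarse, a function filter can inspect the node itself. As a sketch, the hypothetical `isTrackingPixel` predicate below matches images declared as 1x1 pixels. It relies only on the standard DOM members `nodeName` and `getAttribute`, and is exercised with plain objects standing in for DOM elements.

```javascript
// Hypothetical predicate: true for <img> elements declared as 1x1 pixels.
function isTrackingPixel(node) {
  return (
    node.nodeName === 'IMG' &&
    node.getAttribute('width') === '1' &&
    node.getAttribute('height') === '1'
  );
}

// Minimal stand-in for a DOM element, for illustration only.
function fakeElement(nodeName, attributes) {
  return {
    nodeName,
    getAttribute: (name) => (name in attributes ? attributes[name] : null),
  };
}

console.log(isTrackingPixel(fakeElement('IMG', { width: '1', height: '1' })));     // true
console.log(isTrackingPixel(fakeElement('IMG', { width: '640', height: '480' }))); // false
console.log(isTrackingPixel(fakeElement('P', {})));                                // false
```

With a real instance, a predicate like this would be passed as the filter, for example `turndownService.remove(isTrackingPixel)`.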

### Rule System

Rules determine how each HTML element is converted to Markdown. Custom rules can be added to change or extend that behavior.

```javascript { .api }
/**
 * Add a custom conversion rule
 * @param {string} key - Unique identifier for the rule
 * @param {Object} rule - Rule object with filter and replacement properties
 * @returns {TurndownService} TurndownService instance for chaining
 */
addRule(key, rule)

/**
 * Rule object structure
 */
interface Rule {
  filter: string | string[] | Function; // Selector for HTML elements
  replacement: Function;                // Function to convert element to Markdown
}
```

[Rule System](./rules.md)
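
The three accepted filter forms can be understood through a simplified matcher. The sketch below is an illustration, not Turndown's actual source: `matchesFilter` and `findRule` are hypothetical names, rules are assumed to be checked in array order with the first match winning, and tag names are assumed to be compared in lowercase.

```javascript
// Simplified illustration of rule selection; names are hypothetical.
function matchesFilter(rule, node, options) {
  const filter = rule.filter;
  const tag = node.nodeName.toLowerCase();
  if (typeof filter === 'string') return filter === tag;
  if (Array.isArray(filter)) return filter.includes(tag);
  if (typeof filter === 'function') return Boolean(filter(node, options));
  throw new TypeError('`filter` needs to be a string, array, or function');
}

// Return the first rule whose filter matches, or undefined if none does.
function findRule(rules, node, options) {
  return rules.find((rule) => matchesFilter(rule, node, options));
}

const rules = [
  { filter: 'mark', replacement: (content) => '==' + content + '==' },
  { filter: ['del', 's'], replacement: (content) => '~~' + content + '~~' },
  {
    filter: (node) => node.nodeName === 'SPAN' && node.className === 'kbd',
    replacement: (content) => '<kbd>' + content + '</kbd>',
  },
];

const node = { nodeName: 'S', className: '' }; // stand-in for a DOM element
const rule = findRule(rules, node, {});
console.log(rule.replacement('old price', node, {})); // "~~old price~~"
```

The string and array forms cover the common case of matching by tag name, while the function form allows matching on attributes or on the active options.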
182
183
### Markdown Escaping
184
185
Utility for escaping Markdown special characters to prevent unwanted formatting.
186
187
```javascript { .api }
188
/**
189
* Escape Markdown special characters with backslashes
190
* @param {string} string - String to escape
191
* @returns {string} String with Markdown syntax escaped
192
*/
193
escape(string)
194
```
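
Backslash escaping of this kind can be approximated with an ordered list of regular-expression replacements. The sketch below is a deliberately reduced, hypothetical `escapeMarkdown` covering only a few characters. Its pattern list is an assumption made for illustration and should not be read as Turndown's full escape table.

```javascript
// Reduced, illustrative escape table: [pattern, replacement] pairs applied in order.
const ESCAPES = [
  [/\\/g, '\\\\'],          // backslashes first, so later escapes are not doubled
  [/\*/g, '\\*'],           // emphasis markers
  [/_/g, '\\_'],
  [/`/g, '\\`'],            // inline code
  [/\[/g, '\\['],           // link brackets
  [/\]/g, '\\]'],
  [/^(#{1,6}) /g, '\\$1 '], // leading ATX heading marker
  [/^(\d+)\. /g, '$1\\. '], // leading ordered-list marker
];

function escapeMarkdown(text) {
  return ESCAPES.reduce(
    (escaped, [pattern, replacement]) => escaped.replace(pattern, replacement),
    text
  );
}

console.log(escapeMarkdown('1. Not a list'));      // 1\. Not a list
console.log(escapeMarkdown('snake_case *stars*')); // snake\_case \*stars\*
console.log(escapeMarkdown('# not a heading'));    // \# not a heading
```

Order matters in a table like this: escaping backslashes first keeps the backslashes introduced by later replacements from being escaped again.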

## TurndownService Class Interface

Complete interface definition for the TurndownService class.

```javascript { .api }
/**
 * TurndownService class definition
 */
interface TurndownService {
  /** Service configuration options */
  options: TurndownOptions;

  /** Rules collection instance */
  rules: Rules;

  /** Convert HTML to Markdown */
  turndown(input: string | HTMLElement | Document | DocumentFragment): string;

  /** Add one or more plugins */
  use(plugin: Function | Function[]): TurndownService;

  /** Add a custom conversion rule */
  addRule(key: string, rule: Rule): TurndownService;

  /** Keep elements as HTML */
  keep(filter: string | string[] | Function): TurndownService;

  /** Remove elements entirely */
  remove(filter: string | string[] | Function): TurndownService;

  /** Escape Markdown special characters */
  escape(string: string): string;
}

/**
 * Internal Rules class (used internally by TurndownService)
 */
interface Rules {
  options: TurndownOptions;
  array: Rule[];
  blankRule: Rule;
  keepReplacement: ReplacementFunction;
  defaultRule: Rule;

  add(key: string, rule: Rule): void;
  keep(filter: string | string[] | Function): void;
  remove(filter: string | string[] | Function): void;
  forNode(node: HTMLElement): Rule;
  forEach(fn: (rule: Rule, index: number) => void): void;
}
```

## Usage Examples

### Basic HTML Conversion

```javascript
const turndownService = new TurndownService();

// Convert HTML string
const markdown = turndownService.turndown('<p>Hello <strong>world</strong>!</p>');
// Result: "Hello **world**!"

// Convert DOM node (browser only: requires a global `document`)
const element = document.getElementById('content');
const markdownFromNode = turndownService.turndown(element);
```

### Custom Configuration

```javascript
const turndownService = new TurndownService({
  headingStyle: 'atx',
  hr: '---',
  bulletListMarker: '-',
  codeBlockStyle: 'fenced',
  fence: '~~~',
  emDelimiter: '*',
  strongDelimiter: '__',
  linkStyle: 'referenced'
});

const html = `
  <h1>Title</h1>
  <ul>
    <li>Item 1</li>
    <li>Item 2</li>
  </ul>
  <pre><code>console.log('hello');</code></pre>
`;

const markdown = turndownService.turndown(html);
```

### Element Control

```javascript
const turndownService = new TurndownService();

// Keep certain elements as HTML
turndownService.keep(['del', 'ins']);
const result1 = turndownService.turndown('<p>Hello <del>world</del><ins>World</ins></p>');
// Result: "Hello <del>world</del><ins>World</ins>"

// Remove elements entirely
turndownService.remove('script');
const result2 = turndownService.turndown('<p>Content</p><script>alert("hi")</script>');
// Result: "Content"
```

### Plugin Usage

```javascript
// Define a plugin
function customPlugin(turndownService) {
  turndownService.addRule('strikethrough', {
    filter: ['del', 's', 'strike'],
    replacement: function(content) {
      return '~~' + content + '~~';
    }
  });
}

// Use the plugin
const turndownService = new TurndownService();
turndownService.use(customPlugin);

const result = turndownService.turndown('<p>This is <del>deleted</del> text</p>');
// Result: "This is ~~deleted~~ text"
```

## Error Handling

Turndown throws a `TypeError` for invalid inputs:

- **TypeError**: When the input to `turndown()` is not a string or a valid DOM node
- **TypeError**: When the plugin passed to `use()` is not a function or an array of functions
- **TypeError**: When a rule's `filter` is not a string, array, or function (raised when the rule is evaluated during conversion)

```javascript { .api }
/**
 * Error types thrown by Turndown.
 * The property names are descriptive labels, not exported classes:
 * each case is a plain TypeError with the message shown.
 */
interface TurndownErrors {
  /** Thrown when turndown() receives invalid input */
  InvalidInputError: TypeError; // "{input} is not a string, or an element/document/fragment node."

  /** Thrown when use() receives invalid plugin */
  InvalidPluginError: TypeError; // "plugin must be a Function or an Array of Functions"

  /** Thrown when rule filter is invalid */
  InvalidFilterError: TypeError; // "`filter` needs to be a string, array, or function"
}
```

**Usage Examples:**

```javascript
const turndownService = new TurndownService();

// Invalid input to turndown()
try {
  turndownService.turndown(null);
} catch (error) {
  console.error(error.message); // "null is not a string, or an element/document/fragment node."
}

// Invalid plugin
try {
  turndownService.use("invalid");
} catch (error) {
  console.error(error.message); // "plugin must be a Function or an Array of Functions"
}

// Invalid rule filter: the filter is checked when the rule is evaluated,
// so the error surfaces during conversion rather than in addRule()
try {
  turndownService.addRule('test', { filter: 123, replacement: () => '' });
  turndownService.turndown('<p>text</p>');
} catch (error) {
  console.error(error.message); // "`filter` needs to be a string, array, or function"
}
```

## Browser and Node.js Support

Turndown works in both browser and Node.js environments:

- **Browser**: Uses the native `DOMParser`, with fallback implementations (ActiveX for older IE)
- **Node.js**: Uses the domino library for DOM parsing
- **Build Targets**: Available as CommonJS, ES modules, UMD, and IIFE formats