A library that converts HTML to Markdown
npx @tessl/cli install tessl/npm-turndown@7.2.0

Turndown is a JavaScript library that converts HTML to Markdown. It provides a configurable service with extensive customization options for heading styles, list markers, code block formatting, link styles, and text emphasis delimiters. The library features a rule-based conversion system and plugin architecture for extending functionality.

npm install turndown

// CommonJS (Node.js) - Primary import method
const TurndownService = require('turndown');
// ES Modules (if using bundler that supports it)
import TurndownService from 'turndown';

Browser usage:
<script src="https://unpkg.com/turndown/dist/turndown.js"></script>
<!-- TurndownService is available as a global -->

UMD usage:
// RequireJS
define(['turndown'], function(TurndownService) {
  // Use TurndownService
});

const TurndownService = require('turndown');
const turndownService = new TurndownService();
const markdown = turndownService.turndown('<h1>Hello world!</h1>');
console.log(markdown); // "Hello world!\n============"
// With options

// With options (distinct names avoid redeclaring the constants above)
const atxService = new TurndownService({
  headingStyle: 'atx',
  codeBlockStyle: 'fenced'
});
const html = '<h2>Example</h2><p>Convert <strong>HTML</strong> to <em>Markdown</em></p>';
const atxMarkdown = atxService.turndown(html);

Turndown is built around several key components, covered in turn below.
Creates a new TurndownService instance with optional configuration.
/**
* TurndownService constructor
* @param {TurndownOptions} options - Optional configuration object
* @returns {TurndownService} New TurndownService instance
*/
function TurndownService(options)
// Can also be called without 'new'
const turndownService = TurndownService(options);

Core HTML to Markdown conversion functionality with support for all standard HTML elements and DOM nodes.
/**
* Convert HTML string or DOM node to Markdown
* @param {string|HTMLElement|Document|DocumentFragment} input - HTML to convert
* @returns {string} Markdown representation of the input
*/
turndown(input)

Comprehensive configuration system for customizing Markdown output format and style.
/**
* TurndownService constructor options
*/
interface TurndownOptions {
  headingStyle?: 'setext' | 'atx'; // Default: 'setext'
  hr?: string; // Default: '* * *'
  bulletListMarker?: '*' | '-' | '+'; // Default: '*'
  codeBlockStyle?: 'indented' | 'fenced'; // Default: 'indented'
  fence?: string; // Default: '```'
  emDelimiter?: '_' | '*'; // Default: '_'
  strongDelimiter?: '**' | '__'; // Default: '**'
  linkStyle?: 'inlined' | 'referenced'; // Default: 'inlined'
  linkReferenceStyle?: 'full' | 'collapsed' | 'shortcut'; // Default: 'full'
  br?: string; // Default: '  ' (two spaces)
  preformattedCode?: boolean; // Default: false
  blankReplacement?: ReplacementFunction; // Custom replacement for blank elements
  keepReplacement?: ReplacementFunction; // Custom replacement for kept elements
  defaultReplacement?: ReplacementFunction; // Custom replacement for unrecognized elements
}
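
As a rough sketch of how the defaults listed above combine with user-supplied options, the snippet below merges a partial options object over a defaults object. `DEFAULTS` (a subset of the documented defaults) and `resolveOptions` are hypothetical names used only for illustration; they are not part of Turndown's API, and the library's internal merging may differ.

```javascript
// Hypothetical illustration: a subset of the default values documented above.
const DEFAULTS = {
  headingStyle: 'setext',
  hr: '* * *',
  bulletListMarker: '*',
  codeBlockStyle: 'indented',
  fence: '`'.repeat(3), // three backticks
  emDelimiter: '_',
  strongDelimiter: '**',
  linkStyle: 'inlined',
  linkReferenceStyle: 'full',
  preformattedCode: false
};

// Hypothetical helper: user-supplied keys win, everything else keeps its default.
function resolveOptions(userOptions = {}) {
  return Object.assign({}, DEFAULTS, userOptions);
}

const resolved = resolveOptions({ headingStyle: 'atx', fence: '~~~' });
console.log(resolved.headingStyle); // atx (overridden)
console.log(resolved.hr);           // * * * (default kept)
```

The practical consequence is that only the options that differ from the defaults need to be passed to the constructor.
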
/**
* Replacement function signature for custom rules
*/
type ReplacementFunction = (content: string, node: HTMLElement, options: TurndownOptions) => string;

Add custom functionality and extend conversion capabilities through plugins.
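
To make the plugin idea concrete, here is a simplified model of a service whose `use` method accepts either one plugin function or an array of them and calls each with the service instance. `MiniService` is a hypothetical stand-in written for this sketch, not Turndown's source; only the accepted argument shapes, the chainable return value, and the error message are taken from this document.

```javascript
// Hypothetical stand-in for a service with a Turndown-style use() method.
class MiniService {
  constructor() {
    this.ruleKeys = []; // records which rules the plugins registered
  }

  addRule(key, rule) {
    this.ruleKeys.push(key);
    return this; // chainable
  }

  use(plugin) {
    if (Array.isArray(plugin)) {
      // Apply each plugin in order.
      plugin.forEach((p) => this.use(p));
    } else if (typeof plugin === 'function') {
      // A plugin is a function that receives the service instance.
      plugin(this);
    } else {
      throw new TypeError('plugin must be a Function or an Array of Functions');
    }
    return this;
  }
}

const strikethrough = (service) =>
  service.addRule('strikethrough', { filter: ['del', 's'], replacement: (c) => `~~${c}~~` });
const highlight = (service) =>
  service.addRule('highlight', { filter: 'mark', replacement: (c) => `==${c}==` });

const service = new MiniService().use([strikethrough, highlight]);
console.log(service.ruleKeys); // [ 'strikethrough', 'highlight' ]
```

Because a plugin only needs the service instance, it can bundle several `addRule`, `keep`, or `remove` calls into one reusable function.
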
/**
* Add one or more plugins to extend functionality
* @param {Function|Function[]} plugin - Plugin function or array of plugin functions
* @returns {TurndownService} TurndownService instance for chaining
*/
use(plugin)

Control which HTML elements are kept as HTML, removed entirely, or converted with custom rules.
/**
* Keep specified elements as HTML in the output
* @param {string|string[]|Function} filter - Filter to match elements
* @returns {TurndownService} TurndownService instance for chaining
*/
keep(filter)
/**
* Remove specified elements entirely from output
* @param {string|string[]|Function} filter - Filter to match elements
* @returns {TurndownService} TurndownService instance for chaining
*/
remove(filter)

Extensible rule-based conversion system for customizing how HTML elements are converted to Markdown.
/**
* Add a custom conversion rule
* @param {string} key - Unique identifier for the rule
* @param {Object} rule - Rule object with filter and replacement properties
* @returns {TurndownService} TurndownService instance for chaining
*/
addRule(key, rule)
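
The rule mechanism can be pictured roughly as follows: each rule pairs a `filter` with a `replacement`, and the first rule whose filter matches a node produces that node's Markdown. The sketch below is a simplified model under that assumption. `matchesFilter`, `convertNode`, and the plain-object nodes are hypothetical and are not Turndown's actual internals.

```javascript
// Hypothetical helper: decide whether a rule's filter matches a node.
function matchesFilter(filter, node) {
  const name = node.nodeName.toLowerCase();
  if (typeof filter === 'string') return filter === name;
  if (Array.isArray(filter)) return filter.includes(name);
  if (typeof filter === 'function') return Boolean(filter(node));
  throw new TypeError('`filter` needs to be a string, array, or function');
}

// Hypothetical helper: use the first matching rule, else return the content unchanged.
function convertNode(rules, content, node, options = {}) {
  const rule = rules.find((r) => matchesFilter(r.filter, node));
  return rule ? rule.replacement(content, node, options) : content;
}

const rules = [
  { filter: ['del', 's', 'strike'], replacement: (content) => `~~${content}~~` },
  { filter: 'mark', replacement: (content) => `==${content}==` },
  {
    filter: (node) => node.nodeName === 'SPAN' && node.className === 'note',
    replacement: (content) => `(${content})`
  }
];

// Plain objects stand in for DOM nodes in this sketch.
console.log(convertNode(rules, 'gone', { nodeName: 'DEL' }));                      // ~~gone~~
console.log(convertNode(rules, 'aside', { nodeName: 'SPAN', className: 'note' })); // (aside)
console.log(convertNode(rules, 'plain', { nodeName: 'P' }));                       // plain
```

A function filter is the most flexible of the three forms, since it can inspect attributes and not just the tag name.
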
/**
* Rule object structure
*/
interface Rule {
  filter: string | string[] | Function; // Selector for HTML elements
  replacement: Function; // Function to convert element to Markdown
}

Utility for escaping Markdown special characters to prevent unwanted formatting.
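
As an illustration of the idea, the sketch below backslash-escapes a small set of characters that Markdown would otherwise interpret as formatting. `escapeMarkdownSketch` and its pattern list are hypothetical and deliberately incomplete; Turndown's own `escape` method covers more cases and may behave differently.

```javascript
// Hypothetical, deliberately small set of [pattern, replacement] pairs.
const ESCAPE_PATTERNS = [
  [/\\/g, '\\\\'],          // backslashes first, so later escapes are not doubled
  [/\*/g, '\\*'],           // emphasis or bullet marker
  [/_/g, '\\_'],            // emphasis
  [/`/g, '\\`'],            // inline code
  [/\[/g, '\\['],           // link text open
  [/\]/g, '\\]'],           // link text close
  [/^(#{1,6}) /g, '\\$1 ']  // heading marker at the start of the string
];

function escapeMarkdownSketch(text) {
  return ESCAPE_PATTERNS.reduce(
    (result, [pattern, replacement]) => result.replace(pattern, replacement),
    text
  );
}

console.log(escapeMarkdownSketch('1 * 2 = 2'));       // 1 \* 2 = 2
console.log(escapeMarkdownSketch('snake_case'));      // snake\_case
console.log(escapeMarkdownSketch('# not a heading')); // \# not a heading
```

Escaping the backslash before anything else matters: if it ran last, it would double the backslashes that the other patterns had just inserted.
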
/**
* Escape Markdown special characters with backslashes
* @param {string} string - String to escape
* @returns {string} String with Markdown syntax escaped
*/
escape(string)

Complete interface definition for the TurndownService class.
/**
* TurndownService class definition
*/
interface TurndownService {
  /** Service configuration options */
  options: TurndownOptions;
  /** Rules collection instance */
  rules: Rules;
  /** Convert HTML to Markdown */
  turndown(input: string | HTMLElement | Document | DocumentFragment): string;
  /** Add one or more plugins */
  use(plugin: Function | Function[]): TurndownService;
  /** Add a custom conversion rule */
  addRule(key: string, rule: Rule): TurndownService;
  /** Keep elements as HTML */
  keep(filter: string | string[] | Function): TurndownService;
  /** Remove elements entirely */
  remove(filter: string | string[] | Function): TurndownService;
  /** Escape Markdown special characters */
  escape(string: string): string;
}
/**
* Internal Rules class (used internally by TurndownService)
*/
interface Rules {
  options: TurndownOptions;
  array: Rule[];
  blankRule: Rule;
  keepReplacement: ReplacementFunction;
  defaultRule: Rule;
  add(key: string, rule: Rule): void;
  keep(filter: string | string[] | Function): void;
  remove(filter: string | string[] | Function): void;
  forNode(node: HTMLElement): Rule;
  forEach(fn: (rule: Rule, index: number) => void): void;
}

const turndownService = new TurndownService();
// Convert HTML string
const markdown = turndownService.turndown('<p>Hello <strong>world</strong>!</p>');
// Result: "Hello **world**!"

// Convert DOM node (requires a browser or other DOM environment)
const element = document.getElementById('content');
const elementMarkdown = turndownService.turndown(element);

const turndownService = new TurndownService({
  headingStyle: 'atx',
  hr: '---',
  bulletListMarker: '-',
  codeBlockStyle: 'fenced',
  fence: '~~~',
  emDelimiter: '*',
  strongDelimiter: '__',
  linkStyle: 'referenced'
});
const html = `
<h1>Title</h1>
<ul>
<li>Item 1</li>
<li>Item 2</li>
</ul>
<pre><code>console.log('hello');</code></pre>
`;
const markdown = turndownService.turndown(html);

const turndownService = new TurndownService();
// Keep certain elements as HTML
turndownService.keep(['del', 'ins']);
const result1 = turndownService.turndown('<p>Hello <del>world</del><ins>World</ins></p>');
// Result: "Hello <del>world</del><ins>World</ins>"
// Remove elements entirely
turndownService.remove('script');
const result2 = turndownService.turndown('<p>Content</p><script>alert("hi")</script>');
// Result: "Content"

// Define a plugin
function customPlugin(turndownService) {
  turndownService.addRule('strikethrough', {
    filter: ['del', 's', 'strike'],
    replacement: function(content) {
      return '~~' + content + '~~';
    }
  });
}
// Use the plugin
const turndownService = new TurndownService();
turndownService.use(customPlugin);
const result = turndownService.turndown('<p>This is <del>deleted</del> text</p>');
// Result: "This is ~~deleted~~ text"

Turndown throws specific errors for invalid inputs: a TypeError when the input to turndown() is not a string or valid DOM node, and a TypeError when the argument to use() is not a function or array of functions.

/**
 * Error types thrown by Turndown
 */
interface TurndownErrors {
  /** Thrown when turndown() receives invalid input */
  InvalidInputError: TypeError; // "{input} is not a string, or an element/document/fragment node."
  /** Thrown when use() receives invalid plugin */
  InvalidPluginError: TypeError; // "plugin must be a Function or an Array of Functions"
  /** Thrown when rule filter is invalid */
  InvalidFilterError: TypeError; // "`filter` needs to be a string, array, or function"
}

Usage Examples:
const turndownService = new TurndownService();

// Invalid input to turndown()
try {
  turndownService.turndown(null);
} catch (error) {
  console.error(error.message); // "null is not a string, or an element/document/fragment node."
}

// Invalid plugin
try {
  turndownService.use("invalid");
} catch (error) {
  console.error(error.message); // "plugin must be a Function or an Array of Functions"
}

// Invalid rule filter: the filter is checked when a node is matched during
// conversion, not when the rule is added, so a turndown() call is needed
try {
  turndownService.addRule('test', { filter: 123, replacement: () => '' });
  turndownService.turndown('<p>text</p>');
} catch (error) {
  console.error(error.message); // "`filter` needs to be a string, array, or function"
}

Turndown works in both browser and Node.js environments: