A very fast HTML parser, generating a simplified DOM, with basic element query support.
Core HTML parsing functionality that converts HTML strings into manipulable DOM trees with comprehensive configuration options for different parsing scenarios.
Main parsing function that converts HTML strings to DOM trees with optional configuration.
/**
 * Parses HTML and returns a root element containing the DOM tree
 * @param data - HTML string to parse
 * @param options - Optional parsing configuration
 * @returns Root HTMLElement containing parsed DOM
 */
function parse(data: string, options?: Partial<Options>): HTMLElement;

Usage Examples:
import { parse } from "node-html-parser";
// Basic parsing
const root = parse('<div>Hello World</div>');
// With parsing options
const rootWithOptions = parse('<div>Content</div>', {
  lowerCaseTagName: true,
  comment: true,
  voidTag: {
    closingSlash: true
  }
});
// Parse complex HTML
const html = `
<html>
<head><title>Test</title></head>
<body>
<div class="container">
<p>Paragraph content</p>
<!-- This is a comment -->
</div>
</body>
</html>`;
const document = parse(html, { comment: true });

Checks whether an HTML string is well formed, meaning parsing finishes with no element left unclosed. Multiple top-level elements are fine as long as each one is closed.
/**
 * Checks HTML structure by parsing it and testing whether any element is left unclosed
 * @param data - HTML string to validate
 * @param options - Optional parsing configuration
 * @returns true if every opened element was closed, false otherwise
 */
function valid(data: string, options?: Partial<Options>): boolean;

Usage Examples:
import { valid } from "node-html-parser";
// Well-formed HTML (every element closed)
console.log(valid('<div><p>Content</p></div>')); // true
// Sibling top-level elements are also well formed
console.log(valid('<div>First</div><div>Second</div>')); // true
// Unclosed element
console.log(valid('<div>Content')); // false
// With options
console.log(valid('<DIV>Content</DIV>', { lowerCaseTagName: true })); // true

Comprehensive configuration interface for customizing parsing behavior.
interface Options {
  /** Convert all tag names to lowercase */
  lowerCaseTagName?: boolean;
  /** Parse and include comment nodes in the DOM tree */
  comment?: boolean;
  /** Fix nested anchor tags by properly closing them */
  fixNestedATags?: boolean;
  /** Parse tags that don't have closing tags */
  parseNoneClosedTags?: boolean;
  /** Define which elements should preserve their text content as-is */
  blockTextElements?: { [tag: string]: boolean };
  /** Void element configuration */
  voidTag?: {
    /** Custom list of void elements (defaults to HTML5 void elements) */
    tags?: string[];
    /** Add closing slash to void elements (e.g., <br/>) */
    closingSlash?: boolean;
  };
}

Default Values:
// Default blockTextElements (when not specified)
{
  script: true,
  noscript: true,
  style: true,
  pre: true
}
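One reading of "when not specified" is that a caller-supplied `blockTextElements` map replaces the default map above instead of being merged into it. The sketch below illustrates that reading with a hypothetical `resolveBlockTextElements` helper. It is not part of node-html-parser, and the replace-not-merge behaviour is an assumption worth checking against the library version in use.

```typescript
// Hypothetical helper for illustration; not part of node-html-parser.
type BlockTextElements = { [tag: string]: boolean };

// Copied from the documented default blockTextElements.
const DEFAULT_BLOCK_TEXT_ELEMENTS: BlockTextElements = {
  script: true,
  noscript: true,
  style: true,
  pre: true,
};

function resolveBlockTextElements(custom?: BlockTextElements): BlockTextElements {
  // Assumption: a supplied map replaces the defaults wholesale (no merge).
  return custom ?? DEFAULT_BLOCK_TEXT_ELEMENTS;
}

const resolvedDefault = resolveBlockTextElements();
const resolvedCustom = resolveBlockTextElements({ code: true });

console.log(Object.keys(resolvedDefault)); // [ 'script', 'noscript', 'style', 'pre' ]
console.log("pre" in resolvedCustom);      // false
```

Under this reading, a custom map that should still cover `pre` or `script` has to list them explicitly.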
// Default void elements (HTML5 standard)
['area', 'base', 'br', 'col', 'embed', 'hr', 'img', 'input', 'link', 'meta', 'param', 'source', 'track', 'wbr']

Configuration Examples:
import { parse } from "node-html-parser";
// Preserve original case
const root = parse('<DIV>Content</DIV>', {
  lowerCaseTagName: false
});
// Include comments in parsing
const withComments = parse('<!-- comment --><div>content</div>', {
  comment: true
});
// Custom void elements with closing slashes
const customVoid = parse('<custom-void></custom-void>', {
  voidTag: {
    tags: ['custom-void'],
    closingSlash: true
  }
});
// Custom block text elements
const customBlocks = parse('<code>preserved content</code>', {
  blockTextElements: {
    code: true,
    pre: true
  }
});

The parse function exposes additional utilities as static properties:
// Access to internal classes and utilities
parse.HTMLElement: typeof HTMLElement;
parse.Node: typeof Node;
parse.TextNode: typeof TextNode;
parse.CommentNode: typeof CommentNode;
parse.NodeType: typeof NodeType;
parse.valid: typeof valid;
parse.parse: typeof baseParse; // Internal parsing function

Usage:
import { parse } from "node-html-parser";
// Create elements directly
const element = new parse.HTMLElement('div', {}, '');
// Check node types
const node = parse('<p>text</p>').childNodes[0];
if (node.nodeType === parse.NodeType.ELEMENT_NODE) {
  // Handle element node
}
// Use validation
const isValid = parse.valid('<div>content</div>');
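
A function that also carries members, like `parse.valid` above, can be expressed in plain TypeScript by assigning properties to a function declaration. The sketch below uses toy stand-ins (`toyParse`, `toyValid`, `ToyRoot`, and `kinds` are all hypothetical names). It shows the general pattern only and makes no claim about how node-html-parser wires up its own exports.

```typescript
// Toy stand-ins for illustration only; none of this is node-html-parser code.
interface ToyRoot {
  source: string;
}

function toyValid(data: string): boolean {
  // Placeholder rule: any non-blank input counts as valid.
  return data.trim().length > 0;
}

function toyParse(data: string): ToyRoot {
  return { source: data };
}

// Property assignments on a function declaration: TypeScript adds these
// members to the function's type, so `toyParse.valid` type-checks.
toyParse.valid = toyValid;
toyParse.kinds = { ELEMENT: "element", TEXT: "text" };

const toyRoot = toyParse("<div>content</div>");
console.log(toyRoot.source);         // <div>content</div>
console.log(toyParse.valid("   "));  // false
console.log(toyParse.kinds.ELEMENT); // element
```

The benefit of this shape is that a single import gives access to both the main function and its helpers, at the cost of the helpers being less visible than separate named exports.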