tessl/npm-node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.

Workspace: tessl
Visibility: Public
Describes: npmpkg:npm/node-html-parser@7.0.x

To install, run

npx @tessl/cli install tessl/npm-node-html-parser@7.0.0

# Node HTML Parser

Node HTML Parser is a very fast HTML parser that generates a simplified DOM tree with comprehensive element query support. Designed for high performance when processing large HTML files, it offers a complete API for parsing HTML strings, querying elements using CSS selectors, manipulating DOM structures, and serializing back to HTML.

## Package Information

- **Package Name**: node-html-parser
- **Package Type**: npm
- **Language**: TypeScript/JavaScript
- **Installation**: `npm install node-html-parser`

## Core Imports

```typescript
import { parse } from "node-html-parser";
```

To import the node classes and helpers as well:

```typescript
import { parse, HTMLElement, TextNode, CommentNode, NodeType, valid } from "node-html-parser";
```

For CommonJS:

```javascript
const { parse } = require("node-html-parser");
```

## Basic Usage

```typescript
import { parse } from "node-html-parser";

// Parse HTML string
const root = parse('<ul id="list"><li>Hello World</li></ul>');

// Query elements (querySelector may return null, hence the optional chaining)
const listItem = root.querySelector('li');
console.log(listItem?.text); // "Hello World"

// Manipulate DOM
const newLi = parse('<li>New Item</li>');
root.appendChild(newLi);

// Access attributes
const list = root.querySelector('#list');
console.log(list?.id); // "list"

// Convert back to HTML
console.log(root.toString());
```

## Architecture

Node HTML Parser is built around several key components:

- **Parse Function**: Main entry point for converting HTML strings to DOM trees
- **DOM Classes**: HTMLElement, TextNode, and CommentNode classes providing web-standard APIs
- **Query Engine**: CSS selector support via css-select integration for powerful element queries
- **Performance Focus**: Optimized for speed over strict HTML specification compliance
- **Simplified DOM**: Lightweight DOM structure for efficient processing of large HTML files

62

63

## Capabilities

64

65

### HTML Parsing

66

67

Core HTML parsing functionality that converts HTML strings into manipulable DOM trees with configurable parsing options.

68

69

```typescript { .api }
function parse(data: string, options?: Partial<Options>): HTMLElement;
```

72

73

[HTML Parsing](./parsing.md)
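
As a rough illustration of how parsing options change the resulting tree, the sketch below parses the same markup with and without `comment: true`. It assumes the library's default of discarding comments and reads `nodeType`, a property not listed in this overview; `countComments` is a hypothetical helper introduced only for this example.

```typescript
import { parse, HTMLElement, NodeType } from "node-html-parser";

// Hypothetical helper: count comment nodes directly under an element.
function countComments(element: HTMLElement): number {
  return element.childNodes.filter(
    (node) => node.nodeType === NodeType.COMMENT_NODE
  ).length;
}

const markup = "<div><!-- draft --><p>Hi</p></div>";

// Same input, parsed once with default options and once keeping comments.
const plainDiv = parse(markup).querySelector("div");
const commentedDiv = parse(markup, { comment: true }).querySelector("div");

if (plainDiv && commentedDiv) {
  console.log(countComments(plainDiv));
  console.log(countComments(commentedDiv));
}
```

Keeping comments is opt-in here, so code that needs to round-trip markup exactly would likely want to pass the option explicitly.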

74

75

### DOM Elements

76

77

Complete HTMLElement implementation with DOM manipulation methods, property access, and web-standard APIs for content modification.

78

79

```typescript { .api }
class HTMLElement extends Node {
  // Properties
  tagName: string;
  id: string;
  classList: DOMTokenList;
  innerHTML: string;
  textContent: string;

  // Methods
  appendChild<T extends Node>(node: T): T;
  querySelector(selector: string): HTMLElement | null;
  getAttribute(key: string): string | undefined;
  setAttribute(key: string, value: string): HTMLElement;
}
```

95

96

[DOM Elements](./dom-elements.md)
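
A minimal sketch of how these members might be combined to edit a parsed tree. The markup and variable names are made up for illustration, and `firstChild` and `outerHTML` are assumed to be available even though they are not listed in the summary above.

```typescript
import { parse } from "node-html-parser";

const cardRoot = parse('<div id="card"><p>Draft</p></div>');
const card = cardRoot.querySelector("#card");
if (!card) throw new Error("expected #card in the sample markup");

// Attribute and class changes go through the element API.
card.setAttribute("data-state", "ready");
card.classList.add("highlight");

// Replace the paragraph's text content.
const paragraph = card.querySelector("p");
if (paragraph) paragraph.textContent = "Published";

// parse() returns a wrapper root, so append its first child
// rather than the wrapper itself.
const footer = parse("<footer>v1</footer>").firstChild;
if (footer) card.appendChild(footer);

console.log(card.outerHTML);
```

Appending the parsed fragment's first child keeps the new element directly under the target, which is usually what a caller expects.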

97

98

### Node Types

99

100

Base Node classes and node type system including TextNode and CommentNode for complete DOM tree representation.

101

102

```typescript { .api }
abstract class Node {
  childNodes: Node[];
  parentNode: HTMLElement | null;
  textContent: string;
  remove(): Node;
}

class TextNode extends Node {
  text: string;
  rawText: string;
  isWhitespace: boolean;
}

class CommentNode extends Node {
  rawText: string;
}

enum NodeType {
  ELEMENT_NODE = 1,
  TEXT_NODE = 3,
  COMMENT_NODE = 8
}
```

126

127

[Node Types](./node-types.md)
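
To show how the three node classes appear together in one tree, here is a small sketch that walks `childNodes` recursively and tallies each kind. `tallyNodes` and the `NodeTally` shape are hypothetical names for this example. It uses `instanceof` rather than the `NodeType` enum so that TypeScript can narrow each branch.

```typescript
import { parse, HTMLElement, TextNode, CommentNode } from "node-html-parser";

interface NodeTally {
  elements: number;
  texts: number;
  comments: number;
}

// Hypothetical helper: recursively count node kinds below an element.
function tallyNodes(
  element: HTMLElement,
  tally: NodeTally = { elements: 0, texts: 0, comments: 0 }
): NodeTally {
  for (const child of element.childNodes) {
    if (child instanceof HTMLElement) {
      tally.elements += 1;
      tallyNodes(child, tally);
    } else if (child instanceof TextNode) {
      // Skip whitespace-only text between tags.
      if (!child.isWhitespace) tally.texts += 1;
    } else if (child instanceof CommentNode) {
      tally.comments += 1;
    }
  }
  return tally;
}

const tallyRoot = parse("<div><!-- c --><p>One</p> <p>Two</p></div>", {
  comment: true,
});
const tally = tallyNodes(tallyRoot);
console.log(tally);
```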

128

129

### Query & Selection

130

131

Powerful element querying capabilities using CSS selectors, tag names, IDs, and DOM traversal methods.

132

133

```typescript { .api }
// CSS selector queries
querySelector(selector: string): HTMLElement | null;
querySelectorAll(selector: string): HTMLElement[];

// Element queries
getElementsByTagName(tagName: string): HTMLElement[];
getElementById(id: string): HTMLElement | null;
closest(selector: string): HTMLElement | null;
```

143

144

[Query & Selection](./query-selection.md)
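
The query methods above could be exercised along these lines. The navigation markup is invented for the example, and the selectors are ordinary CSS that css-select is expected to handle.

```typescript
import { parse } from "node-html-parser";

const navRoot = parse(
  '<nav id="menu"><ul>' +
    '<li class="item active"><a href="/home">Home</a></li>' +
    '<li class="item"><a href="/docs">Docs</a></li>' +
    "</ul></nav>"
);

// CSS selector queries: all links that are direct children of list items.
const hrefs = navRoot
  .querySelectorAll("li.item > a")
  .map((link) => link.getAttribute("href"));

// Element queries by tag name and by id.
const itemCount = navRoot.getElementsByTagName("li").length;
const menu = navRoot.getElementById("menu");

// Upward traversal: from the active link to its enclosing <nav>.
const activeLink = navRoot.querySelector("li.active a");
const owningNav = activeLink?.closest("nav") ?? null;

console.log(hrefs, itemCount, menu?.id, owningNav?.id);
```

`querySelector`, `getElementById`, and `closest` can all return `null`, so the sketch guards each result before reading properties.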

145

146

### Attributes & Properties

147

148

Comprehensive attribute manipulation and property access with support for both raw and decoded attribute values.

149

150

```typescript { .api }
// Attribute methods
getAttribute(key: string): string | undefined;
setAttribute(key: string, value: string): HTMLElement;
removeAttribute(key: string): HTMLElement;
hasAttribute(key: string): boolean;

// Property access
get attributes(): Record<string, string>;
get rawAttributes(): RawAttributes;
get classList(): DOMTokenList;
```

162

163

[Attributes & Properties](./attributes-properties.md)
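
The difference between raw and decoded values is easiest to see with an entity-encoded attribute. The following sketch uses an invented link and assumes that `attributes` and `getAttribute` return entity-decoded strings while `rawAttributes` keeps the source text.

```typescript
import { parse } from "node-html-parser";

const linkRoot = parse('<a href="/search?q=1&amp;page=2" target="_blank">Next</a>');
const anchor = linkRoot.querySelector("a");
if (!anchor) throw new Error("expected an <a> element in the sample markup");

// Decoded view: entities such as &amp; are resolved.
const decodedHref = anchor.getAttribute("href");

// Raw view: the value as written in the source markup.
const rawHref = anchor.rawAttributes.href;

// Mutations: remove one attribute, add another.
anchor.removeAttribute("target");
anchor.setAttribute("rel", "noopener");

console.log(decodedHref, rawHref, anchor.hasAttribute("target"));
```

A missing attribute yields `undefined` rather than `null`, which differs from the browser DOM and is worth checking for explicitly.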

164

165

## Types

166

167

```typescript { .api }
interface Options {
  lowerCaseTagName?: boolean;
  comment?: boolean;
  fixNestedATags?: boolean;
  parseNoneClosedTags?: boolean;
  blockTextElements?: { [tag: string]: boolean };
  voidTag?: {
    tags?: string[];
    closingSlash?: boolean;
  };
}

interface Attributes {
  [key: string]: string;
}

type InsertPosition = 'beforebegin' | 'afterbegin' | 'beforeend' | 'afterend';
type NodeInsertable = Node | string;
```
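
As one possible use of the `InsertPosition` values, the sketch below calls `insertAdjacentHTML`, a method not shown in this overview and assumed here to accept those positions as its first argument. The list markup is illustrative only.

```typescript
import { parse } from "node-html-parser";

const listRoot = parse("<ul><li>b</li></ul>");
const ul = listRoot.querySelector("ul");
if (!ul) throw new Error("expected a <ul> in the sample markup");

// 'afterbegin' inserts before the first child, 'beforeend' after the last.
ul.insertAdjacentHTML("afterbegin", "<li>a</li>");
ul.insertAdjacentHTML("beforeend", "<li>c</li>");

const labels = ul.querySelectorAll("li").map((item) => item.text);
console.log(labels);
```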