tessl/npm-html-entities

High-performance HTML entities encoding and decoding library for JavaScript/TypeScript applications.

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: npmpkg:npm/html-entities@2.6.x

To install, run

npx @tessl/cli install tessl/npm-html-entities@2.6.0

# HTML Entities

HTML Entities is a high-performance library for encoding and decoding HTML entities in JavaScript and TypeScript applications. It provides comprehensive entity handling with support for HTML5, HTML4, and XML standards, featuring multiple encoding modes and decoding scopes to match browser behavior.

## Package Information

- **Package Name**: html-entities
- **Package Type**: npm
- **Language**: TypeScript
- **Installation**: `npm install html-entities`

## Core Imports

```typescript
import { encode, decode, decodeEntity } from "html-entities";
import type { EncodeOptions, DecodeOptions, Level, EncodeMode, DecodeScope } from "html-entities";
```

For CommonJS:

```javascript
const { encode, decode, decodeEntity } = require("html-entities");
```

## Basic Usage

```typescript
import { encode, decode, decodeEntity } from "html-entities";

// Basic encoding - HTML special characters only
const encoded = encode('< > " \' & © ∆');
// Result: '&lt; &gt; &quot; &apos; &amp; © ∆'

// Basic decoding - all entities
const decoded = decode('&lt; &gt; &quot; &apos; &amp; &#169; &#8710;');
// Result: '< > " \' & © ∆'

// Single entity decoding
const singleDecoded = decodeEntity('&lt;');
// Result: '<'

// Advanced encoding with options
const advancedEncoded = encode('< ©', { mode: 'nonAsciiPrintable' });
// Result: '&lt; &copy;'

// XML-specific encoding
const xmlEncoded = encode('< ©', { mode: 'nonAsciiPrintable', level: 'xml' });
// Result: '&lt; &#169;'
```

## Architecture

HTML Entities is built around three core functions with extensive configuration options, supported by a complete type system and performance-oriented internals:

- **encode()**: Converts characters to HTML entities with configurable encoding modes and entity levels
- **decode()**: Converts HTML entities back to characters with configurable scopes and levels
- **decodeEntity()**: Handles individual entity decoding with level configuration
- **Type System**: Complete TypeScript definitions with strict typing for all options and return values
- **Performance Optimization**: Pre-compiled regular expressions and entity mappings, so no patterns or tables are rebuilt per call
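
The "pre-compiled regular expressions and entity mappings" idea can be illustrated with a minimal sketch. This is not the library's source: `SPECIAL_CHARS`, `SPECIAL_CHARS_PATTERN`, and `encodeSpecialChars` are hypothetical names, and the table covers only the five HTML special characters. It only aims to show why building the pattern and the table once, at module load, keeps each call down to a single `replace` pass.

```typescript
// Hypothetical sketch of the "pre-compiled pattern + lookup table" idea.
// These names are illustrative and are not part of html-entities.
const SPECIAL_CHARS: { [char: string]: string } = {
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&apos;',
  '&': '&amp;',
};

// Compiled once at module load rather than on every call.
const SPECIAL_CHARS_PATTERN = /[<>"'&]/g;

function encodeSpecialChars(text: string | null | undefined): string {
  // Mirrors the documented behavior: null/undefined yield an empty string.
  if (text == null) return '';
  // The pattern only matches keys of SPECIAL_CHARS, so the lookup always hits.
  return text.replace(SPECIAL_CHARS_PATTERN, (char) => SPECIAL_CHARS[char]);
}

console.log(encodeSpecialChars('<a href="x">Tom & Jerry</a>'));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```

In application code the library's own `encode()` should be preferred; the sketch is only meant to make the design choice visible.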

## Capabilities

### HTML Entity Encoding

Encodes text by replacing characters with their corresponding HTML entities. Supports multiple encoding modes from basic HTML special characters to comprehensive character encoding.

```typescript { .api }
/**
 * Encodes all the necessary (specified by level) characters in the text
 * @param text - Text to encode (supports null/undefined, returns empty string)
 * @param options - Encoding configuration options
 * @returns Encoded text with HTML entities
 */
function encode(
  text: string | undefined | null,
  options?: EncodeOptions
): string;

interface EncodeOptions {
  /** Encoding mode - determines which characters to encode */
  mode?: EncodeMode;
  /** Numeric format for character codes */
  numeric?: 'decimal' | 'hexadecimal';
  /** Entity level/standard to use */
  level?: Level;
}

type EncodeMode =
  | 'specialChars'          // Only HTML special characters (<>&"')
  | 'nonAscii'              // Special chars + all non-ASCII characters
  | 'nonAsciiPrintable'     // Special chars + non-ASCII + non-printable ASCII
  | 'nonAsciiPrintableOnly' // Non-ASCII and non-printable ASCII only (keeps HTML special chars intact)
  | 'extensive';            // All non-printable, non-ASCII, and characters with named references

type Level = 'xml' | 'html4' | 'html5' | 'all';
```

**Usage Examples:**

```typescript
import { encode } from "html-entities";

// Basic HTML escaping
const basic = encode('<script>alert("XSS")</script>');
// Result: '&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;'

// Non-ASCII character encoding
const nonAscii = encode('Hello 世界 © 2023', { mode: 'nonAscii' });
// Result: 'Hello &#19990;&#30028; &copy; 2023'

// XML-only entities
const xmlOnly = encode('< > " \' &', { level: 'xml' });
// Result: '&lt; &gt; &quot; &apos; &amp;'

// Hexadecimal numeric entities
// (level 'xml' has no named reference for ©, so it is written numerically;
// at the default level it would be encoded as '&copy;')
const hexEntities = encode('© ∆', { mode: 'nonAscii', numeric: 'hexadecimal', level: 'xml' });
// Result: '&#xa9; &#x2206;'
```
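To make the `numeric` option more concrete, the sketch below shows how a single character could be turned into a decimal or hexadecimal numeric reference. `codePointOf` and `toNumericEntity` are hypothetical helpers written for this illustration, not exports of html-entities, and the lowercase hexadecimal digits simply follow the example output above.

```typescript
type NumericFormat = 'decimal' | 'hexadecimal';

// Hypothetical helper: code point of the first character of `text`,
// combining a UTF-16 surrogate pair when one is present.
function codePointOf(text: string): number {
  const high = text.charCodeAt(0);
  if (high >= 0xd800 && high <= 0xdbff && text.length > 1) {
    const low = text.charCodeAt(1);
    if (low >= 0xdc00 && low <= 0xdfff) {
      return (high - 0xd800) * 0x400 + (low - 0xdc00) + 0x10000;
    }
  }
  return high;
}

// Hypothetical helper: format one character as a numeric character reference.
function toNumericEntity(char: string, numeric: NumericFormat = 'decimal'): string {
  const code = codePointOf(char);
  return numeric === 'hexadecimal'
    ? '&#x' + code.toString(16) + ';'
    : '&#' + code + ';';
}

console.log(toNumericEntity('©'));                // &#169;
console.log(toNumericEntity('©', 'hexadecimal')); // &#xa9;
console.log(toNumericEntity('∆', 'hexadecimal')); // &#x2206;
```

The helper expects a non-empty string and only looks at its first character, which is enough to show the two output formats side by side.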

### HTML Entity Decoding

Decodes HTML entities back to their original characters. Supports different decoding scopes to match browser parsing behavior in different contexts.

```typescript { .api }
/**
 * Decodes all entities in the text
 * @param text - Text containing HTML entities to decode
 * @param options - Decoding configuration options
 * @returns Decoded text with entities converted to characters
 */
function decode(
  text: string | undefined | null,
  options?: DecodeOptions
): string;

interface DecodeOptions {
  /** Entity level/standard to recognize */
  level?: Level;
  /** Decoding scope - affects handling of entities without semicolons */
  scope?: DecodeScope;
}

type DecodeScope =
  | 'body'      // Browser behavior in tag bodies (entities without semicolon replaced)
  | 'attribute' // Browser behavior in attributes (entities without semicolon replaced when not followed by =)
  | 'strict';   // Only entities with semicolons
```

**Usage Examples:**

```typescript
import { decode } from "html-entities";

// Basic decoding
const basic = decode('&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;');
// Result: '<script>alert("XSS")</script>'

// Mixed named and numeric entities
const mixed = decode('&copy; &#169; &#xa9; &#8710;');
// Result: '© © © ∆'

// Strict decoding - only entities with semicolons
const strict = decode('&lt &gt;', { scope: 'strict' });
// Result: '&lt >' ('&lt' has no semicolon and stays as-is; '&gt;' is decoded)

// Body scope - entities without semicolons decoded
const body = decode('&lt &gt');
// Result: '< >' (decoded despite missing semicolons)

// XML level - only XML entities recognized
const xmlLevel = decode('&copy; &lt; &amp;', { level: 'xml' });
// Result: '&copy; < &' (copyright not decoded in XML)
```
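The difference between the three scopes can be sketched with a toy decoder. This is an illustration of the rules described above, not the library's implementation: `ENTITY_TABLE` and `decodeWithScope` are hypothetical, the table holds only three entities, and numeric references are ignored.

```typescript
type Scope = 'strict' | 'body' | 'attribute';

// Tiny illustrative table; the real library covers the full entity sets.
const ENTITY_TABLE: { [name: string]: string } = { lt: '<', gt: '>', amp: '&' };

// Hypothetical decoder showing only how the scope changes what gets replaced.
function decodeWithScope(text: string, scope: Scope = 'body'): string {
  // Capture the entity name plus an optional trailing ";" or "=".
  return text.replace(/&([a-z]+)([;=]?)/g, (match: string, name: string, tail: string) => {
    // Unknown entity: left unchanged.
    if (!Object.prototype.hasOwnProperty.call(ENTITY_TABLE, name)) return match;
    const char = ENTITY_TABLE[name];
    if (tail === ';') return char;        // terminated entity: decoded in every scope
    if (scope === 'strict') return match; // strict requires the semicolon
    if (scope === 'attribute' && tail === '=') return match; // looks like a query parameter
    return char + tail;                   // body, or attribute not followed by "="
  });
}

console.log(decodeWithScope('&lt &gt;', 'strict'));      // &lt >
console.log(decodeWithScope('&lt &gt', 'body'));         // < >
console.log(decodeWithScope('?a=1&lt=2', 'attribute'));  // ?a=1&lt=2
```

The attribute case is the reason the scope exists: a URL such as `?a=1&lt=2` inside an `href` should keep its `&lt=` parameter rather than turn into `<=`.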

### Single Entity Decoding

Decodes individual HTML entities, useful for processing single entities or building custom decoders.

```typescript { .api }
/**
 * Decodes a single entity
 * @param entity - Single HTML entity to decode (e.g., '&lt;', '&#169;')
 * @param options - Decoding configuration options
 * @returns Decoded character or original entity if unknown
 */
function decodeEntity(
  entity: string | undefined | null,
  options?: CommonOptions
): string;

interface CommonOptions {
  /** Entity level/standard to use for recognition */
  level?: Level;
}
```

**Usage Examples:**

```typescript
import { decodeEntity } from "html-entities";

// Named entity decoding
const named = decodeEntity('&lt;');
// Result: '<'

// Numeric entity decoding
const numeric = decodeEntity('&#169;');
// Result: '©'

// Hexadecimal entity decoding
const hex = decodeEntity('&#xa9;');
// Result: '©'

// Unknown entity (left unchanged)
const unknown = decodeEntity('&unknownentity;');
// Result: '&unknownentity;'

// Level-specific decoding
const xmlOnly = decodeEntity('&copy;', { level: 'xml' });
// Result: '&copy;' (unchanged - not an XML entity)

const htmlDecoded = decodeEntity('&copy;', { level: 'html5' });
// Result: '©'
```

## Types

```typescript { .api }
// Main configuration types
type Level = 'xml' | 'html4' | 'html5' | 'all';

type EncodeMode =
  | 'specialChars'          // HTML special characters only: <>&"'
  | 'nonAscii'              // Special chars + non-ASCII characters
  | 'nonAsciiPrintable'     // Special chars + non-ASCII + non-printable ASCII
  | 'nonAsciiPrintableOnly' // Non-ASCII and non-printable ASCII only (preserves HTML special chars)
  | 'extensive';            // Non-printable, non-ASCII, and all characters with named references

type DecodeScope =
  | 'strict'     // Only entities ending with semicolon
  | 'body'       // Browser body parsing (loose semicolon handling)
  | 'attribute'; // Browser attribute parsing (semicolon handling with = check)

// Options interfaces
interface EncodeOptions {
  mode?: EncodeMode;                   // Default: 'specialChars'
  numeric?: 'decimal' | 'hexadecimal'; // Default: 'decimal'
  level?: Level;                       // Default: 'all'
}

interface DecodeOptions {
  level?: Level;       // Default: 'all'
  scope?: DecodeScope; // Default: 'body' (or 'strict' for XML level)
}

interface CommonOptions {
  level?: Level; // Default: 'all'
}
```

## Error Handling

- **Null/undefined inputs**: Return an empty string
- **Unknown entities**: Left unchanged during decoding
- **Invalid numeric entities**: Out-of-bounds values decode to the Unicode replacement character (�)
- **Numeric overflow**: Values >= 0x10FFFF count as out of bounds and return the replacement character
- **Surrogate pairs**: Code points above 65535 (U+FFFF) are emitted as UTF-16 surrogate pairs
- **Malformed entities**: Invalid syntax is left unchanged
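
The rules for numeric entities can be summarized in a short sketch. It is a simplified illustration under stated assumptions, not the library's code: `decodeNumericEntity` and `REPLACEMENT_CHAR` are hypothetical names, only numeric references are handled, and any further remapping the library may perform is left out.

```typescript
const REPLACEMENT_CHAR = '\uFFFD';

// Hypothetical helper that follows the rules listed above; not the library's code.
function decodeNumericEntity(entity: string | null | undefined): string {
  if (entity == null) return ''; // null/undefined: empty string
  const match = /^&#(?:[xX]([\da-fA-F]+)|(\d+));$/.exec(entity);
  if (match === null) return entity; // malformed syntax: left unchanged
  const code = match[1] !== undefined ? parseInt(match[1], 16) : parseInt(match[2], 10);
  if (code >= 0x10ffff) return REPLACEMENT_CHAR; // out of bounds
  if (code > 0xffff) {
    // Above the Basic Multilingual Plane: emit a UTF-16 surrogate pair.
    const offset = code - 0x10000;
    return String.fromCharCode(0xd800 + (offset >> 10), 0xdc00 + (offset & 0x3ff));
  }
  return String.fromCharCode(code);
}

console.log(decodeNumericEntity('&#169;'));      // ©
console.log(decodeNumericEntity('&#x110000;'));  // � (replacement character)
console.log(decodeNumericEntity('&#xZZ;'));      // &#xZZ; (unchanged)
```

The bounds check runs before the surrogate-pair branch, so every value that reaches `String.fromCharCode` fits in one or two UTF-16 code units.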