High-performance HTML entities encoding and decoding library for JavaScript/TypeScript applications.
npx @tessl/cli install tessl/npm-html-entities@2.6.00
# HTML Entities

HTML Entities is a high-performance library for encoding and decoding HTML entities in JavaScript and TypeScript applications. It provides comprehensive entity handling with support for HTML5, HTML4, and XML standards, featuring multiple encoding modes and decoding scopes to match browser behavior.

## Package Information

- **Package Name**: html-entities
- **Package Type**: npm
- **Language**: TypeScript
- **Installation**: `npm install html-entities`

## Core Imports

```typescript
import { encode, decode, decodeEntity } from "html-entities";
import type { EncodeOptions, DecodeOptions, Level, EncodeMode, DecodeScope } from "html-entities";
```

For CommonJS:

```javascript
const { encode, decode, decodeEntity } = require("html-entities");
```

## Basic Usage

```typescript
import { encode, decode, decodeEntity } from "html-entities";

// Basic encoding - HTML special characters only
const encoded = encode('< > " \' & © ∆');
// Result: '&lt; &gt; &quot; &apos; &amp; © ∆'

// Basic decoding - all entities
const decoded = decode('&lt; &gt; &quot; &apos; &amp; &#169; &#8710;');
// Result: '< > " \' & © ∆'

// Single entity decoding
const singleDecoded = decodeEntity('&lt;');
// Result: '<'

// Advanced encoding with options
const advancedEncoded = encode('< ©', { mode: 'nonAsciiPrintable' });
// Result: '&lt; &copy;'

// XML-specific encoding
const xmlEncoded = encode('< ©', { mode: 'nonAsciiPrintable', level: 'xml' });
// Result: '&lt; &#169;'
```

## Architecture

HTML Entities is built around three core functions, backed by a typed options model and performance-oriented internals:

- **encode()**: Converts characters to HTML entities with configurable encoding modes and entity levels
- **decode()**: Converts HTML entities back to characters with configurable scopes and levels
- **decodeEntity()**: Handles individual entity decoding with level configuration
- **Type System**: Complete TypeScript definitions with strict typing for all options and return values
- **Performance Optimization**: Pre-compiled regular expressions and entity mappings for maximum speed

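As a rough illustration of the "pre-compiled regular expression plus entity mapping" idea, the sketch below encodes only the five HTML special characters. It is not the library's source code: `SPECIAL_CHARS`, `NAMED_REFERENCES`, and `encodeSpecialCharsSketch` are hypothetical names chosen for this example.

```typescript
// Hypothetical sketch, not the html-entities implementation.
// Both the pattern and the lookup table are built once at module load
// and reused on every call.
const SPECIAL_CHARS = /[<>'"&]/g;

const NAMED_REFERENCES: Record<string, string> = {
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&apos;",
  "&": "&amp;",
};

function encodeSpecialCharsSketch(text: string | null | undefined): string {
  // Mirrors the documented behavior for null/undefined input: empty string.
  if (!text) return "";
  // Each matched character is swapped for its named reference.
  return text.replace(SPECIAL_CHARS, (ch) => NAMED_REFERENCES[ch] ?? ch);
}

console.log(encodeSpecialCharsSketch('<a href="x">'));
// prints: &lt;a href=&quot;x&quot;&gt;
```

Keeping the pattern and table at module scope avoids rebuilding them per call, which is the general point of the performance bullet above.
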
## Capabilities

### HTML Entity Encoding

Encodes text by replacing characters with their corresponding HTML entities. Supports multiple encoding modes from basic HTML special characters to comprehensive character encoding.

```typescript { .api }
/**
 * Encodes all the necessary (specified by level) characters in the text
 * @param text - Text to encode (supports null/undefined, returns empty string)
 * @param options - Encoding configuration options
 * @returns Encoded text with HTML entities
 */
function encode(
  text: string | undefined | null,
  options?: EncodeOptions
): string;

interface EncodeOptions {
  /** Encoding mode - determines which characters to encode */
  mode?: EncodeMode;
  /** Numeric format for character codes */
  numeric?: 'decimal' | 'hexadecimal';
  /** Entity level/standard to use */
  level?: Level;
}

type EncodeMode =
  | 'specialChars'           // Only HTML special characters (<>&"')
  | 'nonAscii'               // Special chars + all non-ASCII characters
  | 'nonAsciiPrintable'      // Special chars + non-ASCII + non-printable ASCII
  | 'nonAsciiPrintableOnly'  // Non-ASCII + non-printable ASCII only (HTML special chars left intact)
  | 'extensive';             // All non-printable, non-ASCII, and characters with named references

type Level = 'xml' | 'html4' | 'html5' | 'all';
```

**Usage Examples:**

```typescript
import { encode } from "html-entities";

// Basic HTML escaping
const basic = encode('<script>alert("XSS")</script>');
// Result: '&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;'

// Non-ASCII character encoding
const nonAscii = encode('Hello 世界 © 2023', { mode: 'nonAscii' });
// Result: 'Hello &#19990;&#30028; &copy; 2023'

// XML-only entities
const xmlOnly = encode('< > " \' &', { level: 'xml' });
// Result: '&lt; &gt; &quot; &apos; &amp;'

// Hexadecimal numeric entities
const hexEntities = encode('© ∆', { mode: 'nonAscii', numeric: 'hexadecimal' });
// Result: '&copy; &#x2206;' (© has a named reference at the default level; ∆ falls back to hex)
```

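The `numeric` option affects how a character's code point is written when a numeric reference is produced. The sketch below shows the two formats side by side; `toNumericReference` is a hypothetical helper written for this illustration and is not part of the html-entities API.

```typescript
// Hypothetical helper, for illustration only.
type NumericFormat = "decimal" | "hexadecimal";

function toNumericReference(char: string, numeric: NumericFormat = "decimal"): string {
  // codePointAt reads a full code point, so characters stored as a
  // surrogate pair (above U+FFFF) yield a single number.
  const code = char.codePointAt(0);
  if (code === undefined) return "";
  return numeric === "hexadecimal"
    ? `&#x${code.toString(16)};`
    : `&#${code};`;
}

console.log(toNumericReference("©"));                // prints: &#169;
console.log(toNumericReference("©", "hexadecimal")); // prints: &#xa9;
console.log(toNumericReference("∆", "hexadecimal")); // prints: &#x2206;
```

Both forms refer to the same code point, so the choice is mainly about readability or the conventions of the consuming system.
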
### HTML Entity Decoding

Decodes HTML entities back to their original characters. Supports different decoding scopes to match browser parsing behavior in different contexts.

```typescript { .api }
/**
 * Decodes all entities in the text
 * @param text - Text containing HTML entities to decode
 * @param options - Decoding configuration options
 * @returns Decoded text with entities converted to characters
 */
function decode(
  text: string | undefined | null,
  options?: DecodeOptions
): string;

interface DecodeOptions {
  /** Entity level/standard to recognize */
  level?: Level;
  /** Decoding scope - affects handling of entities without semicolons */
  scope?: DecodeScope;
}

type DecodeScope =
  | 'body'       // Browser behavior in tag bodies (entities without semicolon replaced)
  | 'attribute'  // Browser behavior in attributes (entities without semicolon when not followed by =)
  | 'strict';    // Only entities with semicolons
```

**Usage Examples:**

```typescript
import { decode } from "html-entities";

// Basic decoding
const basic = decode('&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;');
// Result: '<script>alert("XSS")</script>'

// Mixed named and numeric entities
const mixed = decode('&copy; &#169; &#xa9; &#8710;');
// Result: '© © © ∆'

// Strict decoding - only entities with semicolons
const strict = decode('&lt &gt', { scope: 'strict' });
// Result: '&lt &gt' (unchanged - no semicolons)

// Body scope - entities without semicolons decoded
const body = decode('&lt &gt');
// Result: '< >' (decoded despite missing semicolons)

// XML level - only XML entities recognized
const xmlLevel = decode('&copy; &lt; &amp;', { level: 'xml' });
// Result: '&copy; < &' (copyright not decoded in XML)
```

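To make the difference between the three scopes concrete, here is a small self-contained sketch of the rules as described in the `DecodeScope` comments. It recognizes only three entities and is an assumption-laden illustration, not the library's algorithm; `decodeScopeSketch` and `TABLE` are hypothetical names.

```typescript
// Hypothetical sketch of the scope rules; handles only &lt, &gt, &amp.
type Scope = "body" | "attribute" | "strict";

const TABLE: Record<string, string> = { lt: "<", gt: ">", amp: "&" };

function decodeScopeSketch(text: string, scope: Scope = "body"): string {
  // Capture the entity name and an optional trailing ';' or '='.
  return text.replace(
    /&(lt|gt|amp)(;|=)?/g,
    (match: string, name: string, tail: string | undefined) => {
      const decoded = TABLE[name] ?? match;
      // A terminating semicolon is accepted in every scope.
      if (tail === ";") return decoded;
      // strict: entities without a semicolon are left alone.
      if (scope === "strict") return match;
      // attribute: a semicolon-less entity followed by '=' is left alone.
      if (scope === "attribute" && tail === "=") return match;
      // body (or attribute without '='): decode and keep any trailing '='.
      return decoded + (tail ?? "");
    }
  );
}

console.log(decodeScopeSketch("&lt &gt"));               // prints: < >
console.log(decodeScopeSketch("&lt &gt", "strict"));     // prints: &lt &gt
console.log(decodeScopeSketch("a&amp=b", "attribute"));  // prints: a&amp=b
console.log(decodeScopeSketch("a&amp=b", "body"));       // prints: a&=b
```

The attribute rule exists because query strings such as `?a=1&amp=2` commonly appear inside attribute values, where treating `&amp=` as an entity would corrupt the URL.
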
### Single Entity Decoding

Decodes individual HTML entities, useful for processing single entities or building custom decoders.

```typescript { .api }
/**
 * Decodes a single entity
 * @param entity - Single HTML entity to decode (e.g., '&lt;', '&copy;')
 * @param options - Decoding configuration options
 * @returns Decoded character or original entity if unknown
 */
function decodeEntity(
  entity: string | undefined | null,
  options?: CommonOptions
): string;

interface CommonOptions {
  /** Entity level/standard to use for recognition */
  level?: Level;
}
```

**Usage Examples:**

```typescript
import { decodeEntity } from "html-entities";

// Named entity decoding
const named = decodeEntity('&lt;');
// Result: '<'

// Numeric entity decoding
const numeric = decodeEntity('&#169;');
// Result: '©'

// Hexadecimal entity decoding
const hex = decodeEntity('&#xa9;');
// Result: '©'

// Unknown entity (left unchanged)
const unknown = decodeEntity('&unknownentity;');
// Result: '&unknownentity;'

// Level-specific decoding
const xmlOnly = decodeEntity('&copy;', { level: 'xml' });
// Result: '&copy;' (unchanged - not an XML entity)

const htmlDecoded = decodeEntity('&copy;', { level: 'html5' });
// Result: '©'
```

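One way to picture the `level` option is as a choice of lookup table: the same entity string resolves differently depending on which table is consulted. The sketch below uses two tiny hand-written tables to show this. The table contents and the names `decodeEntitySketch`, `XML_PAIRS`, and `HTML5_EXTRA_PAIRS` are assumptions for illustration and do not reflect the library's internal data.

```typescript
// Hypothetical sketch: level selects which reference table is used.
type SketchLevel = "xml" | "html5";

const XML_PAIRS: [string, string][] = [
  ["&lt;", "<"],
  ["&gt;", ">"],
  ["&quot;", '"'],
  ["&apos;", "'"],
  ["&amp;", "&"],
];

// A single extra entry stands in for the much larger HTML5 set.
const HTML5_EXTRA_PAIRS: [string, string][] = [["&copy;", "©"]];

const TABLES: Record<SketchLevel, Map<string, string>> = {
  xml: new Map<string, string>(XML_PAIRS),
  html5: new Map<string, string>([...XML_PAIRS, ...HTML5_EXTRA_PAIRS]),
};

function decodeEntitySketch(entity: string, level: SketchLevel = "html5"): string {
  // Unknown entities fall through unchanged, as in the examples above.
  return TABLES[level].get(entity) ?? entity;
}

console.log(decodeEntitySketch("&copy;", "html5")); // prints: ©
console.log(decodeEntitySketch("&copy;", "xml"));   // prints: &copy;
console.log(decodeEntitySketch("&lt;", "xml"));     // prints: <
```
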
## Types

```typescript { .api }
// Main configuration types
type Level = 'xml' | 'html4' | 'html5' | 'all';

type EncodeMode =
  | 'specialChars'           // HTML special characters only: <>&"'
  | 'nonAscii'               // Special chars + non-ASCII characters
  | 'nonAsciiPrintable'      // Special chars + non-ASCII + non-printable ASCII
  | 'nonAsciiPrintableOnly'  // Non-ASCII + non-printable ASCII only (preserves HTML special chars)
  | 'extensive';             // All non-printable, non-ASCII, and characters with named references

type DecodeScope =
  | 'strict'     // Only entities ending with semicolon
  | 'body'       // Browser body parsing (loose semicolon handling)
  | 'attribute'; // Browser attribute parsing (semicolon handling with = check)

// Options interfaces
interface EncodeOptions {
  mode?: EncodeMode;                    // Default: 'specialChars'
  numeric?: 'decimal' | 'hexadecimal';  // Default: 'decimal'
  level?: Level;                        // Default: 'all'
}

interface DecodeOptions {
  level?: Level;        // Default: 'all'
  scope?: DecodeScope;  // Default: 'body' (or 'strict' for XML level)
}

interface CommonOptions {
  level?: Level;  // Default: 'all'
}
```

## Error Handling

- **Null/undefined inputs**: Return an empty string
- **Unknown entities**: Left unchanged during decoding
- **Invalid numeric entities**: Return the Unicode replacement character (�) for out-of-bounds values
- **Numeric overflow**: Values >= 0x10FFFF return the replacement character
- **Surrogate pairs**: Code points above 65535 are decoded to the correct surrogate pair
- **Malformed entities**: Invalid syntax is left unchanged
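
The numeric rules in this list can be sketched as a small standalone function. This is an illustration built only from the bullets above (including the `>= 0x10FFFF` threshold as stated there), not a copy of the library's code; `decodeNumericSketch` and `REPLACEMENT_CHAR` are hypothetical names.

```typescript
// Hypothetical sketch of the numeric-entity rules listed above.
const REPLACEMENT_CHAR = "\uFFFD";

function decodeNumericSketch(entity: string): string {
  // Accept "&#123;" (decimal) or "&#x7b;" / "&#X7B;" (hexadecimal).
  const match = /^&#(?:[xX]([0-9a-fA-F]+)|(\d+));$/.exec(entity);
  // Malformed syntax: leave the input unchanged.
  if (!match) return entity;

  const hex = match[1];
  const code = hex !== undefined ? parseInt(hex, 16) : parseInt(match[2] ?? "", 10);

  // Out-of-bounds values map to the replacement character.
  if (code >= 0x10ffff) return REPLACEMENT_CHAR;
  // Code points above 0xFFFF need a surrogate pair, which
  // String.fromCodePoint produces automatically.
  return code > 0xffff ? String.fromCodePoint(code) : String.fromCharCode(code);
}

console.log(decodeNumericSketch("&#169;"));     // prints: ©
console.log(decodeNumericSketch("&#128512;"));  // prints: 😀
console.log(decodeNumericSketch("&#x110000;") === REPLACEMENT_CHAR); // prints: true
console.log(decodeNumericSketch("&#;"));        // prints: &#;
```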