tessl/npm-html-entities

High-performance HTML entities encoding and decoding library for JavaScript/TypeScript applications.

  • Workspace: tessl
  • Visibility: Public
  • Describes: npmpkg:npm/html-entities@2.6.x

To install, run

npx @tessl/cli install tessl/npm-html-entities@2.6.0

HTML Entities

HTML Entities is a high-performance library for encoding and decoding HTML entities in JavaScript and TypeScript applications. It provides comprehensive entity handling with support for HTML5, HTML4, and XML standards, featuring multiple encoding modes and decoding scopes to match browser behavior.

Package Information

  • Package Name: html-entities
  • Package Type: npm
  • Language: TypeScript
  • Installation:
    npm install html-entities

Core Imports

import { encode, decode, decodeEntity } from "html-entities";
import type { EncodeOptions, DecodeOptions, Level, EncodeMode, DecodeScope } from "html-entities";

For CommonJS:

const { encode, decode, decodeEntity } = require("html-entities");

Basic Usage

import { encode, decode, decodeEntity } from "html-entities";

// Basic encoding - HTML special characters only
const encoded = encode('< > " \' & © ∆');
// Result: '&lt; &gt; &quot; &apos; &amp; © ∆'

// Basic decoding - all entities
const decoded = decode('&lt; &gt; &quot; &apos; &amp; &#169; &#8710;');
// Result: '< > " \' & © ∆'

// Single entity decoding
const singleDecoded = decodeEntity('&lt;');
// Result: '<'

// Advanced encoding with options
const advancedEncoded = encode('< ©', { mode: 'nonAsciiPrintable' });
// Result: '&lt; &copy;'

// XML-specific encoding
const xmlEncoded = encode('< ©', { mode: 'nonAsciiPrintable', level: 'xml' });
// Result: '&lt; &#169;'

Architecture

HTML Entities is built around three core functions, supported by a typed options API and precomputed lookup data:

  • encode(): Converts characters to HTML entities with configurable encoding modes and entity levels
  • decode(): Converts HTML entities back to characters with configurable scopes and levels
  • decodeEntity(): Handles individual entity decoding with level configuration
  • Type System: Complete TypeScript definitions with strict typing for all options and return values
  • Performance Optimization: Pre-compiled regular expressions and entity mappings for maximum speed
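
The "pre-compiled regular expressions and entity mappings" idea can be illustrated with a minimal sketch. This is not the library's source: the names `SPECIAL_CHAR_ENTITIES`, `SPECIAL_CHARS_PATTERN`, and `sketchEncodeSpecialChars` are hypothetical, and the table covers only the five HTML special characters.

```typescript
// Illustrative only: a tiny lookup table, not the library's full entity map.
const SPECIAL_CHAR_ENTITIES: Record<string, string> = {
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&apos;",
  "&": "&amp;",
};

// Compiled once at module load and reused by every call.
const SPECIAL_CHARS_PATTERN = /[<>"'&]/g;

// Hypothetical helper approximating the default 'specialChars' behavior.
function sketchEncodeSpecialChars(text: string | null | undefined): string {
  if (!text) {
    return ""; // mirrors the documented null/undefined handling
  }
  return text.replace(
    SPECIAL_CHARS_PATTERN,
    (ch) => SPECIAL_CHAR_ENTITIES[ch] || ch
  );
}

const sketchResult = sketchEncodeSpecialChars("< > \" ' & © ∆");
// sketchResult: '&lt; &gt; &quot; &apos; &amp; © ∆'
```

Building the pattern and the table once, outside the function, avoids repeating that work on every call, which is presumably the point of the optimization described above.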

Capabilities

HTML Entity Encoding

Encodes text by replacing characters with their corresponding HTML entities. Supports multiple encoding modes from basic HTML special characters to comprehensive character encoding.

/**
 * Encodes all the necessary (specified by level) characters in the text
 * @param text - Text to encode (supports null/undefined, returns empty string)
 * @param options - Encoding configuration options
 * @returns Encoded text with HTML entities
 */
function encode(
  text: string | undefined | null,
  options?: EncodeOptions
): string;

interface EncodeOptions {
  /** Encoding mode - determines which characters to encode */
  mode?: EncodeMode;
  /** Numeric format for character codes */
  numeric?: 'decimal' | 'hexadecimal';  
  /** Entity level/standard to use */
  level?: Level;
}

type EncodeMode = 
  | 'specialChars'        // Only HTML special characters (<>&"')
  | 'nonAscii'           // Special chars + all non-ASCII characters  
  | 'nonAsciiPrintable'  // Special chars + non-ASCII + non-printable ASCII
  | 'nonAsciiPrintableOnly' // Everything outside printable ASCII (HTML special chars left intact)
  | 'extensive';         // All non-printable, non-ASCII, and characters with named references

type Level = 'xml' | 'html4' | 'html5' | 'all';

Usage Examples:

import { encode } from "html-entities";

// Basic HTML escaping
const basic = encode('<script>alert("XSS")</script>');
// Result: '&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;'

// Non-ASCII character encoding
const nonAscii = encode('Hello 世界 © 2023', { mode: 'nonAscii' });
// Result: 'Hello &#19990;&#30028; &copy; 2023'

// XML-only entities
const xmlOnly = encode('< > " \' &', { level: 'xml' });
// Result: '&lt; &gt; &quot; &apos; &amp;'

// Hexadecimal numeric entities (level 'xml', so © has no named reference)
const hexEntities = encode('© ∆', { mode: 'nonAscii', numeric: 'hexadecimal', level: 'xml' });
// Result: '&#xa9; &#x2206;'
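
To make the difference between encoding modes concrete, here is a self-contained sketch that selects characters per mode and emits XML-style output (named references for the five special characters, numeric references otherwise). It is an approximation under stated assumptions, not the library's implementation: `SketchMode`, `isSelected`, and `sketchEncode` are hypothetical names, every code point outside 0x20-0x7E is treated as non-printable (the library's exact ranges may differ, for example for whitespace controls), and the 'extensive' mode is omitted because it depends on the full named-reference table.

```typescript
type SketchMode =
  | "specialChars"
  | "nonAscii"
  | "nonAsciiPrintable"
  | "nonAsciiPrintableOnly";

// The five XML named references; everything else falls back to numeric form.
const XML_NAMES: Record<string, string> = {
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&apos;",
  "&": "&amp;",
};
const SPECIAL_CHARS = "<>\"'&";

// Decide whether a code point is selected for encoding under a given mode.
function isSelected(code: number, mode: SketchMode): boolean {
  const isAscii = code < 0x80;
  const isSpecial =
    isAscii && SPECIAL_CHARS.indexOf(String.fromCharCode(code)) !== -1;
  const isPrintableAscii = code >= 0x20 && code <= 0x7e;
  if (mode === "specialChars") return isSpecial;
  if (mode === "nonAscii") return isSpecial || !isAscii;
  if (mode === "nonAsciiPrintable") return isSpecial || !isPrintableAscii;
  return !isPrintableAscii; // nonAsciiPrintableOnly
}

function sketchEncode(
  text: string,
  mode: SketchMode = "specialChars",
  numeric: "decimal" | "hexadecimal" = "decimal"
): string {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    let code = text.charCodeAt(i);
    let raw = text.charAt(i);
    // Combine a surrogate pair into one code point so it yields one reference.
    if (code >= 0xd800 && code <= 0xdbff && i + 1 < text.length) {
      const low = text.charCodeAt(i + 1);
      if (low >= 0xdc00 && low <= 0xdfff) {
        code = (code - 0xd800) * 0x400 + (low - 0xdc00) + 0x10000;
        raw += text.charAt(i + 1);
        i++;
      }
    }
    const named = XML_NAMES[raw];
    if (!isSelected(code, mode)) {
      out += raw;
    } else if (named) {
      out += named;
    } else if (numeric === "hexadecimal") {
      out += "&#x" + code.toString(16) + ";";
    } else {
      out += "&#" + code + ";";
    }
  }
  return out;
}

const printableOnly = sketchEncode("< ©", "nonAsciiPrintableOnly");
// printableOnly: '< &#169;' (special characters are left intact in this mode)
```

The sketch only mirrors `level: 'xml'` output; with the default level, the real `encode()` prefers a named reference such as `&copy;` wherever one exists.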

HTML Entity Decoding

Decodes HTML entities back to their original characters. Supports different decoding scopes to match browser parsing behavior in different contexts.

/**
 * Decodes all entities in the text
 * @param text - Text containing HTML entities to decode
 * @param options - Decoding configuration options
 * @returns Decoded text with entities converted to characters
 */
function decode(
  text: string | undefined | null,
  options?: DecodeOptions
): string;

interface DecodeOptions {
  /** Entity level/standard to recognize */
  level?: Level;
  /** Decoding scope - affects handling of entities without semicolons */
  scope?: DecodeScope;
}

type DecodeScope = 
  | 'body'      // Browser behavior in tag bodies (entities without semicolon replaced)
  | 'attribute' // Browser behavior in attributes (entities without semicolon when not followed by =)
  | 'strict';   // Only entities with semicolons

Usage Examples:

import { decode } from "html-entities";

// Basic decoding
const basic = decode('&lt;script&gt;alert(&quot;XSS&quot;)&lt;/script&gt;');
// Result: '<script>alert("XSS")</script>'

// Mixed named and numeric entities
const mixed = decode('&copy; &#169; &#xa9; &#8710;');
// Result: '© © © ∆'

// Strict decoding - only entities with semicolons
const strict = decode('&lt &gt;', { scope: 'strict' });
// Result: '&lt >' (only '&gt;' ends with a semicolon, so '&lt' is left unchanged)

// Body scope - entities without semicolons decoded
const body = decode('&lt &gt');
// Result: '< >' (decoded despite missing semicolons)

// XML level - only XML entities recognized
const xmlLevel = decode('&copy; &lt; &amp;', { level: 'xml' });
// Result: '&copy; < &' (copyright not decoded in XML)
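
The three decoding scopes differ only in how they treat entities that lack a terminating semicolon. The following sketch models that rule for a four-entry table. It is an illustration under assumptions rather than the library's code: `SketchScope`, `NAMED`, `ENTITY_PATTERN`, and `sketchDecode` are hypothetical names, and the real entity tables are far larger.

```typescript
type SketchScope = "body" | "attribute" | "strict";

// Illustrative subset of named references.
const NAMED: Record<string, string> = {
  lt: "<",
  gt: ">",
  amp: "&",
  copy: "\u00a9",
};

// Group 1: entity name. Group 2: optional terminator (";" or "=").
const ENTITY_PATTERN = /&(lt|gt|amp|copy)([;=]?)/g;

function sketchDecode(text: string, scope: SketchScope = "body"): string {
  return text.replace(
    ENTITY_PATTERN,
    (match: string, entityName: string, terminator: string) => {
      const decoded = NAMED[entityName] || match;
      if (terminator === ";") {
        return decoded; // properly terminated: decoded in every scope
      }
      if (scope === "strict") {
        return match; // strict ignores entities without a semicolon
      }
      if (terminator === "=") {
        // attribute scope leaves "&name=" alone; body scope decodes the name
        return scope === "attribute" ? match : decoded + "=";
      }
      return decoded; // body or attribute, no semicolon, not followed by "="
    }
  );
}

const queryInAttribute = sketchDecode("?a=1&copy=2", "attribute");
// queryInAttribute: '?a=1&copy=2' (left intact because "=" follows the name)
```

The attribute rule exists so that query strings inside attribute values, such as `href="?a=1&copy=2"`, are not corrupted by entity decoding.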

Single Entity Decoding

Decodes individual HTML entities, useful for processing single entities or building custom decoders.

/**
 * Decodes a single entity
 * @param entity - Single HTML entity to decode (e.g., '&lt;', '&#169;')
 * @param options - Decoding configuration options
 * @returns Decoded character or original entity if unknown
 */
function decodeEntity(
  entity: string | undefined | null,
  options?: CommonOptions
): string;

interface CommonOptions {
  /** Entity level/standard to use for recognition */
  level?: Level;
}

Usage Examples:

import { decodeEntity } from "html-entities";

// Named entity decoding
const named = decodeEntity('&lt;');
// Result: '<'

// Numeric entity decoding
const numeric = decodeEntity('&#169;');
// Result: '©'

// Hexadecimal entity decoding  
const hex = decodeEntity('&#xa9;');
// Result: '©'

// Unknown entity (left unchanged)
const unknown = decodeEntity('&unknownentity;');
// Result: '&unknownentity;'

// Level-specific decoding
const xmlOnly = decodeEntity('&copy;', { level: 'xml' });
// Result: '&copy;' (unchanged - not an XML entity)

const htmlDecoded = decodeEntity('&copy;', { level: 'html5' });
// Result: '©'
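
As one possible shape for a "custom decoder", the sketch below decodes every semicolon-terminated entity except those on a keep-list, delegating each match to a single-entity decoder. All names here (`EntityDecoder`, `decodeExcept`, `stubDecodeEntity`, `STUB_TABLE`) are hypothetical. The stub stands in for `decodeEntity` so the example runs without the package installed; in real use, `decodeEntity` from "html-entities" could presumably be passed in its place.

```typescript
// Signature of a single-entity decoder, e.g. decodeEntity from "html-entities".
type EntityDecoder = (entity: string) => string;

// Matches named, decimal, and hexadecimal entities that end with a semicolon.
const ANY_ENTITY = /&(?:#\d+|#[xX][\da-fA-F]+|[0-9a-zA-Z]+);/g;

// Decode everything except the entities listed in keepEncoded.
function decodeExcept(
  text: string,
  decodeOne: EntityDecoder,
  keepEncoded: string[]
): string {
  return text.replace(ANY_ENTITY, (entity) =>
    keepEncoded.indexOf(entity) !== -1 ? entity : decodeOne(entity)
  );
}

// Stand-in decoder with a tiny table; unknown entities are returned unchanged.
const STUB_TABLE: Record<string, string> = {
  "&lt;": "<",
  "&gt;": ">",
  "&amp;": "&",
  "&copy;": "\u00a9",
};
function stubDecodeEntity(entity: string): string {
  return STUB_TABLE[entity] || entity;
}

const markupSafe = decodeExcept(
  "&lt;b&gt; &copy; &amp; &unknown;",
  stubDecodeEntity,
  ["&lt;", "&gt;"]
);
// markupSafe: '&lt;b&gt; © & &unknown;'
```

Keeping `&lt;` and `&gt;` encoded while decoding the rest is one case where per-entity control is handy, since the output cannot introduce new tags.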

Types

// Main configuration types
type Level = 'xml' | 'html4' | 'html5' | 'all';

type EncodeMode = 
  | 'specialChars'        // HTML special characters only: <>&"'
  | 'nonAscii'           // Special chars + non-ASCII characters
  | 'nonAsciiPrintable'  // Special chars + non-ASCII + non-printable ASCII  
  | 'nonAsciiPrintableOnly' // Everything outside printable ASCII (preserves HTML special chars)
  | 'extensive';         // All non-printable, non-ASCII, and characters with named references

type DecodeScope = 
  | 'strict'    // Only entities ending with semicolon
  | 'body'      // Browser body parsing (loose semicolon handling)
  | 'attribute'; // Browser attribute parsing (semicolon handling with = check)

// Options interfaces
interface EncodeOptions {
  mode?: EncodeMode;                    // Default: 'specialChars'
  numeric?: 'decimal' | 'hexadecimal'; // Default: 'decimal'  
  level?: Level;                       // Default: 'all'
}

interface DecodeOptions {
  level?: Level;        // Default: 'all'
  scope?: DecodeScope;  // Default: 'body' (or 'strict' for XML level)
}

interface CommonOptions {
  level?: Level;        // Default: 'all'
}
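
The defaults noted in the comments above interact: the default `scope` depends on the chosen `level`. A minimal sketch of that resolution follows, with the types redeclared so it is self-contained. `resolveDecodeOptions` is a hypothetical helper, not part of the package's API.

```typescript
type Level = "xml" | "html4" | "html5" | "all";
type DecodeScope = "strict" | "body" | "attribute";

interface DecodeOptions {
  level?: Level;
  scope?: DecodeScope;
}

// Fill in defaults as documented: level 'all'; scope 'body', or 'strict' for XML.
function resolveDecodeOptions(
  options: DecodeOptions = {}
): Required<DecodeOptions> {
  const level: Level = options.level || "all";
  const scope: DecodeScope =
    options.scope || (level === "xml" ? "strict" : "body");
  return { level: level, scope: scope };
}

const xmlDefaults = resolveDecodeOptions({ level: "xml" });
// xmlDefaults: { level: 'xml', scope: 'strict' }
```

An explicit `scope` always wins, so `{ level: 'xml', scope: 'body' }` resolves to the body scope even at the XML level.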

Error Handling

  • Null/undefined inputs: Return empty string
  • Unknown entities: Left unchanged during decoding
  • Out-of-range numeric entities: Values >= 0x10FFFF decode to the Unicode replacement character (�)
  • Surrogate pairs: Code points above 0xFFFF (65535) decode to the matching surrogate pair
  • Malformed entities: Invalid syntax left unchanged
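
The numeric rules listed above can be sketched as a small standalone function. This follows the behavior as documented here and is not the library's implementation: `sketchDecodeNumeric` and `REPLACEMENT_CHAR` are hypothetical names, and only numeric entities are handled.

```typescript
const REPLACEMENT_CHAR = "\ufffd";

// Decode one numeric entity such as '&#169;' or '&#xa9;'.
function sketchDecodeNumeric(entity: string): string {
  const m = /^&#(?:[xX]([\da-fA-F]+)|(\d+));$/.exec(entity);
  if (!m) {
    return entity; // malformed syntax is left unchanged
  }
  const hexDigits = m[1];
  const decDigits = m[2];
  const code = hexDigits
    ? parseInt(hexDigits, 16)
    : parseInt(decDigits || "", 10);
  if (code >= 0x10ffff) {
    return REPLACEMENT_CHAR; // out of bounds, per the rule above
  }
  if (code > 0xffff) {
    // Split the code point into a UTF-16 surrogate pair.
    const offset = code - 0x10000;
    return String.fromCharCode(
      0xd800 + (offset >> 10),
      0xdc00 + (offset & 0x3ff)
    );
  }
  return String.fromCharCode(code);
}

const emoji = sketchDecodeNumeric("&#128512;");
// emoji: '\ud83d\ude00' (U+1F600, two UTF-16 code units)
```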