tessl/npm-cheerio

The fast, flexible & elegant library for parsing and manipulating HTML and XML.

Cheerio

Cheerio is a fast, flexible, and elegant library that brings core jQuery functionality to the server. It provides a familiar jQuery-style API with CSS selector support, enabling developers to traverse, manipulate, and extract data from HTML/XML documents without the overhead of a full browser environment.

Package Information

  • Package Name: cheerio
  • Package Type: npm
  • Language: TypeScript
  • Installation: npm install cheerio

Core Imports

import * as cheerio from "cheerio";

For individual functions:

import { load, contains, merge } from "cheerio";

CommonJS:

const cheerio = require("cheerio");

Alternative slim import (always parses with htmlparser2 and does not load parse5, which saves memory):

import { load } from "cheerio/slim";

Basic Usage

import * as cheerio from "cheerio";

// Load HTML document
const $ = cheerio.load('<ul id="fruits"><li class="apple">Apple</li><li class="orange">Orange</li></ul>');

// Query elements
console.log($('.apple').text()); // "Apple"
console.log($('#fruits li').length); // 2

// Manipulate DOM
$('.apple').addClass('selected');
$('ul').append('<li class="pear">Pear</li>');

// Get modified HTML
console.log($.html());

Architecture

Cheerio is built around several key components:

  • Document Loading: Multiple ways to create documents from strings, buffers, URLs, or streams
  • jQuery-Compatible API: Familiar methods for selection, traversal, and manipulation
  • Parser Flexibility: Choice between htmlparser2 (fast) and parse5 (spec-compliant)
  • Server Optimization: No browser dependencies, optimized for server-side HTML processing
  • Type Safety: Full TypeScript support with comprehensive type definitions
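As a rough illustration of the parser choice listed above, the sketch below loads one snippet in the default HTML mode and another in XML mode through the `xml` option. The `<feed>`/`<Entry>` markup is a made-up sample, not something from the Cheerio documentation.

```typescript
import * as cheerio from "cheerio";

// Default mode: HTML parsing, which places the input inside html/head/body.
const $html = cheerio.load("<p>Hello</p>");
const hasBody = $html("body").length === 1;

// XML mode: htmlparser2 is used, tag-name case is preserved and
// self-closing tags are recognized. The feed markup is hypothetical.
const $xml = cheerio.load('<feed><Entry id="1"/><Entry id="2"/></feed>', {
  xml: true,
});
const entryCount = $xml("Entry").length;
const secondId = $xml("Entry").last().attr("id");

console.log(hasBody, entryCount, secondId);
```

XML mode is usually the safer choice for feeds and other non-HTML documents, since HTML parsing applies browser-style error recovery to the input.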

Capabilities

Document Loading

Core functionality for loading HTML/XML documents from various sources including strings, buffers, URLs, and streams.

function load(
  content: string | AnyNode | AnyNode[] | Buffer,
  options?: CheerioOptions | null,
  isDocument?: boolean
): CheerioAPI;

function loadBuffer(
  buffer: Buffer,
  options?: DecodeStreamOptions
): CheerioAPI;

function fromURL(
  url: string | URL,
  options?: CheerioRequestOptions
): Promise<CheerioAPI>;
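A minimal sketch of how these loaders might be used, assuming a Node.js environment where `Buffer` is available. It contrasts the default document mode of `load` with fragment mode (`isDocument` set to `false`). `fromURL` is left out because it needs network access.

```typescript
import * as cheerio from "cheerio";

const snippet = "<p>Hello</p>";

// Default: the input is treated as a full document, so html/head/body are added.
const asDocument = cheerio.load(snippet).html();

// isDocument = false: the input is kept as a fragment.
const asFragment = cheerio.load(snippet, null, false).html();

// loadBuffer decodes the bytes before parsing them.
const $fromBuffer = cheerio.loadBuffer(Buffer.from("<title>Hi</title>", "utf8"));
const title = $fromBuffer("title").text();

console.log(asDocument);
console.log(asFragment);
console.log(title);
```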

DOM Traversal

Comprehensive DOM traversal methods for navigating and filtering elements, following jQuery conventions.

interface CheerioAPI {
  // Core selector function
  <T extends AnyNode, S extends string>(
    selector?: S | BasicAcceptedElems<T>,
    context?: BasicAcceptedElems<AnyNode> | null,
    root?: BasicAcceptedElems<Document>
  ): Cheerio<S extends SelectorType ? Element : T>;
}

// Key traversal methods
find(selector?: string | Cheerio<Element> | Element): Cheerio<Element>;
parent(selector?: AcceptedFilters<Element>): Cheerio<Element>;
children(selector?: AcceptedFilters<Element>): Cheerio<Element>;
siblings(selector?: AcceptedFilters<Element>): Cheerio<Element>;
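The traversal methods listed above could be exercised as follows. This is a small sketch that reuses the fruit list from Basic Usage with an extra `pear` item added for illustration.

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load(
  '<ul id="fruits">' +
    '<li class="apple">Apple</li>' +
    '<li class="orange">Orange</li>' +
    '<li class="pear">Pear</li>' +
    "</ul>"
);

// find: descendants matching a selector
const itemCount = $("#fruits").find("li").length;

// parent: the direct parent of each matched element
const parentId = $(".apple").parent().attr("id");

// children: direct children, optionally filtered by a selector
const orangeText = $("#fruits").children(".orange").text();

// siblings: all other children of the same parent
const siblingCount = $(".orange").siblings().length;
const pearText = $(".orange").siblings(".pear").text();

console.log(itemCount, parentId, orangeText, siblingCount, pearText);
```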

DOM Manipulation

Methods for modifying the DOM structure including adding, removing, and replacing elements.

// Content manipulation
append(...elems: BasicAcceptedElems<AnyNode>[]): Cheerio<T>;
prepend(...elems: BasicAcceptedElems<AnyNode>[]): Cheerio<T>;
html(str?: string | Cheerio<AnyNode>): Cheerio<T> | string | null;
text(str?: string | ((this: AnyNode, i: number, text: string) => string)): Cheerio<T> | string;

// Structure manipulation
wrap(wrapper: AcceptedElems<AnyNode>): Cheerio<T>;
remove(selector?: string): Cheerio<T>;
clone(): Cheerio<T>;
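A short sketch of the manipulation methods in combination, using fragment mode so the serialized output stays small. The list markup is invented for the example.

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load('<ul><li class="a">A</li></ul>', null, false);

$("ul").append('<li class="b">B</li>'); // add as last child
$("ul").prepend('<li class="z">Z</li>'); // add as first child
$(".a").remove(); // detach the original item
$(".b").text("Bee"); // setter form of text()

// clone() produces an independent copy; changing it leaves the original intact.
const copy = $("ul").clone();
copy.find("li").remove();

const result = $.html();
const listText = $("ul").text();
const copyItems = copy.children().length;
const originalItems = $("ul").children().length;

console.log(result, listText, copyItems, originalItems);
```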

Attributes and Properties

Methods for working with element attributes, properties, classes, and data attributes.

// Attributes and properties
attr(name?: string | Record<string, string | null>, value?: string | null | ((this: Element, i: number, attrib: string) => string | null)): any;
prop(name?: string | Record<string, any>, value?: any): any;
data(name?: string | Record<string, unknown>, value?: unknown): any;

// CSS classes
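addClass(value?: string | ((this: Element, i: number, className: string) => string | undefined)): Cheerio<T>;
removeClass(name?: string | ((this: Element, i: number, className: string) => string | undefined)): Cheerio<T>;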
hasClass(className: string): boolean;
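The getter and setter forms of these methods might look like this in practice. The link markup and the `kind` data key are illustrative only.

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load('<a id="link" class="nav" href="/docs">Docs</a>', null, false);
const link = $("#link");

// attr: one argument reads, two arguments write
const href = link.attr("href");
link.attr("target", "_blank");
const target = link.attr("target");

// data: store and read a value associated with the element
link.data("kind", "guide");
const kind = link.data("kind");

// Class helpers return the selection, so they can be chained
link.addClass("active").removeClass("nav");
const classes = link.attr("class");
const isActive = link.hasClass("active");

console.log(href, target, kind, classes, isActive);
```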

CSS Styling

Methods for reading and modifying CSS styles and properties of elements.

// CSS property access and modification
css(name?: string | string[]): string | Record<string, string> | undefined;
css(prop: string, val: string | ((this: Element, i: number, style: string) => string | undefined)): Cheerio<T>;
css(properties: Record<string, string>): Cheerio<T>;

Note: CSS methods are documented in the Attributes and Properties section.
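The three `css` overloads can be sketched as below. The example reads and writes the inline `style` attribute of a sample paragraph. No stylesheet is involved, since Cheerio does not compute styles.

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load('<p style="color: red; margin: 0">Hi</p>', null, false);
const p = $("p");

// Read a single property from the inline style
const color = p.css("color");

// Set one property, then several at once
p.css("font-size", "12px");
p.css({ margin: "4px", "font-weight": "bold" });

// Read several properties as an object
const picked = p.css(["color", "font-size", "margin"]);

console.log(color, picked);
```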

Form Handling

Specialized methods for working with form elements and extracting form data.

// Form data extraction
serialize(): string;
serializeArray(): { name: string; value: string }[];

// Form element values
val(value?: string | string[]): string | undefined | string[] | Cheerio<T>;
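A brief sketch of the form helpers applied to a small, made-up form with two text inputs:

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load(
  '<form><input name="user" value="ada"><input name="lang" value="ts"></form>'
);

// serialize: URL-encoded query string of the form's fields
const query = $("form").serialize();

// serializeArray: the same fields as name/value pairs
const fields = $("form").serializeArray();

// val: one call reads, a call with an argument writes
const userInput = $('input[name="user"]');
const before = userInput.val();
userInput.val("grace");
const after = userInput.val();

console.log(query, fields, before, after);
```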

Static Utilities

Static methods for rendering, parsing, and working with DOM nodes without a Cheerio instance.

// Rendering functions
function html(dom?: BasicAcceptedElems<AnyNode>, options?: CheerioOptions): string;
function text(elements?: ArrayLike<AnyNode>): string;
function xml(dom?: BasicAcceptedElems<AnyNode>): string;

// Utility functions
function parseHTML(data?: string | null, context?: unknown, keepScripts?: boolean): AnyNode[] | null;
function contains(container: AnyNode, contained: AnyNode): boolean;
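The sketch below calls these utilities as static methods on a loaded `$` instance, which is how the `StaticType` interface under Core Types exposes them. The markup is a sample made up for the example.

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load('<div id="box"><span>in</span></div><p>out</p>', null, false);

// Render a specific selection instead of the whole document
const spanHtml = $.html($("span"));

// root() wraps the document node; its children are the top-level nodes
const topLevelCount = $.root().children().length;

// contains() checks whether one node is a descendant of another
const box = $("#box")[0];
const inside = $.contains(box, $("span")[0]);
const outside = $.contains(box, $("p")[0]);

// parseHTML() turns a string into an array of detached nodes
const parsed = $.parseHTML("<b>x</b><i>y</i>");
const parsedCount = parsed ? parsed.length : 0;

console.log(spanHtml, topLevelCount, inside, outside, parsedCount);
```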

Utility Functions

Standalone utility functions for common operations and type checking.

// Type checking and DOM utilities
function isCheerio<T>(maybeCheerio: unknown): maybeCheerio is Cheerio<T>;
function domEach<T extends AnyNode, Arr extends ArrayLike<T> = Cheerio<T>>(
  array: Arr,
  fn: (elem: T, index: number) => void
): Arr;

// String manipulation utilities
function camelCase(str: string): string;
function cssCase(str: string): string;
function isHtml(str: string): boolean;

Core Types

// Core DOM Node Types (from domhandler/htmlparser2)
interface AnyNode {
  type: string;
  parent?: ParentNode | null;
  prev?: AnyNode | null;
  next?: AnyNode | null;
}

interface Element extends AnyNode {
  type: 'tag';
  tagName: string;
  attribs: Record<string, string>;
  children: AnyNode[];
  startIndex?: number;
  endIndex?: number;
}

interface Document extends AnyNode {
  type: 'root';
  children: AnyNode[];
}

interface Text extends AnyNode {
  type: 'text';
  data: string;
}

interface Comment extends AnyNode {
  type: 'comment';
  data: string;
}

type ParentNode = Element | Document;

// CSS Selector Types
type SelectorType = string; // CSS selector patterns like '.class', '#id', 'tag'

// Static Methods Type (utility methods available on CheerioAPI)
interface StaticType {
  html(dom?: BasicAcceptedElems<AnyNode>, options?: CheerioOptions): string;
  xml(dom?: BasicAcceptedElems<AnyNode>): string;
  text(elements?: ArrayLike<AnyNode>): string;
  parseHTML(data?: string | null, context?: unknown, keepScripts?: boolean): AnyNode[] | null;
  root(): Cheerio<Document>;
  contains(container: AnyNode, contained: AnyNode): boolean;
  extract<M extends ExtractMap>(map: M): ExtractedMap<M>;
  merge<T>(arr1: ArrayLike<T>, arr2: ArrayLike<T>): ArrayLike<T> | undefined;
}

// Main API interface
interface CheerioAPI extends StaticType {
  <T extends AnyNode, S extends string>(
    selector?: S | BasicAcceptedElems<T>,
    context?: BasicAcceptedElems<AnyNode> | null,
    root?: BasicAcceptedElems<Document>
  ): Cheerio<S extends SelectorType ? Element : T>;
  
  _root: Document;
  _options: InternalOptions;
  fn: typeof Cheerio.prototype;
}

// Element wrapper class
abstract class Cheerio<T> implements ArrayLike<T> {
  length: number;
  [index: number]: T;
  options: InternalOptions;
}

// Internal configuration (extends CheerioOptions with internal flags)
interface InternalOptions extends CheerioOptions {
  _useHtmlParser2?: boolean;
  [key: string]: any; // Additional parser-specific options
}

// Configuration options
interface CheerioOptions {
  xml?: HTMLParser2Options | boolean;
  xmlMode?: boolean;
  baseURI?: string | URL;
  quirksMode?: boolean;
  pseudos?: Record<string, string | Function>;
}

// HTML Parser options
interface HTMLParser2Options {
  xmlMode?: boolean;
  decodeEntities?: boolean;
  lowerCaseAttributeNames?: boolean;
  recognizeSelfClosing?: boolean;
  recognizeCDATA?: boolean;
  [key: string]: any;
}

// Element types for manipulation methods
type BasicAcceptedElems<T extends AnyNode> = ArrayLike<T> | T | string;
type AcceptedElems<T extends AnyNode> = BasicAcceptedElems<T> | ((this: T, i: number, el: T) => BasicAcceptedElems<T>);
type AcceptedFilters<T> = string | FilterFunction<T> | T | Cheerio<T>;
type FilterFunction<T> = (this: T, i: number, el: T) => boolean;

// External library types (from dependencies)
type UndiciStreamOptions = {
  method?: string;
  headers?: Record<string, string>;
  body?: any;
  // Additional options from undici RequestOptions
  [key: string]: any;
};

type SnifferOptions = {
  defaultEncoding?: string;
  transportLayerEncodingLabel?: string;
  // Additional options from encoding-sniffer
  [key: string]: any;
};

// Node.js stream type
interface Writable {
  write(chunk: any, encoding?: string, callback?: Function): boolean;
  end(chunk?: any, encoding?: string, callback?: Function): void;
  pipe<T extends NodeJS.WritableStream>(destination: T): T;
}

// Data extraction types
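interface ExtractMap {
  // A selector string, a descriptor, or a one-element array to collect all matches
  [key: string]: string | ExtractConfig | [string | ExtractConfig];
}

interface ExtractConfig {
  selector: string;
  // Property name to read (defaults to 'textContent'), a callback, or a nested map
  value?: string | ((element: Element, key: string, obj: Record<string, unknown>) => unknown) | ExtractMap;
}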

type ExtractedMap<M extends ExtractMap> = {
  [K in keyof M]: M[K] extends string 
    ? string | string[]
    : M[K] extends ExtractConfig
    ? any
    : never;
};
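To make the extraction types more concrete, here is a minimal sketch of `$.extract`. The call shape follows Cheerio's published extract guide as I understand it: a plain string selects the first match and reads its text, a descriptor with a string `value` reads that property, and wrapping a descriptor in an array collects every match. The page markup and key names are hypothetical.

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load(
  '<h1>Fruit</h1><a href="/apple">Apple</a><a href="/pear">Pear</a>'
);

const data = $.extract({
  // String form: text content of the first match
  title: "h1",
  // Descriptor form: read the href of the first match
  firstLink: { selector: "a", value: "href" },
  // Array form: read the href of every match
  links: [{ selector: "a", value: "href" }],
});

const serialized = JSON.stringify(data);
console.log(serialized);
```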

Install with Tessl CLI

npx tessl i tessl/npm-cheerio
Describes
npmpkg:npm/cheerio@1.1.x