tessl/npm-cheerio

The fast, flexible & elegant library for parsing and manipulating HTML and XML.

Workspace: tessl
Visibility: Public
Describes: npmpkg:npm/cheerio@1.1.x

To install, run

npx @tessl/cli install tessl/npm-cheerio@1.1.0

Cheerio

Cheerio is a fast, flexible, and elegant library that brings core jQuery functionality to the server. It provides a familiar jQuery-style API with CSS selector support, enabling developers to traverse, manipulate, and extract data from HTML/XML documents without the overhead of a full browser environment.

Package Information

  • Package Name: cheerio
  • Package Type: npm
  • Language: TypeScript
  • Installation: npm install cheerio

Core Imports

import * as cheerio from "cheerio";

For individual functions:

import { load, html, text } from "cheerio";

CommonJS:

const cheerio = require("cheerio");

Slim import (uses only htmlparser2 and leaves out parse5, which reduces memory use):

import { load } from "cheerio/slim";

Basic Usage

import * as cheerio from "cheerio";

// Load HTML document
const $ = cheerio.load('<ul id="fruits"><li class="apple">Apple</li><li class="orange">Orange</li></ul>');

// Query elements
console.log($('.apple').text()); // "Apple"
console.log($('#fruits li').length); // 2

// Manipulate DOM
$('.apple').addClass('selected');
$('ul').append('<li class="pear">Pear</li>');

// Get modified HTML
console.log($.html());

Architecture

Cheerio is built around several key components:

  • Document Loading: Multiple ways to create documents from strings, buffers, URLs, or streams
  • jQuery-Compatible API: Familiar methods for selection, traversal, and manipulation
  • Parser Flexibility: Choice between htmlparser2 (fast) and parse5 (spec-compliant)
  • Server Optimization: No browser dependencies, optimized for server-side HTML processing
  • Type Safety: Full TypeScript support with comprehensive type definitions
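
As a rough illustration of the parser choice listed above: a plain `load` call parses HTML with parse5, while passing `xml: true` (the `xml` option listed under `CheerioOptions`) hands parsing to htmlparser2. The sample markup and variable names are made up for this sketch.

```typescript
import * as cheerio from "cheerio";

// Default: parse5 builds a full, spec-compliant document around the input,
// so <html>, <head>, and <body> are synthesized.
const $html = cheerio.load("<p>Hello</p>");
const hasBody = $html("body").length === 1;

// xml: true switches to htmlparser2 in XML mode; no HTML document
// structure is added around the input.
const $xml = cheerio.load("<feed><entry>One</entry></feed>", { xml: true });
const entryText = $xml("entry").text();
const bodyCount = $xml("body").length;

console.log(hasBody, entryText, bodyCount);
```

The spec-compliant parser is usually the safer default for real-world HTML; the XML mode is the one to reach for with feeds, sitemaps, and similar documents.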

Capabilities

Document Loading

Core functionality for loading HTML/XML documents from various sources including strings, buffers, URLs, and streams.

function load(
  content: string | AnyNode | AnyNode[] | Buffer,
  options?: CheerioOptions | null,
  isDocument?: boolean
): CheerioAPI;

function loadBuffer(
  buffer: Buffer,
  options?: DecodeStreamOptions
): CheerioAPI;

function fromURL(
  url: string | URL,
  options?: CheerioRequestOptions
): Promise<CheerioAPI>;
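
A minimal sketch of how the `isDocument` flag and `loadBuffer` behave, assuming a Node.js runtime (for `Buffer`). The markup is illustrative, and `fromURL` is left out because it needs network access.

```typescript
import * as cheerio from "cheerio";

// Default (isDocument = true): the markup is parsed as a whole document.
const $doc = cheerio.load("<p>hi</p>");
const asDocument = $doc.html();

// isDocument = false: the markup is parsed as a fragment, with no wrapper.
const $frag = cheerio.load("<p>hi</p>", null, false);
const asFragment = $frag.html();

// loadBuffer decodes the bytes first, then parses. ASCII-only sample,
// so the result does not depend on encoding detection.
const $buf = cheerio.loadBuffer(Buffer.from("<title>Hello</title>"));
const title = $buf("title").text();

console.log(asDocument); // <html><head></head><body><p>hi</p></body></html>
console.log(asFragment); // <p>hi</p>
console.log(title);      // Hello
```

Fragment mode is convenient when the output should be spliced into an existing page rather than served as a standalone document.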

Full details: Document Loading (docs/loading.md)

DOM Traversal

Comprehensive DOM traversal methods for navigating and filtering elements, following jQuery conventions.

interface CheerioAPI {
  // Core selector function
  <T extends AnyNode, S extends string>(
    selector?: S | BasicAcceptedElems<T>,
    context?: BasicAcceptedElems<AnyNode> | null,
    root?: BasicAcceptedElems<Document>
  ): Cheerio<S extends SelectorType ? Element : T>;
}

// Key traversal methods
find(selector?: string | Cheerio<Element> | Element): Cheerio<Element>;
parent(selector?: AcceptedFilters<Element>): Cheerio<Element>;
children(selector?: AcceptedFilters<Element>): Cheerio<Element>;
siblings(selector?: AcceptedFilters<Element>): Cheerio<Element>;
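
The four traversal methods above can be tried on a small list like the one from Basic Usage. This is only a sketch; the extra `pear` item is added here for illustration.

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load(
  '<ul id="fruits">' +
    '<li class="apple">Apple</li>' +
    '<li class="orange">Orange</li>' +
    '<li class="pear">Pear</li>' +
  "</ul>"
);

// find(): all matching descendants of the selection.
const itemCount = $("#fruits").find("li").length;

// parent(): the direct parent of each selected element.
const parentId = $(".apple").parent().attr("id");

// children(): direct children, optionally filtered by a selector.
const orangeText = $("#fruits").children(".orange").text();

// siblings(): every other child of the same parent, in document order.
const siblingClasses = $(".orange")
  .siblings()
  .map((_, el) => $(el).attr("class"))
  .get();

console.log(itemCount, parentId, orangeText, siblingClasses);
// 3 fruits Orange [ 'apple', 'pear' ]
```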

Full details: DOM Traversal (docs/traversal.md)

DOM Manipulation

Methods for modifying the DOM structure including adding, removing, and replacing elements.

// Content manipulation
append(...elems: BasicAcceptedElems<AnyNode>[]): Cheerio<T>;
prepend(...elems: BasicAcceptedElems<AnyNode>[]): Cheerio<T>;
html(str?: string | Cheerio<AnyNode>): Cheerio<T> | string | null;
text(str?: string | ((this: AnyNode, i: number, text: string) => string)): Cheerio<T> | string;

// Structure manipulation
wrap(wrapper: AcceptedElems<AnyNode>): Cheerio<T>;
remove(selector?: string): Cheerio<T>;
clone(): Cheerio<T>;
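
A short sketch of the manipulation methods listed above, run against a fragment so the serialized output stays small. The markup and class names are invented for the example.

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load('<p><span class="greeting">hi</span></p>', null, false);

// clone() returns a detached deep copy; editing it leaves the document alone.
const copy = $(".greeting").clone();
copy.text("copy");

// Setters mutate the loaded document in place.
$(".greeting").text("hello");
$(".greeting").wrap("<em></em>");
$("p").append("<b>!</b>");

const beforeRemove = $.html();

// remove() detaches the selected elements from the document.
$("b").remove();
const afterRemove = $.html();

console.log(beforeRemove); // <p><em><span class="greeting">hello</span></em><b>!</b></p>
console.log(afterRemove);  // <p><em><span class="greeting">hello</span></em></p>
console.log(copy.text());  // copy
```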

Full details: DOM Manipulation (docs/manipulation.md)

Attributes and Properties

Methods for working with element attributes, properties, classes, and data attributes.

// Attributes and properties
attr(name?: string | Record<string, string | null>, value?: string | null | ((this: Element, i: number, attrib: string) => string | null)): any;
prop(name?: string | Record<string, any>, value?: any): any;
data(name?: string | Record<string, unknown>, value?: unknown): any;

// CSS classes
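addClass(value?: string | ((this: Element, i: number, className: string) => string | undefined)): Cheerio<T>;
removeClass(name?: string | ((this: Element, i: number, className: string) => string | undefined)): Cheerio<T>;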
hasClass(className: string): boolean;
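
The getter and setter forms above can be sketched on a single link element. The element, its attributes, and the class names are hypothetical.

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load(
  '<a id="link" href="/docs" data-page="home" class="nav">Docs</a>',
  null,
  false
);
const link = $("#link");

// One argument reads; two arguments write and return the selection.
const href = link.attr("href");
link.attr("target", "_blank");
const target = link.attr("target");

// data() reads data-* attributes by their name without the prefix.
const page = link.data("page");

// Class helpers are chainable; hasClass() returns a boolean.
link.addClass("active").removeClass("nav");
const isActive = link.hasClass("active");
const classAttr = link.attr("class");

console.log(href, target, page, isActive, classAttr);
// /docs _blank home true active
```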

Full details: Attributes and Properties (docs/attributes.md)

CSS Styling

Methods for reading and modifying CSS styles and properties of elements.

// CSS property access and modification
css(name?: string | string[]): string | Record<string, string> | undefined;
css(prop: string, val: string | ((this: Element, i: number, style: string) => string | undefined)): Cheerio<T>;
css(properties: Record<string, string>): Cheerio<T>;

Note: CSS methods are documented in the Attributes and Properties section.
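
Since there is no browser, `css()` works on the element's inline `style` attribute only; no stylesheet cascade is involved. A small sketch of the three overloads above, with made-up property values:

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load('<div style="color: red"></div>', null, false);
const box = $("div");

// Read a single property from the inline style.
const color = box.css("color");

// Write one property, then several at once via an object.
box.css("margin", "4px");
box.css({ padding: "2px" });

// Read several properties as a plain object.
const picked = box.css(["color", "margin"]);

// The changes are stored back in the style attribute.
const styleAttr = box.attr("style");

console.log(color, picked, styleAttr);
```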

Form Handling

Specialized methods for working with form elements and extracting form data.

// Form data extraction
serialize(): string;
serializeArray(): { name: string; value: string }[];

// Form element values
val(value?: string | string[]): string | undefined | string[] | Cheerio<T>;
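
A minimal sketch of form extraction, assuming a hypothetical form with two text inputs and one unchecked checkbox. Unchecked checkboxes are left out of the serialized data, as in jQuery.

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load(
  "<form>" +
    '<input name="user" value="ada">' +
    '<input name="lang" value="en">' +
    '<input type="checkbox" name="news" value="1">' +
  "</form>"
);

// Name/value pairs for every submittable control.
const fields = $("form").serializeArray();

// The same data as a URL-encoded query string.
const query = $("form").serialize();

// val() with an argument sets the value; without one it reads it.
$('input[name="user"]').val("grace");
const user = $('input[name="user"]').val();

console.log(fields); // [ { name: 'user', value: 'ada' }, { name: 'lang', value: 'en' } ]
console.log(query);  // user=ada&lang=en
console.log(user);   // grace
```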

Full details: Form Handling (docs/forms.md)

Static Utilities

Static methods for rendering, parsing, and working with DOM nodes without a Cheerio instance.

// Rendering functions
function html(dom?: BasicAcceptedElems<AnyNode>, options?: CheerioOptions): string;
function text(elements?: ArrayLike<AnyNode>): string;
function xml(dom?: BasicAcceptedElems<AnyNode>): string;

// Utility functions
function parseHTML(data?: string | null, context?: unknown, keepScripts?: boolean): AnyNode[] | null;
function contains(container: AnyNode, contained: AnyNode): boolean;
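
The sketch below calls these helpers through the `$` returned by `load`, where the same methods are attached (see `StaticType` under Core Types). Calling them that way, rather than as standalone imports, is a choice made for this example; the markup is illustrative.

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load(
  '<ul><li class="apple">Apple</li><li>Pear</li></ul>',
  null,
  false
);

// html(): outer HTML of the given selection.
const appleHtml = $.html($(".apple"));

// text(): concatenated text content of the given nodes.
const liText = $.text($("li"));

// parseHTML(): parse a string into detached nodes without touching the document.
const parsed = $.parseHTML("<b>x</b><i>y</i>");
const parsedCount = parsed ? parsed.length : 0;

console.log(appleHtml);   // <li class="apple">Apple</li>
console.log(liText);      // ApplePear
console.log(parsedCount); // 2
```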

Full details: Static Utilities (docs/static-methods.md)

Utility Functions

Standalone utility functions for common operations and type checking.

// Type checking and DOM utilities
function isCheerio<T>(maybeCheerio: unknown): maybeCheerio is Cheerio<T>;
function domEach<T extends AnyNode, Arr extends ArrayLike<T> = Cheerio<T>>(
  array: Arr,
  fn: (elem: T, index: number) => void
): Arr;

// String manipulation utilities
function camelCase(str: string): string;
function cssCase(str: string): string;
function isHtml(str: string): boolean;

Full details: Utility Functions (docs/utils.md)

Core Types

// Core DOM Node Types (from domhandler/htmlparser2)
interface AnyNode {
  type: string;
  parent?: ParentNode | null;
  prev?: AnyNode | null;
  next?: AnyNode | null;
}

interface Element extends AnyNode {
  type: 'tag';
  tagName: string;
  attribs: Record<string, string>;
  children: AnyNode[];
  startIndex?: number;
  endIndex?: number;
}

interface Document extends AnyNode {
  type: 'root';
  children: AnyNode[];
}

interface Text extends AnyNode {
  type: 'text';
  data: string;
}

interface Comment extends AnyNode {
  type: 'comment';
  data: string;
}

type ParentNode = Element | Document;

// CSS Selector Types
type SelectorType = string; // CSS selector patterns like '.class', '#id', 'tag'

// Static Methods Type (utility methods available on CheerioAPI)
interface StaticType {
  html(dom?: BasicAcceptedElems<AnyNode>, options?: CheerioOptions): string;
  xml(dom?: BasicAcceptedElems<AnyNode>): string;
  text(elements?: ArrayLike<AnyNode>): string;
  parseHTML(data?: string | null, context?: unknown, keepScripts?: boolean): AnyNode[] | null;
  root(): Cheerio<Document>;
  contains(container: AnyNode, contained: AnyNode): boolean;
  extract<M extends ExtractMap>(map: M): ExtractedMap<M>;
  merge<T>(arr1: ArrayLike<T>, arr2: ArrayLike<T>): ArrayLike<T> | undefined;
}

// Main API interface
interface CheerioAPI extends StaticType {
  <T extends AnyNode, S extends string>(
    selector?: S | BasicAcceptedElems<T>,
    context?: BasicAcceptedElems<AnyNode> | null,
    root?: BasicAcceptedElems<Document>
  ): Cheerio<S extends SelectorType ? Element : T>;
  
  _root: Document;
  _options: InternalOptions;
  fn: typeof Cheerio.prototype;
}

// Element wrapper class
abstract class Cheerio<T> implements ArrayLike<T> {
  length: number;
  [index: number]: T;
  options: InternalOptions;
}

// Internal configuration (extends CheerioOptions with internal flags)
interface InternalOptions extends CheerioOptions {
  _useHtmlParser2?: boolean;
  [key: string]: any; // Additional parser-specific options
}

// Configuration options
interface CheerioOptions {
  xml?: HTMLParser2Options | boolean;
  xmlMode?: boolean;
  baseURI?: string | URL;
  quirksMode?: boolean;
  pseudos?: Record<string, string | Function>;
}

// HTML Parser options
interface HTMLParser2Options {
  xmlMode?: boolean;
  decodeEntities?: boolean;
  lowerCaseAttributeNames?: boolean;
  recognizeSelfClosing?: boolean;
  recognizeCDATA?: boolean;
  [key: string]: any;
}

// Element types for manipulation methods
type BasicAcceptedElems<T extends AnyNode> = ArrayLike<T> | T | string;
type AcceptedElems<T extends AnyNode> = BasicAcceptedElems<T> | ((this: T, i: number, el: T) => BasicAcceptedElems<T>);
type AcceptedFilters<T> = string | FilterFunction<T> | T | Cheerio<T>;
type FilterFunction<T> = (this: T, i: number, el: T) => boolean;

// External library types (from dependencies)
type UndiciStreamOptions = {
  method?: string;
  headers?: Record<string, string>;
  body?: any;
  // Additional options from undici RequestOptions
  [key: string]: any;
};

type SnifferOptions = {
  defaultEncoding?: string;
  transportLayerEncodingLabel?: string;
  // Additional options from encoding-sniffer
  [key: string]: any;
};

// Node.js stream type
interface Writable {
  write(chunk: any, encoding?: string, callback?: Function): boolean;
  end(chunk?: any, encoding?: string, callback?: Function): void;
  pipe<T extends NodeJS.WritableStream>(destination: T): T;
}

// Data extraction types
interface ExtractMap {
  [key: string]: string | ExtractConfig;
}

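interface ExtractConfig {
  selector: string;
  // Property name to read from each match (for example 'href'), or a mapping
  // function; when omitted, the element's text content is used
  value?: string | ((element: Element, key: string) => unknown);
}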

type ExtractedMap<M extends ExtractMap> = {
  [K in keyof M]: M[K] extends string 
    ? string | string[]
    : M[K] extends ExtractConfig
    ? any
    : never;
};
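
To connect the node types above to a parsed document, the following sketch lists the `type` discriminant of each child of a paragraph. The markup is invented for the example; `contents()` is used because, unlike `children()`, it also returns text and comment nodes.

```typescript
import * as cheerio from "cheerio";

const $ = cheerio.load("<p>Hello <b>world</b><!-- note --></p>", null, false);

// contents() yields every child node: text, element, and comment.
const kinds = $("p")
  .contents()
  .toArray()
  .map((node) => String(node.type));

// children() yields elements only, so tag-specific fields are available.
const tagNames = $("p")
  .children()
  .toArray()
  .map((el) => el.tagName);

console.log(kinds);    // [ 'text', 'tag', 'comment' ]
console.log(tagNames); // [ 'b' ]
```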