tessl/npm-node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.

Query & Selection

Find and navigate elements in the parsed DOM tree using CSS selectors, tag names, IDs, and ancestor traversal.

Capabilities

CSS Selector Queries

Query elements using standard CSS selector syntax with support for classes, IDs, attributes, and combinators.

/** Find first element matching CSS selector */
querySelector(selector: string): HTMLElement | null;

/** Find all elements matching CSS selector */
querySelectorAll(selector: string): HTMLElement[];

Supported CSS Selectors:

  • Type selectors: div, p, span
  • Class selectors: .class-name, .multiple.classes
  • ID selectors: #element-id
  • Attribute selectors: [attr], [attr=value], [attr*=value]
  • Descendant combinators: div p, .parent .child
  • Child combinators: div > p
  • Adjacent sibling: h1 + p
  • Pseudo-selectors: :first-child, :last-child, :nth-child()

Usage Examples:

import { parse } from "node-html-parser";

const html = `
<div class="container">
  <header id="main-header">
    <h1>Title</h1>
    <nav class="menu">
      <ul>
        <li><a href="/home">Home</a></li>
        <li><a href="/about" class="active">About</a></li>
      </ul>
    </nav>
  </header>
  <main>
    <p class="intro">Introduction text</p>
    <article data-id="123">
      <h2>Article Title</h2>
      <p>Article content</p>
    </article>
  </main>
</div>`;

const root = parse(html);

// Basic selectors
const header = root.querySelector('#main-header');
const intro = root.querySelector('.intro');
const articles = root.querySelectorAll('article');

// Compound selectors  
const activeLink = root.querySelector('a.active');
const navLinks = root.querySelectorAll('nav a');

// Attribute selectors
const dataElement = root.querySelector('[data-id]');
const specificData = root.querySelector('[data-id="123"]');

// Descendant selectors
const headerTitle = root.querySelector('header h1');
const listItems = root.querySelectorAll('ul li');

// Child selectors
const directChildren = root.querySelectorAll('.container > header');

// Multiple results
const allParagraphs = root.querySelectorAll('p');
allParagraphs.forEach(p => console.log(p.text));
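
The supported-selector list also names substring attribute matches, adjacent siblings, and structural pseudo-classes, which the examples above do not exercise. The following is a minimal sketch of those forms. It assumes they behave as they do in standard CSS; the markup and variable names are illustrative only.

```typescript
import { parse } from "node-html-parser";

const doc = parse(`
<header>
  <h1>Title</h1>
  <nav class="menu">
    <ul>
      <li><a href="/home">Home</a></li>
      <li><a href="/about" class="active">About</a></li>
    </ul>
  </nav>
</header>`);

// Substring attribute match: intended to find the link whose href contains "about"
const aboutLink = doc.querySelector('a[href*="about"]');

// Adjacent sibling: intended to find the <nav> that directly follows the <h1>
const navAfterTitle = doc.querySelector("h1 + nav");

// Structural pseudo-classes: first, last, and nth <li> within their parent
const firstItemLink = doc.querySelector("li:first-child a");
const lastItem = doc.querySelector("li:last-child");
const secondItem = doc.querySelector("li:nth-child(2)");

console.log(aboutLink?.text);
console.log(navAfterTitle?.getAttribute("class"));
console.log(firstItemLink?.text);
console.log(lastItem?.text, secondItem?.text);
```

In this markup the last and the second `<li>` are the same element, so the two final queries should agree.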

Element Queries by Tag

Find elements by their HTML tag name with wildcard support.

/** 
 * Find all elements with specified tag name
 * @param tagName - Tag name to search for, or '*' for all elements
 * @returns Array of matching HTMLElements
 */
getElementsByTagName(tagName: string): HTMLElement[];

Usage Examples:

const root = parse(`
<div>
  <p>First paragraph</p>
  <span>Span content</span>
  <p>Second paragraph</p>
</div>
`);

// Find by specific tag
const paragraphs = root.getElementsByTagName('p');
console.log(paragraphs.length); // 2

// Find all elements
const allElements = root.getElementsByTagName('*');
console.log(allElements.length); // 4 (div, p, span, p)

// Tag name matching is case-insensitive
const divs = root.getElementsByTagName('DIV');
console.log(divs.length); // 1
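
Because `'*'` returns every element, it can also be used to summarise a document's structure. The sketch below counts elements per tag name. `countTags` and the `TagSource` interface are hypothetical names introduced here, not part of node-html-parser. The interface describes only the one method the helper needs, and the parsed root is assumed to satisfy it.

```typescript
// Minimal structural view of a queryable node (an assumption for this sketch).
interface TagSource {
  getElementsByTagName(tagName: string): { readonly tagName: string }[];
}

/** Count how many elements of each tag name exist below `scope`. */
function countTags(scope: TagSource): Record<string, number> {
  const counts: Record<string, number> = {};
  // '*' asks for every element, regardless of tag name
  scope.getElementsByTagName("*").forEach(el => {
    counts[el.tagName] = (counts[el.tagName] ?? 0) + 1;
  });
  return counts;
}
```

For the markup in the example above, `countTags(root)` would be expected to report one `DIV`, two `P`, and one `SPAN`, given that `tagName` is reported in upper case.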

Element Query by ID

Find a single element by its ID attribute.

/**
 * Find element with specified ID
 * @param id - ID value to search for
 * @returns HTMLElement with matching ID, or null if not found
 */
getElementById(id: string): HTMLElement | null;

Usage Examples:

const root = parse(`
<div>
  <header id="main-header">Header</header>
  <section id="content">
    <p id="intro">Introduction</p>
  </section>
</div>
`);

const header = root.getElementById('main-header');
const intro = root.getElementById('intro');
const missing = root.getElementById('not-found'); // null

console.log(header?.tagName); // "HEADER"
console.log(intro?.text);     // "Introduction"

Ancestor Traversal

Find the closest ancestor element matching a CSS selector.

/**
 * Traverse up the DOM tree to find closest ancestor matching selector
 * @param selector - CSS selector to match against ancestors
 * @returns Closest matching ancestor HTMLElement, or null if none found
 */
closest(selector: string): HTMLElement | null;

Usage Examples:

const root = parse(`
<article class="post">
  <header>
    <h1>Title</h1>
  </header>
  <div class="content">
    <p>
      <em id="emphasis">emphasized text</em>
    </p>
  </div>
</article>
`);

const emphasis = root.getElementById('emphasis');

// getElementById can return null, so use optional chaining on each call

// Find closest paragraph
const paragraph = emphasis?.closest('p');
console.log(paragraph?.tagName); // "P"

// Find closest article
const article = emphasis?.closest('article');
console.log(article?.classList.contains('post')); // true

// Find closest with class
const content = emphasis?.closest('.content');
console.log(content?.tagName); // "DIV"

// No match
const missing = emphasis?.closest('table'); // null

Advanced Query Patterns

Chaining Queries

Combine multiple query methods for complex element finding:

const root = parse(`
<div class="app">
  <nav class="sidebar">
    <ul class="menu">
      <li><a href="/dashboard" class="active">Dashboard</a></li>
      <li><a href="/settings">Settings</a></li>
    </ul>
  </nav>
</div>
`);

// Chain queries
const activeLink = root
  .querySelector('.sidebar')
  ?.querySelector('.menu')  
  ?.querySelector('a.active');

console.log(activeLink?.text); // "Dashboard"

// Alternative approach
const menu = root.querySelector('.menu');
const activeItem = menu?.querySelector('.active');

Iterating Query Results

Work with multiple elements returned by queries:

const root = parse(`
<table>
  <tr><td>Name</td><td>Age</td></tr>
  <tr><td>Alice</td><td>25</td></tr>
  <tr><td>Bob</td><td>30</td></tr>
</table>
`);

// Process all rows
const rows = root.querySelectorAll('tr');
rows.forEach((row, index) => {
  const cells = row.querySelectorAll('td');
  console.log(`Row ${index}:`, cells.map(cell => cell.text));
});

// Filter results (skip empty cells, since Number('') is 0 rather than NaN)
const dataCells = root.querySelectorAll('td');
const ages = dataCells
  .filter(cell => cell.text.trim() !== '' && !isNaN(Number(cell.text)))
  .map(cell => Number(cell.text));

console.log('Ages:', ages); // [25, 30]
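
Row-by-row iteration is often used to turn a table into structured data. The sketch below treats the first row as column names and builds one record per remaining row. `tableToRecords` and the `QueryNode` interface are hypothetical names for illustration. The interface lists only the members used here, which node-html-parser elements are assumed to provide.

```typescript
// Minimal structural view of an element (an assumption for this sketch).
interface QueryNode {
  readonly text: string;
  querySelectorAll(selector: string): QueryNode[];
}

/** Convert a table whose first row holds column names into one record per data row. */
function tableToRecords(table: QueryNode): Record<string, string>[] {
  const [headerRow, ...dataRows] = table.querySelectorAll("tr");
  if (!headerRow) return []; // no rows at all

  // Header cells may be <th> or <td>
  const keys = headerRow.querySelectorAll("th, td").map(cell => cell.text.trim());

  return dataRows.map(row => {
    const record: Record<string, string> = {};
    row.querySelectorAll("td").forEach((cell, i) => {
      const key = keys[i];
      // Ignore cells that have no matching column name
      if (key !== undefined) record[key] = cell.text.trim();
    });
    return record;
  });
}
```

Called with the parsed table above, this would be expected to produce two records, one for Alice and one for Bob, with all values kept as strings.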

Conditional Queries

Handle cases where elements might not exist:

const root = parse('<div><p>Content</p></div>');

// Safe querying with optional chaining
const text = root.querySelector('section')?.querySelector('p')?.text ?? 'Default';

// Explicit null checks
const section = root.querySelector('section');
if (section) {
  const paragraph = section.querySelector('p');
  if (paragraph) {
    console.log(paragraph.text);
  }
}

// Using || for defaults (unlike ??, this also replaces an empty-string text)
const title = root.querySelector('h1')?.text || 'No title found';
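
When the same null handling recurs, it can be wrapped in small helpers. The sketch below shows one helper that throws on a missing element and one that falls back to a default text. `requireElement`, `textOr`, and the `Queryable` interface are hypothetical names introduced here, not library APIs. The interface assumes only a `querySelector` method with the signature documented above.

```typescript
// Anything exposing querySelector with the documented signature (an assumption for this sketch).
interface Queryable<T> {
  querySelector(selector: string): T | null;
}

/** Return the first match, or throw a descriptive error when nothing matches. */
function requireElement<T>(scope: Queryable<T>, selector: string): T {
  const found = scope.querySelector(selector);
  if (found === null) {
    throw new Error(`No element matches selector "${selector}"`);
  }
  return found;
}

/** Return the text of the first match, or `fallback` when nothing matches. */
function textOr(
  scope: Queryable<{ readonly text: string }>,
  selector: string,
  fallback: string
): string {
  // ?? falls back only when the element is missing, not when its text is empty
  return scope.querySelector(selector)?.text ?? fallback;
}
```

With the parsed `root` above, `textOr(root, 'h1', 'No title found')` would return the fallback, while `requireElement(root, 'p')` would return the paragraph element.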

Query Performance Tips

  • Specific selectors: Prefer IDs and specific classes over broad type selectors
  • Limit scope: Query from the nearest known parent element rather than from root
  • Cache results: Store frequently accessed elements in variables instead of re-running the same query
  • Batch operations: Collect all needed elements before processing them

// Good: Specific and scoped
const sidebar = root.getElementById('sidebar');
const menuItems = sidebar?.querySelectorAll('li') ?? [];

// Less optimal: Broad queries
const allItems = root.querySelectorAll('li'); // Searches entire document

// Cache frequently used elements
const header = root.querySelector('header');
if (header) {
  const nav = header.querySelector('nav');
  const logo = header.querySelector('.logo');
  // Use cached elements for multiple operations
}
