tessl/npm-micromark-util-normalize-identifier

micromark utility to normalize identifiers (as found in references, definitions)

Workspace: tessl
Visibility: Public
Describes: npmpkg:npm/micromark-util-normalize-identifier@2.0.x

To install, run

npx @tessl/cli install tessl/npm-micromark-util-normalize-identifier@2.0.0

micromark-util-normalize-identifier

micromark-util-normalize-identifier is a utility package that provides identifier normalization functionality for markdown parsers. It normalizes identifiers found in references and definitions by collapsing whitespace, trimming, and performing case normalization to create canonical forms for reliable identifier matching.

Package Information

  • Package Name: micromark-util-normalize-identifier
  • Package Type: npm
  • Language: JavaScript (ESM)
  • Installation: npm install micromark-util-normalize-identifier

Core Imports

import { normalizeIdentifier } from "micromark-util-normalize-identifier";

Basic Usage

import { normalizeIdentifier } from "micromark-util-normalize-identifier";

// Basic whitespace normalization and trimming
normalizeIdentifier(' a ');        // → 'A'
normalizeIdentifier('a\t\r\nb');   // → 'A B'

// Unicode case normalization
normalizeIdentifier('ТОЛПОЙ');     // → 'ТОЛПОЙ'
normalizeIdentifier('Толпой');     // → 'ТОЛПОЙ'

// Complex identifiers with mixed whitespace
normalizeIdentifier('  My   Reference  \n  ID  ');  // → 'MY REFERENCE ID'

Capabilities

Identifier Normalization

Normalizes markdown identifiers into a canonical form for reference matching. Two labels that differ only in whitespace or letter case normalize to the same string, so a reference can be matched to its definition by plain string equality.

/**
 * Normalize an identifier (as found in references, definitions).
 * 
 * Collapses markdown whitespace, trims, and then lowercases and uppercases.
 * Some characters are considered "uppercase", such as U+03F4 (ϴ), but
 * uppercasing their lowercase counterpart (U+03B8 (θ)) yields a different
 * uppercase character (U+0398 (Θ)). So, to get a canonical form, we perform
 * both lower- and uppercase. Using uppercase last makes sure keys will never
 * interact with default prototypal values (such as constructor): nothing in
 * the prototype of Object is uppercase.
 * 
 * @param {string} value - Identifier to normalize
 * @returns {string} Normalized identifier
 */
function normalizeIdentifier(value);
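
The two rationales in the comment above can be checked with built-in string methods alone. The following is a small illustrative sketch, not code from the package; the variable names are made up for the demonstration.

```javascript
// U+03F4 (ϴ) already counts as uppercase, so uppercasing alone leaves it as is.
const thetaSymbol = '\u03F4';
const upperOnly = thetaSymbol.toUpperCase();

// Lowercasing first maps it to U+03B8 (θ), which then uppercases to U+0398 (Θ).
const lowerThenUpper = thetaSymbol.toLowerCase().toUpperCase();

// The ordinary capital theta takes the same path, so both spellings now agree.
const capitalTheta = '\u0398'.toLowerCase().toUpperCase();

console.log(upperOnly === thetaSymbol);       // true
console.log(lowerThenUpper === capitalTheta); // true

// Prototype safety: a plain object inherits keys such as "constructor",
// but none of the inherited keys is written entirely in uppercase.
const definitions = {};
console.log('constructor' in definitions);    // true
console.log('CONSTRUCTOR' in definitions);    // false
```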

Algorithm Details:

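  1. Whitespace Collapse: Replaces each run of markdown whitespace (tab, line feed, carriage return, or space, matched by /[\t\n\r ]+/g) with a single space
  2. Trimming: Removes the one leading and one trailing space that can remain after collapsing, using regex /^ | $/g
  3. Case Normalization: Converts to lowercase first, then to uppercase, so that characters such as ϴ (U+03F4) and Θ (U+0398) end up in the same form
  4. Security: Because the result is uppercase, it cannot collide with keys inherited from Object.prototype (such as constructor) when used as a key in a plain object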
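
The steps above can be traced one at a time. The helper below is a hypothetical sketch that mirrors the listed steps and keeps every intermediate value; `traceNormalization` is an invented name and this is not the package's source.

```javascript
// Hypothetical helper: applies the listed steps in order and returns
// each intermediate string so the effect of every step is visible.
function traceNormalization(value) {
  const collapsed = value.replace(/[\t\n\r ]+/g, ' '); // step 1: collapse
  const trimmed = collapsed.replace(/^ | $/g, '');     // step 2: trim
  const lowered = trimmed.toLowerCase();               // step 3: lowercase...
  const result = lowered.toUpperCase();                // ...then uppercase
  return { collapsed, trimmed, lowered, result };
}

const trace = traceNormalization('  My   Reference  \n  ID  ');
console.log(trace.collapsed); // ' My Reference ID '
console.log(trace.trimmed);   // 'My Reference ID'
console.log(trace.lowered);   // 'my reference id'
console.log(trace.result);    // 'MY REFERENCE ID'
```

Because the collapse step runs first, at most one space can remain at each end, which is why the trim regex only needs to match a single space.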

Usage Examples:

import { normalizeIdentifier } from "micromark-util-normalize-identifier";

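// Markdown reference normalization
// Pass the label text without its brackets: trimming only applies at the
// very start and end of the string, so '[ my link ]' would keep its inner spaces.
const ref1 = normalizeIdentifier('my link');     // → 'MY LINK'
const ref2 = normalizeIdentifier('My   Link');   // → 'MY LINK'
const ref3 = normalizeIdentifier(' my link ');   // → 'MY LINK'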

// Definition normalization  
const def = normalizeIdentifier('My Definition Label');  // → 'MY DEFINITION LABEL'

// Unicode handling (accents survive case conversion, so 'θεός' would give 'ΘΕΌΣ')
const unicode1 = normalizeIdentifier('θεος');      // → 'ΘΕΟΣ'
const unicode2 = normalizeIdentifier('ΘΕΟΣ');      // → 'ΘΕΟΣ'
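
As an illustration of how normalized identifiers might be used for reference matching, here is a minimal sketch of a definition registry. Everything in it is hypothetical: `createDefinitionRegistry`, `standIn`, and the example.com URLs are made up, and the normalizer is injected so the sketch runs without the package installed. The "first definition wins" rule follows CommonMark.

```javascript
// Hypothetical registry keyed by normalized labels. In real code,
// pass `normalizeIdentifier` from the package as `normalize`.
function createDefinitionRegistry(normalize) {
  const definitions = new Map();
  return {
    define(label, url) {
      const key = normalize(label);
      // CommonMark: when several definitions match, the first one wins.
      if (!definitions.has(key)) definitions.set(key, url);
    },
    resolve(label) {
      return definitions.get(normalize(label));
    }
  };
}

// Stand-in normalizer that follows the steps listed under "Algorithm Details".
const standIn = (value) =>
  value.replace(/[\t\n\r ]+/g, ' ').replace(/^ | $/g, '').toLowerCase().toUpperCase();

const registry = createDefinitionRegistry(standIn);
registry.define('My Link', 'https://example.com/first');
registry.define('my   link', 'https://example.com/second'); // ignored: same key

console.log(registry.resolve(' MY\tLINK ')); // 'https://example.com/first'
console.log(registry.resolve('other'));      // undefined
```

A `Map` sidesteps inherited keys entirely; the uppercase guarantee matters most when a plain object is used as the lookup table instead.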

Error Handling:

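The function expects a string and does no type checking. Non-string inputs such as numbers, null, or undefined throw a TypeError when the function calls value.replace. Coerce the input when its type is uncertain, keeping in mind that String(undefined) yields 'undefined', which normalizes to 'UNDEFINED':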

// Safe usage
const normalized = normalizeIdentifier(String(input));
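
Where silent coercion is not wanted, an explicit guard is one alternative. The helper below is a hypothetical sketch, not part of the package; `assertIdentifier` is an invented name.

```javascript
// Hypothetical helper: reject non-strings up front instead of coercing them.
function assertIdentifier(value) {
  if (typeof value !== 'string') {
    throw new TypeError('Expected identifier to be a string, got ' + typeof value);
  }
  return value;
}

// Coercion quietly produces a label nobody wrote:
const coerced = String(undefined);
console.log(coerced); // 'undefined'

// The guard surfaces the problem at the call site instead:
let message = '';
try {
  assertIdentifier(undefined);
} catch (error) {
  message = error.message;
}
console.log(message); // 'Expected identifier to be a string, got undefined'
```

In real code the guard would wrap the argument, as in normalizeIdentifier(assertIdentifier(input)).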