# micromark-util-normalize-identifier

`micromark-util-normalize-identifier` is a utility package that normalizes identifiers for markdown parsers. It puts the identifiers found in references and definitions into a canonical form by collapsing markdown whitespace, trimming, and normalizing case, so that references can be matched to definitions reliably.

## Package Information

- **Package Name**: micromark-util-normalize-identifier
- **Package Type**: npm
- **Language**: JavaScript (ESM)
- **Installation**: `npm install micromark-util-normalize-identifier`

## Core Imports

```javascript
import { normalizeIdentifier } from "micromark-util-normalize-identifier";
```

## Basic Usage

```javascript
import { normalizeIdentifier } from "micromark-util-normalize-identifier";

// Basic whitespace normalization and trimming
normalizeIdentifier(' a '); // → 'A'
normalizeIdentifier('a\t\r\nb'); // → 'A B'

// Unicode case normalization
normalizeIdentifier('ТОЛПОЙ'); // → 'ТОЛПОЙ'
normalizeIdentifier('Толпой'); // → 'ТОЛПОЙ'

// Complex identifiers with mixed whitespace
normalizeIdentifier(' My Reference \n ID '); // → 'MY REFERENCE ID'
```

## Capabilities

### Identifier Normalization

Normalizes a markdown identifier into a canonical form, so that a reference and its definition compare equal even when they differ in whitespace or letter case. Normalization collapses whitespace, trims the result, and then converts it to lowercase and then to uppercase.

```javascript { .api }
/**
 * Normalize an identifier (as found in references, definitions).
 *
 * Collapse markdown whitespace, trim, and then lower- and uppercase.
 * Some characters are considered "uppercase", such as U+03F4 (ϴ), but
 * uppercasing their lowercase counterpart (U+03B8 (θ)) yields a different
 * uppercase character (U+0398 (Θ)). So, to get a canonical form, we perform
 * both lower- and uppercase. Using uppercase last makes sure keys will never
 * interact with default prototypal values (such as `constructor`): nothing in
 * the prototype of `Object` is uppercase.
 *
 * @param {string} value - Identifier to normalize
 * @returns {string} Normalized identifier
 */
function normalizeIdentifier(value);
```
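
The reasoning in the doc comment can be checked with plain `String` methods. The snippet below is a minimal sketch, not code from the package; `canonicalCase` is a hypothetical helper name used only for illustration.

```javascript
// Hypothetical helper: lowercase first, then uppercase, as the doc comment describes.
function canonicalCase(text) {
  return text.toLowerCase().toUpperCase();
}

const thetaSymbol = '\u03F4'; // ϴ GREEK CAPITAL THETA SYMBOL
const smallTheta = '\u03B8'; // θ GREEK SMALL LETTER THETA

// Uppercasing alone leaves two different characters.
console.log(thetaSymbol.toUpperCase() === smallTheta.toUpperCase()); // prints: false

// Lowercase-then-uppercase maps both to U+0398 (Θ).
console.log(canonicalCase(thetaSymbol) === '\u0398'); // prints: true
console.log(canonicalCase(smallTheta) === '\u0398'); // prints: true

// The lowercase spelling is found on a plain object's prototype chain,
// the all-uppercase spelling is not.
console.log('constructor' in {}); // prints: true
console.log(canonicalCase('constructor') in {}); // prints: false
```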
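
**Algorithm Details:**

1. **Whitespace Collapse**: Replaces every run of markdown whitespace (`/[\t\n\r ]+/g`: tabs, line feeds, carriage returns, and spaces) with a single space
2. **Trimming**: Removes the leading and trailing space with the regex `/^ | $/g`; after the collapse step there is at most one of each
3. **Case Normalization**: Converts to lowercase first, then to uppercase, so that characters such as U+03F4 (ϴ) and U+03B8 (θ) both end up as U+0398 (Θ)
4. **Prototype Safety**: Ending on uppercase keeps normalized keys from colliding with default `Object.prototype` properties such as `constructor`, none of which are all-uppercase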
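
The points above can be sketched as a single chain of string operations. This is a minimal illustration that assumes the steps run in the order listed; `normalizeIdentifierSketch` is a hypothetical name, and the code is not presented as the package's actual source.

```javascript
// Hypothetical stand-in that follows the steps listed above.
function normalizeIdentifierSketch(value) {
  return value
    // 1. Collapse each run of markdown whitespace into one space.
    .replace(/[\t\n\r ]+/g, ' ')
    // 2. Trim: at most one leading and one trailing space remain.
    .replace(/^ | $/g, '')
    // 3. Lowercase, then uppercase, to reach a canonical case.
    .toLowerCase()
    .toUpperCase();
}

console.log(normalizeIdentifierSketch(' a ')); // prints: A
console.log(normalizeIdentifierSketch('a\t\r\nb')); // prints: A B
console.log(normalizeIdentifierSketch(' My Reference \n ID ')); // prints: MY REFERENCE ID
```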

**Usage Examples:**

```javascript
import { normalizeIdentifier } from "micromark-util-normalize-identifier";

// Markdown reference normalization
const ref1 = normalizeIdentifier('[my link]'); // → '[MY LINK]'
const ref2 = normalizeIdentifier('[My Link]'); // → '[MY LINK]'
// Brackets are ordinary characters: only the ends of the whole string are trimmed
const ref3 = normalizeIdentifier('[ my link ]'); // → '[ MY LINK ]'

// Definition normalization
const def = normalizeIdentifier('My Definition Label'); // → 'MY DEFINITION LABEL'

// Unicode handling (uppercasing keeps the accent, so these two do not match)
const unicode1 = normalizeIdentifier('θεός'); // → 'ΘΕΌΣ'
const unicode2 = normalizeIdentifier('ΘΕΟΣ'); // → 'ΘΕΟΣ'
```
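
To show why a canonical form matters, here is a hedged sketch of matching references to definitions. Everything in it is hypothetical: the `normalize` stand-in mirrors the documented steps so that the snippet runs without the package installed, and `indexDefinitions` and `resolveReference` are illustrative names, not part of any micromark API. In real code, the package's `normalizeIdentifier` would take the place of `normalize`. Keeping the first of several matching definitions follows the CommonMark rule that the first definition takes precedence.

```javascript
// Stand-in for the package export (assumption: mirrors the documented steps).
const normalize = (value) =>
  value.replace(/[\t\n\r ]+/g, ' ').replace(/^ | $/g, '').toLowerCase().toUpperCase();

// Build a lookup table keyed by normalized label; the first definition wins.
function indexDefinitions(definitions) {
  const index = new Map();
  for (const { label, url } of definitions) {
    const key = normalize(label);
    if (!index.has(key)) index.set(key, url);
  }
  return index;
}

// Look up a reference label; returns undefined when nothing matches.
function resolveReference(index, label) {
  return index.get(normalize(label));
}

const definitionIndex = indexDefinitions([
  { label: 'My Link', url: 'https://example.com/a' },
  { label: 'my   link', url: 'https://example.com/b' }, // same key after normalization
]);

console.log(resolveReference(definitionIndex, ' MY\tLINK ')); // prints: https://example.com/a
console.log(resolveReference(definitionIndex, 'other')); // prints: undefined
```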

**Error Handling:**

The function expects a string input. Passing a non-string value throws a `TypeError`, because the function calls string methods such as `replace` on its argument. Convert the input to a string first if its type is not guaranteed:

```javascript
// Safe usage: coerce the input to a string before normalizing
const normalized = normalizeIdentifier(String(input));
```
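
Note that `String(input)` turns `null` and `undefined` into the strings `'null'` and `'undefined'`, which can hide a missing value. One possible alternative, sketched here under the assumption that failing fast is preferred, is a small guard; `assertIdentifierString` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical guard: returns the input unchanged if it is a string, throws otherwise.
function assertIdentifierString(input) {
  if (typeof input !== 'string') {
    throw new TypeError(`Expected identifier to be a string, got ${typeof input}`);
  }
  return input;
}

console.log(assertIdentifierString('my link')); // prints: my link
console.log(String(null)); // prints: null (coercion hides the missing value)

try {
  assertIdentifierString(null);
} catch (error) {
  console.log(error.message); // prints: Expected identifier to be a string, got object
}
```

The guarded value can then be passed on, for example as `normalizeIdentifier(assertIdentifierString(input))`.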