# URL Parsing

Low-level URL parsing functions that follow the WHATWG URL specification for building URL-aware applications and browser engines. These functions operate on URL record objects rather than URL class instances.

## Capabilities

### parseURL

The main URL parser that follows the complete WHATWG URL parsing algorithm.

```javascript { .api }
/**
 * Parse a URL string according to WHATWG URL specification
 * @param {string} input - URL string to parse
 * @param {Object} [options] - Parsing options
 * @param {URLRecord|null} [options.baseURL] - Base URL for relative URLs
 * @returns {URLRecord|null} URL record object or null if parsing fails
 */
function parseURL(input, options)
```

**Usage Examples:**

```javascript
const { parseURL } = require("whatwg-url");

// Parse absolute URL
const absolute = parseURL("https://example.com/path?query=value#fragment");
console.log(absolute);
// {
//   scheme: "https",
//   username: "",
//   password: "",
//   host: "example.com",
//   port: null,
//   path: ["path"],
//   query: "query=value",
//   fragment: "fragment"
// }

// Parse relative URL with base
const base = parseURL("https://example.com/api/");
const relative = parseURL("../users", { baseURL: base });
console.log(relative.path); // ["users"]

// Handle parsing failure (no scheme and no base URL to resolve against)
const invalid = parseURL("not-a-valid-url");
console.log(invalid); // null
```
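
Because `parseURL` reports failure by returning `null` rather than throwing, callers may want one place that turns that into an error. The sketch below is one possible approach. `parseOrThrow` is a hypothetical helper, not part of whatwg-url, and it takes the parser as an argument so it works with `parseURL` or any function of the same shape.

```javascript
// Hypothetical helper (not part of whatwg-url): converts the null
// failure value into an exception.
// `parse` is any function shaped like parseURL(input, { baseURL }).
function parseOrThrow(parse, input, baseURL = null) {
  const record = parse(input, { baseURL });
  if (record === null) {
    throw new TypeError(`Invalid URL: ${input}`);
  }
  return record;
}

// Usage, assuming whatwg-url is installed:
//   const { parseURL } = require("whatwg-url");
//   const record = parseOrThrow(parseURL, "https://example.com/");
```

Passing the parser in keeps the helper independent of how the package is loaded and makes it easy to exercise with a stub.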

### basicURLParse

The basic URL parser with additional options for state override, used internally by the URL parser.

```javascript { .api }
/**
 * Basic URL parser with state override support
 * @param {string} input - URL string to parse
 * @param {Object} [options] - Parsing options
 * @param {URLRecord|null} [options.baseURL] - Base URL for relative URLs
 * @param {URLRecord} [options.url] - Existing URL record to modify
 * @param {string} [options.stateOverride] - Parser state to start from
 * @returns {URLRecord|null} URL record object or null if parsing fails
 */
function basicURLParse(input, options)
```

**State Override Values:**

- `"scheme start"`, `"scheme"`, `"no scheme"`
- `"special relative or authority"`, `"path or authority"`, `"relative"`
- `"relative slash"`, `"special authority slashes"`, `"special authority ignore slashes"`
- `"authority"`, `"host"`, `"hostname"`, `"port"`
- `"file"`, `"file slash"`, `"file host"`
- `"path start"`, `"path"`, `"opaque path"`
- `"query"`, `"fragment"`
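
Since the override is passed as a plain string, it can help to check it against the list above before calling the parser. The following is a minimal sketch; `STATE_OVERRIDES` and `isKnownStateOverride` are hypothetical names introduced here, not exports of whatwg-url.

```javascript
// The state names listed above, collected for validation.
const STATE_OVERRIDES = new Set([
  "scheme start", "scheme", "no scheme",
  "special relative or authority", "path or authority", "relative",
  "relative slash", "special authority slashes", "special authority ignore slashes",
  "authority", "host", "hostname", "port",
  "file", "file slash", "file host",
  "path start", "path", "opaque path",
  "query", "fragment"
]);

// Hypothetical helper: true if `name` is one of the documented states.
// The comparison is exact, so case and spacing must match.
function isKnownStateOverride(name) {
  return STATE_OVERRIDES.has(name);
}

console.log(isKnownStateOverride("path start")); // true
console.log(isKnownStateOverride("Path Start")); // false
```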

**Usage Examples:**

```javascript
const { parseURL, basicURLParse } = require("whatwg-url");

// Basic parsing (equivalent to parseURL)
const url = basicURLParse("https://example.com/path");

// Parse with state override (advanced usage).
// The record passed as `url` is modified in place.
const existingUrl = parseURL("https://example.com/");
existingUrl.path = []; // clear the old path so the new segments replace it
const modified = basicURLParse("new-path", {
  url: existingUrl,
  stateOverride: "path start"
});

// Re-parse only the hostname of an existing record
const hostResult = basicURLParse("example.org", {
  url: existingUrl,
  stateOverride: "hostname"
});
```

### serializeURL

Serializes a URL record back to a string representation.

```javascript { .api }
/**
 * Serialize a URL record to string
 * @param {URLRecord} urlRecord - URL record to serialize
 * @param {boolean} [excludeFragment=false] - Whether to exclude fragment
 * @returns {string} Serialized URL string
 */
function serializeURL(urlRecord, excludeFragment)
```
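
To make the mapping from record fields to output concrete, here is an illustrative re-implementation of the serializer steps from the WHATWG URL specification. It is a sketch, not the package's code: `sketchSerializeURL` is a hypothetical name, and it assumes `host` is either `null` or an already-serialized string.

```javascript
// Illustrative sketch of the URL serializer steps.
// Assumption: record.host is null or an already-serialized string.
function sketchSerializeURL(record, excludeFragment = false) {
  const opaque = typeof record.path === "string";
  let output = `${record.scheme}:`;

  if (record.host !== null) {
    output += "//";
    // Credentials are only emitted when at least one is non-empty.
    if (record.username !== "" || record.password !== "") {
      output += record.username;
      if (record.password !== "") {
        output += `:${record.password}`;
      }
      output += "@";
    }
    output += record.host;
    if (record.port !== null) {
      output += `:${record.port}`;
    }
  } else if (!opaque && record.path.length > 1 && record.path[0] === "") {
    // Keeps a path such as ["", "x"] from being read back as a host.
    output += "/.";
  }

  // Opaque paths are emitted as-is; segment lists get a "/" before each segment.
  output += opaque ? record.path : record.path.map((s) => `/${s}`).join("");

  if (record.query !== null) {
    output += `?${record.query}`;
  }
  if (!excludeFragment && record.fragment !== null) {
    output += `#${record.fragment}`;
  }
  return output;
}

const exampleRecord = {
  scheme: "http",
  username: "",
  password: "",
  host: "localhost",
  port: 3000,
  path: ["api", "users"],
  query: "limit=10",
  fragment: "top"
};

console.log(sketchSerializeURL(exampleRecord));
// "http://localhost:3000/api/users?limit=10#top"
console.log(sketchSerializeURL(exampleRecord, true));
// "http://localhost:3000/api/users?limit=10"
```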

### serializePath

Serializes the path component of a URL record.

```javascript { .api }
/**
 * Serialize the path component of a URL record
 * @param {URLRecord} urlRecord - URL record with path to serialize
 * @returns {string} Serialized path string
 */
function serializePath(urlRecord)
```

### serializeHost

Serializes the host component of a URL record. The port is not part of the host and is not included in the output.

```javascript { .api }
/**
 * Serialize the host component of a URL record
 * @param {string|number|number[]|null} host - Host from URL record
 * @returns {string} Serialized host string
 */
function serializeHost(host)
```

### serializeURLOrigin

Serializes the origin of a URL record according to the WHATWG specification.

```javascript { .api }
/**
 * Serialize the origin of a URL record
 * @param {URLRecord} urlRecord - URL record to serialize origin from
 * @returns {string} Serialized origin string or "null" for opaque origins
 */
function serializeURLOrigin(urlRecord)
```
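
The rule behind the `"null"` result can be sketched as follows. This is a simplified illustration of the origin rules in the WHATWG URL specification, not the package's implementation: `sketchSerializeOrigin` is a hypothetical name, `host` is assumed to be a string, and `blob:` URLs (whose origin the specification derives from the embedded URL) are deliberately not handled.

```javascript
// Schemes whose origin is a (scheme, host, port) tuple.
const TUPLE_ORIGIN_SCHEMES = new Set(["ftp", "http", "https", "ws", "wss"]);

// Simplified sketch; assumes record.host is a string for tuple-origin schemes.
// blob: URLs are not handled here and fall through to "null".
function sketchSerializeOrigin(record) {
  if (!TUPLE_ORIGIN_SCHEMES.has(record.scheme)) {
    return "null"; // opaque origin (file:, data:, mailto:, ...)
  }
  let result = `${record.scheme}://${record.host}`;
  if (record.port !== null) {
    result += `:${record.port}`;
  }
  return result;
}

console.log(sketchSerializeOrigin({ scheme: "https", host: "example.com", port: 8080 }));
// "https://example.com:8080"
console.log(sketchSerializeOrigin({ scheme: "file", host: "", port: null }));
// "null"
```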

### serializeInteger

Serializes an integer according to the URL specification.

```javascript { .api }
/**
 * Serialize an integer according to URL specification
 * @param {number} number - Integer to serialize
 * @returns {string} Serialized integer string
 */
function serializeInteger(number)
```

**Usage Examples:**

```javascript
const {
  parseURL,
  serializeURL,
  serializePath,
  serializeHost,
  serializeURLOrigin,
  serializeInteger
} = require("whatwg-url");

// Parse and serialize
const url = parseURL("https://example.com:8080/path/to/resource?query=value#section");

console.log(serializeURL(url));
// "https://example.com:8080/path/to/resource?query=value#section"

console.log(serializeURL(url, true)); // exclude fragment
// "https://example.com:8080/path/to/resource?query=value"

console.log(serializePath(url));
// "/path/to/resource"

console.log(serializeHost(url.host)); // host only; the port is stored separately
// "example.com"

// Serialize custom URL record
const customUrl = {
  scheme: "http",
  username: "",
  password: "",
  host: "localhost",
  port: 3000,
  path: ["api", "users"],
  query: "limit=10",
  fragment: null
};

console.log(serializeURL(customUrl));
// "http://localhost:3000/api/users?limit=10"

// Serialize origins
console.log(serializeURLOrigin(url));
// "https://example.com:8080"

// file: and data: URLs have opaque origins
const fileUrl = parseURL("file:///path/to/file.txt");
console.log(serializeURLOrigin(fileUrl)); // "null"

const dataUrl = parseURL("data:text/plain,Hello");
console.log(serializeURLOrigin(dataUrl)); // "null"

// Serialize integers
console.log(serializeInteger(8080)); // "8080"
console.log(serializeInteger(443)); // "443"
```

## URL Record Type

The URL record represents the internal structure of a parsed URL.

```javascript { .api }
/**
 * URL record object structure
 */
interface URLRecord {
  /** URL scheme (e.g., "https", "file") */
  scheme: string;

  /** Username component */
  username: string;

  /** Password component */
  password: string;

  /**
   * Host component: a domain or opaque host (string), an IPv4 address (number),
   * an IPv6 address (array of numeric pieces), or null when the URL has no host
   * (e.g., data: or mailto: URLs)
   */
  host: string | number | number[] | null;

  /** Port number (null for default ports) */
  port: number | null;

  /** Path segments array (normal URLs) or opaque path string */
  path: string[] | string;

  /** Query string component (null if no query) */
  query: string | null;

  /** Fragment component (null if no fragment) */
  fragment: string | null;
}
```
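
Because `port` is `null` when the URL uses its scheme's default port, code that needs an actual port number has to fill the default back in. A minimal sketch, assuming the default ports of the special schemes in the WHATWG URL specification; `effectivePort` is a hypothetical helper, not part of whatwg-url.

```javascript
// Default ports for the special schemes that have one.
const DEFAULT_PORTS = new Map([
  ["ftp", 21],
  ["http", 80],
  ["https", 443],
  ["ws", 80],
  ["wss", 443]
]);

// Hypothetical helper: returns the explicit port if present, otherwise the
// scheme's default, or null when the scheme has no default port.
function effectivePort(record) {
  if (record.port !== null) {
    return record.port;
  }
  return DEFAULT_PORTS.get(record.scheme) ?? null;
}

console.log(effectivePort({ scheme: "https", port: null })); // 443
console.log(effectivePort({ scheme: "https", port: 8080 })); // 8080
console.log(effectivePort({ scheme: "file", port: null })); // null
```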

**Path Handling:**

- **Normal URLs** (http, https, ftp, etc.): `path` is an array of path segments
- **Opaque URLs** (data, mailto, etc.): `path` is a single string

**Usage Example:**

```javascript
const { parseURL } = require("whatwg-url");

// Normal URL with path array
const httpUrl = parseURL("https://example.com/api/users/123");
console.log(httpUrl.path); // ["api", "users", "123"]

// Opaque URL with path string
const dataUrl = parseURL("data:text/plain;base64,SGVsbG8gV29ybGQ=");
console.log(dataUrl.path); // "text/plain;base64,SGVsbG8gV29ybGQ="
console.log(typeof dataUrl.path); // "string"
```
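
Code that consumes records usually has to branch on the two path shapes before touching segments. The sketch below shows one way to do that. `hasOpaquePath` and `lastSegment` are hypothetical helpers written for this example, and they work on any object with a `path` field shaped like a URL record.

```javascript
// Hypothetical helper: true when the record's path is an opaque string.
function hasOpaquePath(record) {
  return typeof record.path === "string";
}

// Hypothetical helper: returns the final path segment, or null when the path
// is opaque or has no segments.
function lastSegment(record) {
  if (hasOpaquePath(record) || record.path.length === 0) {
    return null;
  }
  return record.path[record.path.length - 1];
}

console.log(lastSegment({ path: ["api", "users", "123"] })); // "123"
console.log(lastSegment({ path: "text/plain,Hello" })); // null
```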