# tldjs

tldjs is a JavaScript library for working with complex domain names, subdomains and well-known TLDs. It provides utilities to parse URLs/hostnames and extract domain components based on Mozilla's Public Suffix List, answering questions like "what is mail.google.com's domain?" and "is big.data's TLD well-known?".

## Package Information

- **Package Name**: tldjs
- **Package Type**: npm
- **Language**: JavaScript
- **Installation**: `npm install tldjs`

## Core Imports

```javascript
const { parse, tldExists, getDomain, getSubdomain, getPublicSuffix, isValidHostname, extractHostname } = require('tldjs');
```

Or import the entire module:

```javascript
const tldjs = require('tldjs');
```

For ES6 modules:

```javascript
import { parse, tldExists, getDomain, getSubdomain, getPublicSuffix, isValidHostname, extractHostname } from 'tldjs';
```

Or import the entire module:

```javascript
import tldjs from 'tldjs';
```

## Basic Usage

```javascript
const tldjs = require('tldjs');

// Parse a URL completely
const result = tldjs.parse('https://spark-public.s3.amazonaws.com/dataanalysis/loansData.csv');
console.log(result);
// {
//   hostname: 'spark-public.s3.amazonaws.com',
//   isValid: true,
//   isIp: false,
//   tldExists: true,
//   publicSuffix: 's3.amazonaws.com',
//   domain: 'spark-public.s3.amazonaws.com',
//   subdomain: ''
// }

// Check if TLD exists
console.log(tldjs.tldExists('google.com')); // true
console.log(tldjs.tldExists('google.local')); // false

// Extract specific parts
console.log(tldjs.getDomain('fr.google.com')); // 'google.com'
console.log(tldjs.getSubdomain('fr.google.com')); // 'fr'
console.log(tldjs.getPublicSuffix('google.co.uk')); // 'co.uk'
```

## Architecture

tldjs is built around several key components:

- **Public Suffix List**: Uses Mozilla's Public Suffix List for accurate TLD recognition
- **Hostname Extraction**: Robust URL parsing to extract hostnames from complex URLs
- **Validation Layer**: RFC-compliant hostname validation
- **Trie Data Structure**: Efficient suffix lookup using a trie for fast public suffix matching
- **Factory Pattern**: Customizable instances with user-defined rules and validation hosts
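To make the trie component more concrete, here is a minimal sketch of longest-suffix matching over labels stored right to left. It is an illustration only and not tldjs's internal `SuffixTrie`: the names `createNode`, `buildTrie`, and `lookupSuffix` are hypothetical, the rule list is a tiny hand-picked sample, and the wildcard and exception rules of the real Public Suffix List are left out.

```javascript
// Illustrative suffix trie; NOT the tldjs implementation.
function createNode() {
  return { children: new Map(), isRule: false };
}

// Store each suffix label by label, right to left ('co.uk' becomes uk -> co).
function buildTrie(suffixes) {
  const root = createNode();
  for (const suffix of suffixes) {
    let node = root;
    for (const label of suffix.split('.').reverse()) {
      if (!node.children.has(label)) node.children.set(label, createNode());
      node = node.children.get(label);
    }
    node.isRule = true; // a complete rule ends at this node
  }
  return root;
}

// Walk the hostname from its last label and keep the longest matching rule.
function lookupSuffix(trie, hostname) {
  const labels = hostname.split('.').reverse();
  let node = trie;
  let matched = 0; // number of labels covered by the longest rule so far
  for (let i = 0; i < labels.length; i++) {
    node = node.children.get(labels[i]);
    if (!node) break;
    if (node.isRule) matched = i + 1;
  }
  // Assumption: when no rule matches, fall back to the last label.
  const count = Math.max(matched, 1);
  return labels.slice(0, count).reverse().join('.');
}

const sampleTrie = buildTrie(['com', 'uk', 'co.uk', 's3.amazonaws.com']);
console.log(lookupSuffix(sampleTrie, 'foo.google.co.uk')); // 'co.uk'
console.log(lookupSuffix(sampleTrie, 'spark-public.s3.amazonaws.com')); // 's3.amazonaws.com'
```

Each lookup visits at most one node per hostname label, which is why a trie suits this kind of matching.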

## Capabilities


### URL/Hostname Parsing

Complete parsing of URLs or hostnames with all domain components extracted in a single operation.

```javascript { .api }
/**
 * Parse URL/hostname and return complete information about domain components
 * @param {string} url - URL or hostname to parse
 * @param {number} [_step] - Internal step control for optimization
 * @returns {ParseResult} Complete parsing result
 */
function parse(url, _step);

interface ParseResult {
  hostname: string | null; // Extracted hostname
  isValid: boolean; // Whether hostname is valid per RFC
  isIp: boolean; // Whether hostname is an IP address
  tldExists: boolean; // Whether TLD is well-known
  publicSuffix: string | null; // Public suffix portion
  domain: string | null; // Domain portion
  subdomain: string | null; // Subdomain portion
}
```

**Usage Examples:**

```javascript
// Standard web URL
tldjs.parse('https://www.example.com/path');
// { hostname: 'www.example.com', isValid: true, isIp: false,
//   tldExists: true, publicSuffix: 'com', domain: 'example.com', subdomain: 'www' }

// Complex AWS hostname
tldjs.parse('https://spark-public.s3.amazonaws.com/data.csv');
// { hostname: 'spark-public.s3.amazonaws.com', isValid: true, isIp: false,
//   tldExists: true, publicSuffix: 's3.amazonaws.com',
//   domain: 'spark-public.s3.amazonaws.com', subdomain: '' }

// IP address
tldjs.parse('https://192.168.0.1/admin');
// { hostname: '192.168.0.1', isValid: true, isIp: true,
//   tldExists: false, publicSuffix: null, domain: null, subdomain: null }

// Invalid/unknown TLD
tldjs.parse('domain.unknown');
// { hostname: 'domain.unknown', isValid: true, isIp: false,
//   tldExists: false, publicSuffix: 'unknown', domain: 'domain.unknown', subdomain: '' }
```

### TLD Existence Checking

Validates whether a TLD is well-known according to the Public Suffix List.

```javascript { .api }
/**
 * Check if TLD exists for given URL/hostname
 * @param {string} url - URL or hostname to check
 * @returns {boolean} True if TLD is well-known
 */
function tldExists(url);
```

**Usage Examples:**

```javascript
tldjs.tldExists('google.com'); // true
tldjs.tldExists('google.local'); // false (not registered TLD)
tldjs.tldExists('com'); // true
tldjs.tldExists('uk'); // true
tldjs.tldExists('co.uk'); // true
tldjs.tldExists('amazon.co.uk'); // true (because 'uk' is valid)
tldjs.tldExists('https://user:password@example.co.uk:8080/path'); // true
```

### Public Suffix Extraction

Extracts the public suffix (effective TLD) from URLs or hostnames.

```javascript { .api }
/**
 * Extract public suffix from URL/hostname
 * @param {string} url - URL or hostname to analyze
 * @returns {string | null} Public suffix or null if invalid
 */
function getPublicSuffix(url);
```

**Usage Examples:**

```javascript
tldjs.getPublicSuffix('google.com'); // 'com'
tldjs.getPublicSuffix('fr.google.com'); // 'com'
tldjs.getPublicSuffix('google.co.uk'); // 'co.uk'
tldjs.getPublicSuffix('s3.amazonaws.com'); // 's3.amazonaws.com'
tldjs.getPublicSuffix('tld.is.unknown'); // 'unknown'
```

### Domain Extraction

Extracts the domain (second-level domain + public suffix) from URLs or hostnames.

```javascript { .api }
/**
 * Extract domain from URL/hostname
 * @param {string} url - URL or hostname to analyze
 * @returns {string | null} Domain or null if invalid
 */
function getDomain(url);
```

**Usage Examples:**

```javascript
tldjs.getDomain('google.com'); // 'google.com'
tldjs.getDomain('fr.google.com'); // 'google.com'
tldjs.getDomain('fr.google.google'); // 'google.google'
tldjs.getDomain('foo.google.co.uk'); // 'google.co.uk'
tldjs.getDomain('t.co'); // 't.co'
tldjs.getDomain('fr.t.co'); // 't.co'
tldjs.getDomain('https://user:password@example.co.uk:8080/some/path?query#hash'); // 'example.co.uk'
```

### Subdomain Extraction

Extracts the subdomain portion from URLs or hostnames.

```javascript { .api }
/**
 * Extract subdomain from URL/hostname
 * @param {string} url - URL or hostname to analyze
 * @returns {string | null} Subdomain, empty string if none, or null if invalid
 */
function getSubdomain(url);
```

**Usage Examples:**

```javascript
tldjs.getSubdomain('google.com'); // ''
tldjs.getSubdomain('fr.google.com'); // 'fr'
tldjs.getSubdomain('google.co.uk'); // ''
tldjs.getSubdomain('foo.google.co.uk'); // 'foo'
tldjs.getSubdomain('moar.foo.google.co.uk'); // 'moar.foo'
tldjs.getSubdomain('t.co'); // ''
tldjs.getSubdomain('fr.t.co'); // 'fr'
tldjs.getSubdomain('https://secure.example.co.uk:443/path'); // 'secure'
```
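The relationship between hostname, public suffix, domain, and subdomain can be sketched with plain string operations. This is a hedged illustration rather than the library's code: `domainFromSuffix` and `subdomainFromDomain` are hypothetical helper names, and returning `null` for a bare public suffix is an assumption of this sketch.

```javascript
// Hypothetical helpers; these names are not part of the tldjs API.

// Domain = the label immediately left of the public suffix, plus the suffix.
function domainFromSuffix(hostname, publicSuffix) {
  if (hostname === publicSuffix) return null; // assumption: a bare suffix has no domain
  if (!hostname.endsWith('.' + publicSuffix)) return null;
  const rest = hostname.slice(0, -(publicSuffix.length + 1)); // labels left of the suffix
  const lastLabel = rest.split('.').pop();
  return lastLabel + '.' + publicSuffix;
}

// Subdomain = whatever remains to the left of the domain ('' if nothing).
function subdomainFromDomain(hostname, domain) {
  if (hostname === domain) return '';
  if (!hostname.endsWith('.' + domain)) return null;
  return hostname.slice(0, -(domain.length + 1));
}

const exampleDomain = domainFromSuffix('moar.foo.google.co.uk', 'co.uk');
console.log(exampleDomain); // 'google.co.uk'
console.log(subdomainFromDomain('moar.foo.google.co.uk', exampleDomain)); // 'moar.foo'
```

The same arithmetic explains the AWS case: with `s3.amazonaws.com` as the public suffix, `spark-public.s3.amazonaws.com` is itself the domain and the subdomain is empty.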

### Hostname Extraction

Extracts and validates hostname from URLs or validates existing hostnames.

```javascript { .api }
/**
 * Extract hostname from URL or validate hostname
 * @param {string} url - URL or hostname to process
 * @returns {string | null} Clean hostname or null if invalid
 */
function extractHostname(url);
```

**Usage Examples:**

```javascript
tldjs.extractHostname(' example.CO.uk '); // 'example.co.uk'
tldjs.extractHostname('example.co.uk/some/path'); // 'example.co.uk'
tldjs.extractHostname('user:password@example.co.uk:8080/path'); // 'example.co.uk'
tldjs.extractHostname('https://www.example.com/'); // 'www.example.com'
tldjs.extractHostname('台灣'); // 'xn--kpry57d' (punycode)
tldjs.extractHostname(42); // '42' (returns stringified input if invalid)
```
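For comparison, a similar extraction can be approximated with the standard WHATWG `URL` class. This is a sketch of an alternative approach, not how tldjs extracts hostnames; `extractHostnameSketch` is a hypothetical name, and edge cases such as numeric input may not match the results listed above.

```javascript
// Sketch built on the standard URL class; tldjs ships its own extractor.
function extractHostnameSketch(input) {
  const text = String(input).trim();
  // Add a placeholder scheme when the input does not start with "scheme://".
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(text);
  try {
    // For http(s) URLs, the parser lowercases the host and drops credentials and port.
    return new URL(hasScheme ? text : 'http://' + text).hostname;
  } catch (err) {
    return null; // input the URL parser rejects
  }
}

console.log(extractHostnameSketch(' example.CO.uk ')); // 'example.co.uk'
console.log(extractHostnameSketch('user:password@example.co.uk:8080/path')); // 'example.co.uk'
```

A function with this shape (string in, hostname or `null` out) matches the signature of the `extractHostname` option accepted by `fromUserSettings`.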

### Hostname Validation

Validates hostnames according to RFC 1035 standards.

```javascript { .api }
/**
 * Validate hostname according to RFC 1035
 * @param {string} hostname - Hostname to validate
 * @returns {boolean} True if hostname is valid per RFC
 */
function isValidHostname(hostname);
```

**Usage Examples:**

```javascript
tldjs.isValidHostname('google.com'); // true
tldjs.isValidHostname('.google.com'); // false
tldjs.isValidHostname('my.fake.domain'); // true
tldjs.isValidHostname('localhost'); // false
tldjs.isValidHostname('192.168.0.0'); // true
tldjs.isValidHostname('https://example.com'); // false (full URL, not hostname)
```
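As a rough picture of what RFC-style validation involves, the sketch below applies the usual length and character rules: at most 255 characters overall, labels of 1 to 63 characters made of letters, digits, and hyphens, and no label starting or ending with a hyphen (leading digits are allowed, following the RFC 1123 relaxation). `isValidHostnameSketch` is a hypothetical name, and because the sketch checks only these rules it accepts single-label names, so it does not reproduce every result listed above.

```javascript
// Simplified label and length checks; NOT the tldjs validator.
function isValidHostnameSketch(hostname) {
  if (typeof hostname !== 'string') return false;
  if (hostname.length === 0 || hostname.length > 255) return false;
  // One label: 1-63 chars, alphanumeric at both ends, hyphens allowed inside.
  const label = /^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/i;
  return hostname.split('.').every((part) => label.test(part));
}

console.log(isValidHostnameSketch('google.com')); // true
console.log(isValidHostnameSketch('.google.com')); // false (empty first label)
console.log(isValidHostnameSketch('https://example.com')); // false (':' and '/' are not allowed)
```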

### Deprecated: isValid

Legacy hostname validation function (use `isValidHostname` instead).

```javascript { .api }
/**
 * @deprecated Use isValidHostname instead
 * Validate hostname according to RFC 1035
 * @param {string} hostname - Hostname to validate
 * @returns {boolean} True if hostname is valid per RFC
 */
function isValid(hostname);
```

### Custom Configuration Factory

Creates customized tldjs instances with user-defined settings for specialized use cases.

```javascript { .api }
/**
 * Create customized tldjs instance with user settings
 * @param {FactoryOptions} options - Configuration options
 * @returns {tldjs} Customized tldjs instance with same API
 */
function fromUserSettings(options);

interface FactoryOptions {
  rules?: SuffixTrie; // Custom suffix trie for lookups
  validHosts?: string[]; // Additional hosts to treat as valid domains
  extractHostname?: (url: string) => string | null; // Custom hostname extraction function
}
```

**Usage Examples:**

```javascript
// Default behavior - localhost is not recognized
tldjs.getDomain('localhost'); // null
tldjs.getSubdomain('vhost.localhost'); // null

// Custom instance with localhost support
const myTldjs = tldjs.fromUserSettings({
  validHosts: ['localhost']
});

myTldjs.getDomain('localhost'); // 'localhost'
myTldjs.getSubdomain('vhost.localhost'); // 'vhost'
myTldjs.getDomain('api.localhost'); // 'localhost'
myTldjs.getSubdomain('api.localhost'); // 'api'
```
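One way to picture the effect of `validHosts` is as a check that runs before the public suffix lookup: if the hostname equals a listed host or ends with it on a label boundary, that host is treated as the domain. The sketch below illustrates this idea under that assumption. `splitOnValidHost` is a hypothetical helper and is not taken from the tldjs source.

```javascript
// Hypothetical helper showing the idea behind validHosts; NOT the tldjs implementation.
function splitOnValidHost(hostname, validHosts) {
  for (const host of validHosts) {
    if (hostname === host) {
      return { domain: host, subdomain: '' };
    }
    if (hostname.endsWith('.' + host)) {
      // Everything left of ".host" is the subdomain.
      return { domain: host, subdomain: hostname.slice(0, -(host.length + 1)) };
    }
  }
  return null; // no listed host applies; the normal suffix lookup would run instead
}

console.log(splitOnValidHost('vhost.localhost', ['localhost']));
// { domain: 'localhost', subdomain: 'vhost' }
```

Matching on `'.' + host` rather than on the bare string keeps a name like `notlocalhost` from being mistaken for a subdomain of `localhost`.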

## Types

```javascript { .api }
interface ParseResult {
  hostname: string | null; // Extracted hostname from input
  isValid: boolean; // Whether hostname follows RFC 1035
  isIp: boolean; // Whether hostname is IPv4/IPv6 address
  tldExists: boolean; // Whether TLD exists in Public Suffix List
  publicSuffix: string | null; // Public suffix (effective TLD)
  domain: string | null; // Domain name (SLD + public suffix)
  subdomain: string | null; // Subdomain portion
}

class SuffixTrie {
  constructor(rules?: PlainRules); // Create trie with optional rules
  static fromJson(json: object): SuffixTrie; // Create trie from JSON rules
  hasTld(value: string): boolean; // Check if TLD exists in trie
  suffixLookup(hostname: string): string | null; // Find public suffix for hostname
  exceptions: object; // Exception rules trie
  rules: object; // Standard rules trie
}

// PlainRules is an array of rule objects
type PlainRules = {
  parts: string[]; // Domain parts in reverse order
  exception: boolean; // Whether this is an exception rule
}[];
```

## Error Handling

All tldjs functions handle invalid input gracefully:

- Invalid URLs return `null` for extracted components
- Malformed hostnames are detected via `isValid: false` in parse results
- IP addresses are properly identified and bypass TLD validation
- Unknown TLDs are handled transparently (marked as `tldExists: false`)
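Because results carry `null` fields and flags instead of throwing, calling code usually branches on them. The sketch below shows one possible ordering of those checks on any object shaped like `ParseResult`. `describeResult` is a hypothetical helper, and the sample objects are copied from the parse examples earlier in this document rather than produced by calling the library.

```javascript
// Hypothetical helper; works on any object shaped like ParseResult.
function describeResult(result) {
  if (result.hostname === null || !result.isValid) return 'invalid hostname';
  if (result.isIp) return 'IP address: ' + result.hostname;
  if (result.domain === null) return 'no domain: ' + result.hostname;
  const note = result.tldExists ? '' : ' (unknown TLD)';
  return 'domain: ' + result.domain + note;
}

// Sample objects copied from the documented parse() output.
const ipResult = { hostname: '192.168.0.1', isValid: true, isIp: true,
  tldExists: false, publicSuffix: null, domain: null, subdomain: null };
const unknownTldResult = { hostname: 'domain.unknown', isValid: true, isIp: false,
  tldExists: false, publicSuffix: 'unknown', domain: 'domain.unknown', subdomain: '' };

console.log(describeResult(ipResult)); // 'IP address: 192.168.0.1'
console.log(describeResult(unknownTldResult)); // 'domain: domain.unknown (unknown TLD)'
```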

## Performance Notes

tldjs is optimized for performance with different input types:

- **Cleaned hostnames**: ~850,000-8,700,000 ops/sec depending on function
- **Full URLs**: ~230,000-25,400,000 ops/sec depending on function
- **Lazy evaluation**: The `parse()` function uses early termination to avoid unnecessary processing
- **Custom hostname extraction**: You can provide optimized `extractHostname` functions for specialized use cases
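The early-termination idea can be sketched as a staged function that stops as soon as the requested stage is complete. Everything here is illustrative: the `STEP` constants, `parseSketch`, and the simplified stage bodies are assumptions made for this example and do not reflect the values or logic behind the `_step` parameter in tldjs.

```javascript
// Hypothetical step constants; the values tldjs uses internally are not documented here.
const STEP = { HOSTNAME: 0, IS_VALID: 1, ALL: 2 };

function parseSketch(input, step = STEP.ALL) {
  const result = { hostname: null, isValid: null, publicSuffix: null, stagesRun: [] };

  result.stagesRun.push('hostname');
  result.hostname = input.trim().toLowerCase();
  if (step === STEP.HOSTNAME) return result; // caller only needs the hostname

  result.stagesRun.push('isValid');
  result.isValid = /^[a-z0-9.-]+$/.test(result.hostname); // placeholder validity check
  if (step === STEP.IS_VALID || !result.isValid) return result; // skip the lookup when invalid

  result.stagesRun.push('publicSuffix');
  result.publicSuffix = result.hostname.split('.').pop(); // placeholder for the suffix lookup
  return result;
}

const hostnameOnly = parseSketch(' Example.COM ', STEP.HOSTNAME);
console.log(hostnameOnly.hostname); // 'example.com'
console.log(hostnameOnly.stagesRun); // [ 'hostname' ]
```

A helper that needs only one field can request the matching step and skip the later, more expensive stages.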

## Browser Compatibility

tldjs works in browsers via bundlers like browserify, webpack, and others. The library has no Node.js-specific dependencies and uses only standard JavaScript features.

## TLD List Updates

The library bundles Mozilla's Public Suffix List but supports updates:

```bash
# Update TLD rules during installation
npm install tldjs --tldjs-update-rules

# Update existing installation
npm install --tldjs-update-rules
```