# Utility Functions

Furl provides utility functions for URL validation, encoding, parsing, and manipulation. These functions can be used independently or as part of the main furl functionality.

## Capabilities

### URL Parsing and Splitting

Enhanced URL parsing functions that extend Python's standard `urllib` functionality.

```python { .api }
def urlsplit(url):
    """
    Split URL into components with enhanced parsing.

    Args:
        url (str): URL string to split

    Returns:
        SplitResult: Named tuple with URL components
    """

def urljoin(base, url):
    """
    Join base URL with relative URL using enhanced logic.

    Args:
        base (str): Base URL string
        url (str): URL string to join (can be relative or absolute)

    Returns:
        str: Joined URL string
    """
```

**Usage:**

```python
from furl import urlsplit, urljoin

# Enhanced URL splitting
result = urlsplit('https://user:pass@example.com:8080/path?query=value#fragment')
print(result.scheme)    # 'https'
print(result.netloc)    # 'user:pass@example.com:8080'
print(result.path)      # '/path'
print(result.query)     # 'query=value'
print(result.fragment)  # 'fragment'

# URL joining
base = 'https://example.com/api/v1/'
endpoint = 'users/123'
full_url = urljoin(base, endpoint)
print(full_url)  # 'https://example.com/api/v1/users/123'

# Join with absolute URL (replaces base)
absolute = 'https://different.com/other'
result = urljoin(base, absolute)
print(result)  # 'https://different.com/other'
```
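
For comparison, the sketch below uses only the standard library's `urllib.parse`, which these helpers are described as extending. It shows the baseline `SplitResult` fields and how the trailing slash on the base affects joining; it does not demonstrate furl's own implementation.

```python
from urllib.parse import urlsplit, urljoin

parts = urlsplit('https://user:pass@example.com:8080/path?query=value#fragment')

# The five positional fields of SplitResult
print(parts.scheme, parts.netloc, parts.path, parts.query, parts.fragment)

# Convenience attributes derived from netloc
print(parts.username, parts.password, parts.hostname, parts.port)

# With a trailing slash, the relative reference is appended to the base path
joined = urljoin('https://example.com/api/v1/', 'users/123')
print(joined)  # 'https://example.com/api/v1/users/123'

# Without a trailing slash, the final base segment 'v1' is replaced
replaced = urljoin('https://example.com/api/v1', 'users/123')
print(replaced)  # 'https://example.com/api/users/123'
```

The second join is a common source of surprises: whether the base ends in `/` decides whether its last segment survives.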

### Scheme Manipulation

Functions for extracting, validating, and manipulating URL schemes.

```python { .api }
def get_scheme(url):
    """
    Extract scheme from URL string.

    Args:
        url (str): URL string

    Returns:
        str | None: Scheme string or None if no scheme
    """

def strip_scheme(url):
    """
    Remove scheme from URL string.

    Args:
        url (str): URL string

    Returns:
        str: URL string without scheme
    """

def set_scheme(url, scheme):
    """
    Set or replace scheme in URL string.

    Args:
        url (str): URL string
        scheme (str): New scheme to set

    Returns:
        str: URL string with new scheme
    """

def is_valid_scheme(scheme):
    """
    Validate URL scheme format.

    Args:
        scheme (str): Scheme string to validate

    Returns:
        bool: True if scheme is valid
    """

def has_netloc(url):
    """
    Check if URL has network location component.

    Args:
        url (str): URL string to check

    Returns:
        bool: True if URL has netloc
    """
```

**Usage:**

```python
from furl import get_scheme, strip_scheme, set_scheme, is_valid_scheme, has_netloc

url = 'https://example.com/path'

# Extract scheme
scheme = get_scheme(url)
print(scheme)  # 'https'

# Remove scheme
no_scheme = strip_scheme(url)
print(no_scheme)  # '//example.com/path'

# Set new scheme
ftp_url = set_scheme(url, 'ftp')
print(ftp_url)  # 'ftp://example.com/path'

# Validate scheme
print(is_valid_scheme('https'))  # True
print(is_valid_scheme('1http'))  # False (a scheme must start with a letter)

# Check for network location
print(has_netloc('https://example.com/path'))  # True
print(has_netloc('/just/a/path'))              # False
```
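
To make the scheme rules concrete, here is a minimal standard-library sketch of the scheme grammar from RFC 3986 (a letter followed by letters, digits, `+`, `-`, or `.`). The names `looks_like_scheme` and `split_scheme` are hypothetical helpers written for this illustration, and furl's own validation may differ in detail.

```python
import re

# RFC 3986, section 3.1: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r'[A-Za-z][A-Za-z0-9+.-]*')

def looks_like_scheme(candidate):
    """Hypothetical helper: True if candidate matches the RFC 3986 scheme grammar."""
    return SCHEME_RE.fullmatch(candidate) is not None

def split_scheme(url):
    """Hypothetical helper: return (scheme or None, remainder of the URL)."""
    head, sep, tail = url.partition(':')
    if sep and looks_like_scheme(head):
        return head, tail
    return None, url

print(looks_like_scheme('git+ssh'))              # True
print(looks_like_scheme('1http'))                # False
print(split_scheme('https://example.com/path'))  # ('https', '//example.com/path')
print(split_scheme('/just/a/path'))              # (None, '/just/a/path')
```

Checking the text before the first `:` against the grammar keeps a colon inside a path or query from being mistaken for a scheme separator.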

### Path Manipulation Utilities

Functions for manipulating URL path segments and components.

```python { .api }
def join_path_segments(*args):
    """
    Join multiple lists of path segments into a single list.

    Args:
        *args: Lists of path segments to join (each a list of strings)

    Returns:
        list: Joined list of path segments
    """

def remove_path_segments(segments, remove):
    """
    Remove trailing segments from a list of path segments.

    Args:
        segments (list): List of path segments
        remove (list): Segments to remove from the end of segments

    Returns:
        list: Remaining segments, or segments unmodified if remove
            does not match its end
    """

def quacks_like_a_path_with_segments(obj):
    """
    Duck typing check for path-like objects with segments.

    Args:
        obj: Object to check

    Returns:
        bool: True if object behaves like a path with segments
    """
```

**Usage:**

```python
from furl import Path
from furl import join_path_segments, remove_path_segments, quacks_like_a_path_with_segments

# Join lists of path segments
joined = join_path_segments(['api', 'v1'], ['users', '123'])
print(joined)  # ['api', 'v1', 'users', '123']

# Remove segments from the end of a list
segments = ['', 'api', 'v1', 'users', '123']
updated = remove_path_segments(segments, ['users', '123'])
print(updated)  # ['', 'api', 'v1', ''], i.e. '/api/v1/'

# Duck typing check
path_obj = Path('/api/v1/users')
print(quacks_like_a_path_with_segments(path_obj))  # True
print(quacks_like_a_path_with_segments("string"))  # False
```
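
Working with a path as a list of decoded segments avoids ambiguity when a segment itself contains `/` or a space. The following standard-library sketch converts between an encoded path string and a segment list; `path_to_segments` and `segments_to_path` are hypothetical names used only for this illustration, not furl functions.

```python
from urllib.parse import quote, unquote

def path_to_segments(path):
    """Hypothetical helper: split an encoded path into decoded segments."""
    return [unquote(segment) for segment in path.split('/')]

def segments_to_path(segments):
    """Hypothetical helper: percent-encode each segment and join with '/'."""
    return '/'.join(quote(segment, safe='') for segment in segments)

segments = path_to_segments('/api/v1/user%20name')
print(segments)                    # ['', 'api', 'v1', 'user name']
print(segments_to_path(segments))  # '/api/v1/user%20name'

# A '/' inside a single segment is encoded instead of creating a new segment
print(segments_to_path(['a/b']))   # 'a%2Fb'
```

The leading empty string in the segment list stands for the leading `/` of an absolute path.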

### Validation Functions

Functions for validating various URL components and formats.

```python { .api }
def is_valid_host(hostname):
    """
    Validate hostname format.

    Args:
        hostname (str): Hostname to validate

    Returns:
        bool: True if hostname is valid
    """

def is_valid_port(port):
    """
    Validate port number.

    Args:
        port (int|str): Port number to validate

    Returns:
        bool: True if port is valid (1-65535)
    """

def is_valid_encoded_path_segment(segment):
    """
    Validate percent-encoded path segment.

    Args:
        segment (str): Path segment to validate

    Returns:
        bool: True if segment is properly encoded
    """

def is_valid_encoded_query_key(key):
    """
    Validate percent-encoded query parameter key.

    Args:
        key (str): Query key to validate

    Returns:
        bool: True if key is properly encoded
    """

def is_valid_encoded_query_value(value):
    """
    Validate percent-encoded query parameter value.

    Args:
        value (str): Query value to validate

    Returns:
        bool: True if value is properly encoded
    """
```

**Usage:**

```python
from furl import (is_valid_host, is_valid_port, is_valid_encoded_path_segment,
                  is_valid_encoded_query_key, is_valid_encoded_query_value)

# Validate hostname
print(is_valid_host('example.com'))      # True
print(is_valid_host('sub.example.com'))  # True
print(is_valid_host('192.168.1.1'))      # True
print(is_valid_host('invalid..host'))    # False

# Validate port
print(is_valid_port(80))     # True
print(is_valid_port('443'))  # True
print(is_valid_port(0))      # False
print(is_valid_port(99999))  # False

# Validate encoded components
print(is_valid_encoded_path_segment('users'))        # True
print(is_valid_encoded_path_segment('user%20name'))  # True
print(is_valid_encoded_path_segment('user name'))    # False (not encoded)

print(is_valid_encoded_query_key('search_term'))    # True
print(is_valid_encoded_query_key('search%20term'))  # True

print(is_valid_encoded_query_value('hello%20world'))  # True
print(is_valid_encoded_query_value('hello world'))    # False
```
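
The port and hostname rules shown above can be approximated in a few lines of plain Python. This is a sketch under stated assumptions, not furl's implementation: `port_in_range` and `host_labels_ok` are hypothetical names, and the hostname check only rejects empty labels such as the one in `invalid..host`.

```python
def port_in_range(port):
    """Hypothetical helper: True for an int or digit string in 1-65535."""
    text = str(port)
    return text.isascii() and text.isdigit() and 1 <= int(text) <= 65535

def host_labels_ok(hostname):
    """Hypothetical helper: True if no dot-separated label is empty."""
    labels = hostname.split('.')
    if labels and labels[-1] == '':
        labels.pop()  # allow the single trailing dot of a fully qualified name
    return bool(labels) and all(labels)

print(port_in_range(80))                # True
print(port_in_range('443'))             # True
print(port_in_range(0))                 # False
print(port_in_range(99999))             # False
print(host_labels_ok('example.com'))    # True
print(host_labels_ok('invalid..host'))  # False
```

Converting the port to a string first lets the same check accept both `80` and `'443'`, matching the `int|str` parameter type documented for `is_valid_port`.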

### Encoding and Decoding Utilities

Functions for handling character encoding and IDNA (Internationalized Domain Names).

```python { .api }
def utf8(obj, default=None):
    """
    Convert object to UTF-8 encoded string.

    Args:
        obj: Object to convert
        default: Default value if conversion fails

    Returns:
        str: UTF-8 encoded string
    """

def idna_encode(hostname):
    """
    Encode hostname using IDNA (Internationalized Domain Names).

    Args:
        hostname (str): Hostname to encode

    Returns:
        str: IDNA encoded hostname
    """

def idna_decode(hostname):
    """
    Decode IDNA encoded hostname.

    Args:
        hostname (str): IDNA encoded hostname

    Returns:
        str: Decoded hostname
    """

def attemptstr(obj):
    """
    Attempt to convert object to string.

    Args:
        obj: Object to convert

    Returns:
        str: String representation or original object
    """

def non_string_iterable(obj):
    """
    Check if object is iterable but not a string.

    Args:
        obj: Object to check

    Returns:
        bool: True if iterable but not string
    """
```

**Usage:**

```python
from furl import utf8, idna_encode, idna_decode, attemptstr, non_string_iterable

# UTF-8 encoding
text = utf8('Hello 世界')
print(text)  # Properly encoded UTF-8 string

# IDNA encoding for international domain names
international_domain = 'тест.example'
encoded = idna_encode(international_domain)
print(encoded)  # 'xn--e1aybc.example'

# IDNA decoding
decoded = idna_decode(encoded)
print(decoded)  # 'тест.example'

# String conversion
print(attemptstr(123))    # '123'
print(attemptstr(['a']))  # "['a']"

# Check for non-string iterables
print(non_string_iterable(['a', 'b']))  # True
print(non_string_iterable('string'))    # False
print(non_string_iterable(123))         # False
```
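
Python's built-in `idna` codec performs the same kind of conversion, which may help when reasoning about what IDNA encoding does to a hostname. The sketch below uses only built-in string methods and makes no claim about how furl implements `idna_encode` or `idna_decode`.

```python
hostname = 'тест.example'

# str.encode('idna') returns ASCII bytes; decode them to get a text hostname
ascii_hostname = hostname.encode('idna').decode('ascii')
print(ascii_hostname)

# bytes.decode('idna') reverses the conversion
round_tripped = ascii_hostname.encode('ascii').decode('idna')
print(round_tripped == hostname)  # True

# Empty labels (adjacent dots) are rejected with a UnicodeError
rejected = False
try:
    'invalid..domain'.encode('idna')
except UnicodeError as exc:
    rejected = True
    print(f'IDNA error: {exc}')
```

Only labels containing non-ASCII characters are rewritten with the `xn--` prefix; the ASCII label `example` passes through unchanged.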

### Helper Functions

Additional utility functions for common operations.

```python { .api }
def lget(lst, index, default=None):
    """
    Safe list index access with default value.

    Args:
        lst (list): List to access
        index (int): Index to access
        default: Default value if index out of bounds

    Returns:
        Any: List item or default value
    """

def static_vars(**kwargs):
    """
    Decorator to add static variables to functions.

    Args:
        **kwargs: Static variables to add

    Returns:
        function: Decorated function with static variables
    """

def create_quote_fn(safe_charset, quote_plus):
    """
    Create custom URL quoting function.

    Args:
        safe_charset (str): Characters considered safe (not to quote)
        quote_plus (bool): Use '+' for spaces instead of '%20'

    Returns:
        function: Custom quoting function
    """
```

**Usage:**

```python
from furl import lget

# Safe list access
items = ['a', 'b', 'c']
print(lget(items, 1))              # 'b'
print(lget(items, 10))             # None
print(lget(items, 10, 'default'))  # 'default'

# Safe access with empty list
empty = []
print(lget(empty, 0, 'fallback'))  # 'fallback'
```
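
The usage above covers only `lget`. To illustrate the ideas behind the other two helpers, here is a standard-library sketch of a decorator that attaches attributes to a function and of a factory that returns a quoting function. `static_vars_sketch` and `make_quote_fn` are hypothetical names, and the one-argument signature of the returned quoting function is an assumption for this example, not furl's documented behavior.

```python
from urllib.parse import quote, quote_plus

def static_vars_sketch(**kwargs):
    """Attach each keyword argument to the decorated function as an attribute."""
    def decorator(func):
        for name, value in kwargs.items():
            setattr(func, name, value)
        return func
    return decorator

@static_vars_sketch(calls=0)
def counted_quote(text):
    # The attribute lives on the function object and persists across calls
    counted_quote.calls += 1
    return quote(text, safe='')

def make_quote_fn(safe_charset, use_plus):
    """Hypothetical factory: return a one-argument quoting function."""
    quoter = quote_plus if use_plus else quote
    return lambda text: quoter(text, safe=safe_charset)

print(counted_quote('a b'))                  # 'a%20b'
print(counted_quote.calls)                   # 1
print(make_quote_fn('/', False)('a b/c'))    # 'a%20b/c'
print(make_quote_fn('', True)('a b/c'))      # 'a+b%2Fc'
```

Storing state on the function object avoids a module-level global while keeping the value accessible to callers and tests.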

## Constants and Configuration

### Default Ports Mapping

Dictionary mapping URL schemes to their default ports.

```python { .api }
DEFAULT_PORTS = {
    'http': 80,
    'https': 443,
    'ftp': 21,
    'ssh': 22,
    'telnet': 23,
    # ... and 34 more common protocols
}
```

**Usage:**

```python
from furl import DEFAULT_PORTS

# Check default port for scheme
print(DEFAULT_PORTS.get('https'))    # 443
print(DEFAULT_PORTS.get('ftp'))      # 21
print(DEFAULT_PORTS.get('unknown'))  # None

# List all supported schemes
print(list(DEFAULT_PORTS.keys()))
```
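
A typical use for such a mapping is to omit the port from a URL when it matches the scheme's default. The sketch below uses a small stand-in dictionary so that it runs without furl installed; `SAMPLE_DEFAULT_PORTS` and `build_netloc` are hypothetical names for this illustration.

```python
# Stand-in subset of the mapping; the real dictionary covers more schemes
SAMPLE_DEFAULT_PORTS = {'http': 80, 'https': 443, 'ftp': 21, 'ssh': 22, 'telnet': 23}

def build_netloc(scheme, host, port=None):
    """Hypothetical helper: include the port only when it is not the default."""
    default = SAMPLE_DEFAULT_PORTS.get(scheme)
    if port is None or port == default:
        return host
    return f'{host}:{port}'

print(build_netloc('https', 'example.com', 443))   # 'example.com'
print(build_netloc('https', 'example.com', 8443))  # 'example.com:8443'
print(build_netloc('gopher', 'example.com'))       # 'example.com'
```

Using `.get()` means an unknown scheme has no default, so any explicit port is kept.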

### Validation Patterns

Regular expression patterns used for validation.

```python { .api }
PERCENT_REGEX = r'\%[a-fA-F\d][a-fA-F\d]'  # Pattern for percent-encoded chars
INVALID_HOST_CHARS = '!@#$%^&\'"*()+=:;/'  # Invalid characters in hostnames
```
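
These constants can be used directly with the `re` module and ordinary string operations. The sketch below copies both values from the listing in this section so that it runs on its own; `invalid_host_chars` is a hypothetical helper, not part of furl.

```python
import re

# Both constants are copied from the listing in this section
PERCENT_REGEX = r'\%[a-fA-F\d][a-fA-F\d]'
INVALID_HOST_CHARS = '!@#$%^&\'"*()+=:;/'

text = 'name=J%C3%BCrgen%20M&ratio=100%'

# Every well-formed percent-escape in the text
escapes = re.findall(PERCENT_REGEX, text)
print(escapes)  # ['%C3', '%BC', '%20']

# A '%' left over after removing the escapes is not a valid escape
stray_percents = re.sub(PERCENT_REGEX, '', text).count('%')
print(stray_percents)  # 1

def invalid_host_chars(hostname):
    """Hypothetical helper: list the characters of hostname found in INVALID_HOST_CHARS."""
    return [ch for ch in hostname if ch in INVALID_HOST_CHARS]

print(invalid_host_chars('example.com:80'))  # [':']
```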

## Error Handling

Utility functions handle various error conditions gracefully:

- **Invalid input types**: Functions handle unexpected input types
- **Encoding errors**: Proper handling of Unicode and encoding issues
- **Malformed URLs**: Graceful handling of malformed URL strings
- **IDNA errors**: IDNA encoding/decoding errors
- **Validation failures**: Validation functions signal failure by returning `False`

```python
from furl import idna_encode, is_valid_host, utf8

# Handle invalid input gracefully
print(is_valid_host('invalid..host'))  # False
print(utf8(None, 'fallback'))          # 'fallback'

# Handle encoding errors
try:
    result = idna_encode('invalid..domain')
except Exception as e:
    print(f"Encoding error: {e}")
```
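
When validating untrusted input whose type is not guaranteed, a defensive wrapper can turn any exception from a validator into a plain "invalid" result. The sketch below is one hypothetical way to do this; `safe_validate` and the stand-in `strict_host_check` are not furl functions.

```python
def safe_validate(validator, value):
    """Hypothetical wrapper: treat any exception from validator as 'invalid'."""
    try:
        return bool(validator(value))
    except Exception:
        return False

def strict_host_check(hostname):
    # Stand-in validator that assumes a string and fails on other types
    return '' not in hostname.split('.')

print(safe_validate(strict_host_check, 'example.com'))    # True
print(safe_validate(strict_host_check, 'invalid..host'))  # False
print(safe_validate(strict_host_check, None))             # False
```

The wrapper trades detail for safety: callers learn that a value is invalid but not why, so it suits filtering input better than debugging.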