Tessl Tile for pypi/tldextract@5.3.0

# Result Processing

The `ExtractResult` dataclass holds the outcome of every extraction. Its properties reconstruct domain names, detect IP addresses, and expose metadata about the parsed URL components.

## Capabilities

### ExtractResult Structure

The core data structure returned by all extraction operations, containing the parsed URL components and metadata.

```python { .api }
from dataclasses import dataclass, field

@dataclass(order=True)
class ExtractResult:
    subdomain: str
    """All subdomains beneath the domain, empty string if none"""

    domain: str
    """The topmost domain name, or hostname-like content if no valid domain"""

    suffix: str
    """The public suffix (TLD), empty string if none or invalid"""

    is_private: bool
    """Whether the suffix belongs to PSL private domains"""

    registry_suffix: str = field(repr=False)
    """The registry suffix, unaffected by include_psl_private_domains setting"""
```

**Basic Usage:**

```python
import tldextract

result = tldextract.extract('http://forums.news.cnn.com/')
print(f"Subdomain: '{result.subdomain}'")  # 'forums.news'
print(f"Domain: '{result.domain}'")        # 'cnn'
print(f"Suffix: '{result.suffix}'")        # 'com'
print(f"Is Private: {result.is_private}")  # False
```

### Domain Reconstruction

Properties for reconstructing various forms of the original domain name from the parsed components.

```python { .api }
@property
def fqdn(self) -> str:
    """
    Fully Qualified Domain Name if there is a proper domain and suffix.

    Returns:
    Complete domain name or empty string if invalid
    """

@property
def top_domain_under_public_suffix(self) -> str:
    """
    Domain and suffix joined with a dot if both are present.

    Returns:
    Registered domain name or empty string if invalid
    """

@property
def top_domain_under_registry_suffix(self) -> str:
    """
    Top domain under registry suffix, handling PSL private domains.

    Returns:
    Registry domain name or empty string if invalid
    """

@property
def registered_domain(self) -> str:
    """
    DEPRECATED: Use top_domain_under_public_suffix instead.

    Returns:
    Same as top_domain_under_public_suffix
    """
```

**Usage Examples:**

```python
import tldextract

# Standard domain reconstruction
result = tldextract.extract('http://forums.bbc.co.uk/path')
print(result.fqdn)  # 'forums.bbc.co.uk'
print(result.top_domain_under_public_suffix)  # 'bbc.co.uk'

# No subdomain
result = tldextract.extract('google.com')
print(result.fqdn)  # 'google.com'
print(result.top_domain_under_public_suffix)  # 'google.com'

# Invalid domain (IP address)
result = tldextract.extract('http://127.0.0.1:8080')
print(result.fqdn)  # '' (empty string)
print(result.top_domain_under_public_suffix)  # '' (empty string)

# Private domain handling
result = tldextract.extract('waiterrant.blogspot.com', include_psl_private_domains=True)
print(result.top_domain_under_public_suffix)  # 'waiterrant.blogspot.com'
print(result.top_domain_under_registry_suffix)  # 'blogspot.com'
```
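
The joining rules behind these properties can be sketched in plain Python. The `Parts` class and its `top_domain` property below are hypothetical stand-ins, not tldextract's `ExtractResult`; they only mirror the documented behaviour that each property returns an empty string unless both a domain and a suffix are present.

```python
from dataclasses import dataclass


@dataclass
class Parts:
    """Hypothetical stand-in for the three main ExtractResult fields."""

    subdomain: str
    domain: str
    suffix: str

    @property
    def top_domain(self) -> str:
        # Domain and suffix joined with a dot, only if both are present.
        if self.domain and self.suffix:
            return f"{self.domain}.{self.suffix}"
        return ""

    @property
    def fqdn(self) -> str:
        # Non-empty components joined with dots, only for a proper domain and suffix.
        if self.domain and self.suffix:
            return ".".join(p for p in (self.subdomain, self.domain, self.suffix) if p)
        return ""


print(Parts("forums", "bbc", "co.uk").fqdn)    # forums.bbc.co.uk
print(Parts("", "google", "com").top_domain)   # google.com
print(repr(Parts("", "127.0.0.1", "").fqdn))   # ''
```

Returning an empty string rather than raising lets callers test the result with a plain truthiness check before using it.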

### IP Address Detection

Properties for detecting and extracting IP addresses from the parsed results.

```python { .api }
@property
def ipv4(self) -> str:
    """
    IPv4 address if input was a valid IPv4, empty string otherwise.

    Returns:
    IPv4 address string or empty string
    """

@property
def ipv6(self) -> str:
    """
    IPv6 address if input was a valid IPv6, empty string otherwise.

    Returns:
    IPv6 address string or empty string
    """
```

**Usage Examples:**

```python
import tldextract

# IPv4 detection
result = tldextract.extract('http://192.168.1.1:8080/path')
print(result.ipv4)  # '192.168.1.1'
print(result.ipv6)  # ''
print(result.domain)  # '192.168.1.1'
print(result.suffix)  # ''

# IPv6 detection
result = tldextract.extract('http://[2001:db8::1]/path')
print(result.ipv4)  # ''
print(result.ipv6)  # '2001:db8::1'
print(result.domain)  # '[2001:db8::1]'

# Invalid IP addresses fall through to ordinary label splitting
result = tldextract.extract('http://256.1.1.1/')  # Octet out of range
print(result.ipv4)  # ''
print(result.subdomain)  # '256.1.1'
print(result.domain)  # '1'

result = tldextract.extract('http://127.0.0.1.1/')  # Five labels, not an IPv4
print(result.ipv4)  # ''
print(result.subdomain)  # '127.0.0.1'
print(result.domain)  # '1'
```
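
To see why the last two inputs above are not treated as IPv4 addresses, the standard library's `ipaddress` module can serve as an independent check. This is only an illustration: the helper name `is_valid_ipv4` is made up here, and tldextract's internal validation may differ in its details.

```python
import ipaddress


def is_valid_ipv4(host: str) -> bool:
    """Return True if host parses as a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(host)
    except ValueError:
        # Raised for out-of-range octets or the wrong number of octets.
        return False
    return True


for host in ("192.168.1.1", "256.1.1.1", "127.0.0.1.1"):
    print(host, is_valid_ipv4(host))
```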

### Domain Name Formatting

Property for converting domain names to reverse DNS notation, commonly used in package naming and namespace organization.

```python { .api }
@property
def reverse_domain_name(self) -> str:
    """
    Domain name in reverse DNS notation.

    Joins components as: suffix.domain.reversed_subdomain_parts

    Returns:
    Reverse domain name string
    """
```

**Usage Examples:**

```python
import tldextract

# Simple domain
result = tldextract.extract('login.example.com')
print(result.reverse_domain_name)  # 'com.example.login'

# Complex subdomain
result = tldextract.extract('api.v2.auth.example.com')
print(result.reverse_domain_name)  # 'com.example.auth.v2.api'

# Country code TLD
result = tldextract.extract('login.example.co.uk')
print(result.reverse_domain_name)  # 'co.uk.example.login'

# No subdomain
result = tldextract.extract('example.com')
print(result.reverse_domain_name)  # 'com.example'
```
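
The joining rule stated in the docstring (suffix, then domain, then the subdomain labels reversed) can be written as a small standalone function. This is a sketch of the documented rule operating on plain strings, not a copy of the library's code.

```python
def reverse_domain_name(subdomain: str, domain: str, suffix: str) -> str:
    """Join suffix, domain, then the subdomain labels in reverse order."""
    parts = [suffix, domain]
    if subdomain:
        # Only the subdomain labels are reversed; the suffix keeps its order.
        parts.extend(reversed(subdomain.split(".")))
    return ".".join(parts)


print(reverse_domain_name("api.v2.auth", "example", "com"))  # com.example.auth.v2.api
print(reverse_domain_name("login", "example", "co.uk"))      # co.uk.example.login
print(reverse_domain_name("", "example", "com"))             # com.example
```

Note that a multi-label suffix such as `co.uk` is kept intact rather than flipped to `uk.co`, which matches the `co.uk.example.login` example above.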

## Private Domain Handling

Understanding how PSL private domains affect the result structure and property values.

### Default Behavior (include_psl_private_domains=False)

```python
import tldextract

# Default: private domains treated as regular domains
result = tldextract.extract('waiterrant.blogspot.com')
print(result.subdomain)  # 'waiterrant'
print(result.domain)     # 'blogspot'
print(result.suffix)     # 'com'
print(result.is_private) # False
print(result.registry_suffix)  # 'com'
print(result.top_domain_under_public_suffix)    # 'blogspot.com'
print(result.top_domain_under_registry_suffix)  # 'blogspot.com'
```

### Private Domains Enabled (include_psl_private_domains=True)

```python
import tldextract

# Private domains included in suffix
result = tldextract.extract('waiterrant.blogspot.com', include_psl_private_domains=True)
print(result.subdomain)  # ''
print(result.domain)     # 'waiterrant'
print(result.suffix)     # 'blogspot.com'
print(result.is_private) # True
print(result.registry_suffix)  # 'com'
print(result.top_domain_under_public_suffix)    # 'waiterrant.blogspot.com'
print(result.top_domain_under_registry_suffix)  # 'blogspot.com'
```
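
The difference between the two modes comes down to which suffix rules are allowed to match. The toy model below illustrates that idea with a two-entry rule table. Everything in it is an assumption made for illustration: `RULES` and `split_host` are hypothetical names, the table is not the real Public Suffix List, and wildcard and exception rules are not modelled.

```python
from typing import Tuple

# Toy rule table: suffix -> True if it comes from the PSL private section.
# These two entries are illustrative only, not the real Public Suffix List.
RULES = {"com": False, "blogspot.com": True}


def split_host(host: str, include_private: bool = False) -> Tuple[str, str, str, bool]:
    """Return (subdomain, domain, suffix, is_private) using the longest allowed rule."""
    labels = host.split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        private = RULES.get(candidate)
        if private is None or (private and not include_private):
            continue
        # Scanning left to right, the first hit is the longest allowed suffix.
        domain = labels[i - 1] if i else ""
        subdomain = ".".join(labels[: i - 1]) if i >= 2 else ""
        return subdomain, domain, candidate, private
    # No rule matched: treat the last label as the domain, with no suffix.
    return ".".join(labels[:-1]), labels[-1], "", False


print(split_host("waiterrant.blogspot.com"))
print(split_host("waiterrant.blogspot.com", include_private=True))
```

With private rules excluded, `com` is the longest match and `blogspot` becomes the domain; with them included, `blogspot.com` wins and `waiterrant` moves from subdomain to domain.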

## Edge Cases and Special Handling

### Invalid Suffixes

When the input domain doesn't have a recognized public suffix:

```python
import tldextract

result = tldextract.extract('google.notavalidsuffix')
print(result.subdomain)  # 'google'
print(result.domain)     # 'notavalidsuffix'
print(result.suffix)     # ''
print(result.fqdn)       # ''
```

### Localhost and Private Networks

```python
import tldextract

result = tldextract.extract('http://localhost:8080')
print(result.subdomain)  # ''
print(result.domain)     # 'localhost'
print(result.suffix)     # ''
print(result.fqdn)       # ''

result = tldextract.extract('http://intranet.corp')
print(result.subdomain)  # 'intranet'
print(result.domain)     # 'corp'
print(result.suffix)     # ''
```

### Punycode/IDN Domains

International domain names are automatically handled:

```python
import tldextract

# Punycode input is decoded internally for the suffix lookup
result = tldextract.extract('http://xn--n3h.com')  # ☃.com
print(result.domain)  # 'xn--n3h' (returned in its input form)

# Unicode domains work directly
result = tldextract.extract('http://münchen.de')
print(result.domain)     # 'münchen'
print(result.suffix)     # 'de'
```

## Comparison and Sorting

`ExtractResult` objects support comparison and sorting operations:

```python
import tldextract

results = [
    tldextract.extract('b.example.com'),
    tldextract.extract('a.example.com'),
    tldextract.extract('c.example.org')
]

# Results are sortable (order=True in dataclass)
sorted_results = sorted(results)
for result in sorted_results:
    print(result.fqdn)
# Fields are compared in declaration order (subdomain first), so this prints:
# a.example.com, b.example.com, c.example.org

# Equality comparison
result1 = tldextract.extract('example.com')
result2 = tldextract.extract('http://example.com/')
print(result1 == result2)  # True - same parsed components
```
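
Because `order=True` compares fields in declaration order, the subdomain decides the ordering before the domain or suffix are consulted. The sketch below shows the consequence with a hypothetical stand-in class, `OrderedParts`, built from the standard library only; it is not tldextract's `ExtractResult`.

```python
from dataclasses import dataclass


@dataclass(order=True)
class OrderedParts:
    """Hypothetical stand-in with the same leading field order as ExtractResult."""

    subdomain: str
    domain: str
    suffix: str


items = [
    OrderedParts("b", "example", "com"),
    OrderedParts("a", "example", "com"),
    OrderedParts("", "zzz", "org"),
]
ordered = sorted(items)

# The entry with an empty subdomain sorts first, even though its domain is 'zzz'.
print([f"{p.subdomain or '(none)'} / {p.domain}.{p.suffix}" for p in ordered])
```

If ordering by registered domain is what is needed, passing an explicit `key` to `sorted` is the usual approach.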

## String Representation

`ExtractResult` provides a readable string representation:

```python
import tldextract

result = tldextract.extract('http://forums.news.cnn.com/')
print(result)
# ExtractResult(subdomain='forums.news', domain='cnn', suffix='com', is_private=False)

print(repr(result))
# Same output as print(result)
```
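
The representation above omits `registry_suffix` because that field is declared with `field(repr=False)`. The effect can be reproduced with a small standard-library example. `ReprDemo` is a hypothetical class used only for illustration, and the default value on its second field is an addition made here to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class ReprDemo:
    """Hypothetical class showing how repr=False hides a field from repr()."""

    suffix: str
    registry_suffix: str = field(default="", repr=False)


demo = ReprDemo(suffix="blogspot.com", registry_suffix="com")
print(demo)                  # ReprDemo(suffix='blogspot.com')
print(demo.registry_suffix)  # com
```

The hidden field is still a normal attribute and, by default, still takes part in equality comparison.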
