# Result Processing

The `ExtractResult` dataclass holds the parsed components of a URL and exposes properties for reconstructing domains, detecting IP addresses, and reading suffix metadata.

## Capabilities

### ExtractResult Structure

`ExtractResult` is the data structure returned by every extraction call. It contains the parsed URL components and their metadata.

```python { .api }
from dataclasses import dataclass, field

@dataclass(order=True)
class ExtractResult:
    subdomain: str
    """All subdomains beneath the domain, empty string if none"""

    domain: str
    """The topmost domain name, or hostname-like content if no valid domain"""

    suffix: str
    """The public suffix (TLD), empty string if none or invalid"""

    is_private: bool
    """Whether the suffix belongs to PSL private domains"""

    registry_suffix: str = field(repr=False)
    """The registry suffix, unaffected by include_psl_private_domains setting"""
```

**Basic Usage:**

```python
import tldextract

result = tldextract.extract('http://forums.news.cnn.com/')
print(f"Subdomain: '{result.subdomain}'")  # 'forums.news'
print(f"Domain: '{result.domain}'")        # 'cnn'
print(f"Suffix: '{result.suffix}'")        # 'com'
print(f"Is Private: {result.is_private}")  # False
```

### Domain Reconstruction

These properties reconstruct different forms of the original domain name from the parsed components.

```python { .api }
@property
def fqdn(self) -> str:
    """
    Fully Qualified Domain Name if there is a proper domain and suffix.

    Returns:
        Complete domain name or empty string if invalid
    """

@property
def top_domain_under_public_suffix(self) -> str:
    """
    Domain and suffix joined with a dot if both are present.

    Returns:
        Registered domain name or empty string if invalid
    """

@property
def top_domain_under_registry_suffix(self) -> str:
    """
    Top domain under registry suffix, handling PSL private domains.

    Returns:
        Registry domain name or empty string if invalid
    """

@property
def registered_domain(self) -> str:
    """
    DEPRECATED: Use top_domain_under_public_suffix instead.

    Returns:
        Same as top_domain_under_public_suffix
    """
```

**Usage Examples:**

```python
import tldextract

# Standard domain reconstruction
result = tldextract.extract('http://forums.bbc.co.uk/path')
print(result.fqdn)                            # 'forums.bbc.co.uk'
print(result.top_domain_under_public_suffix)  # 'bbc.co.uk'

# No subdomain
result = tldextract.extract('google.com')
print(result.fqdn)                            # 'google.com'
print(result.top_domain_under_public_suffix)  # 'google.com'

# IP address: there is no suffix, so both properties are empty
result = tldextract.extract('http://127.0.0.1:8080')
print(result.fqdn)                            # '' (empty string)
print(result.top_domain_under_public_suffix)  # '' (empty string)

# Private domain handling
result = tldextract.extract('waiterrant.blogspot.com', include_psl_private_domains=True)
print(result.top_domain_under_public_suffix)    # 'waiterrant.blogspot.com'
print(result.top_domain_under_registry_suffix)  # 'blogspot.com'
```
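
The reconstruction rules above can be approximated with plain string joins. This is a minimal sketch rather than tldextract's implementation: the helper names `join_fqdn` and `join_registered_domain` are hypothetical, and the logic assumes only the behaviour shown in the examples (an empty string whenever the domain or suffix is missing).

```python
def join_fqdn(subdomain: str, domain: str, suffix: str) -> str:
    """Hypothetical helper: join the non-empty parts, or '' without a domain and suffix."""
    if not (domain and suffix):
        return ""
    return ".".join(part for part in (subdomain, domain, suffix) if part)


def join_registered_domain(domain: str, suffix: str) -> str:
    """Hypothetical helper: domain and suffix joined with a dot if both are present."""
    if not (domain and suffix):
        return ""
    return f"{domain}.{suffix}"


print(join_fqdn("forums", "bbc", "co.uk"))     # forums.bbc.co.uk
print(join_registered_domain("bbc", "co.uk"))  # bbc.co.uk
print(repr(join_fqdn("", "127.0.0.1", "")))    # ''
```

Returning an empty string instead of raising keeps the helpers safe to use in boolean checks such as `if join_fqdn(...):`.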

### IP Address Detection

The `ipv4` and `ipv6` properties return the address when the input host is a valid IP address, and an empty string otherwise.

```python { .api }
@property
def ipv4(self) -> str:
    """
    IPv4 address if input was a valid IPv4, empty string otherwise.

    Returns:
        IPv4 address string or empty string
    """

@property
def ipv6(self) -> str:
    """
    IPv6 address if input was a valid IPv6, empty string otherwise.

    Returns:
        IPv6 address string or empty string
    """
```

**Usage Examples:**

```python
import tldextract

# IPv4 detection
result = tldextract.extract('http://192.168.1.1:8080/path')
print(result.ipv4)    # '192.168.1.1'
print(result.ipv6)    # ''
print(result.domain)  # '192.168.1.1'
print(result.suffix)  # ''

# IPv6 detection
result = tldextract.extract('http://[2001:db8::1]/path')
print(result.ipv4)    # ''
print(result.ipv6)    # '2001:db8::1'
print(result.domain)  # '[2001:db8::1]'

# Invalid IP addresses are parsed like any other hostname without a suffix:
# the last label becomes the domain and the rest becomes the subdomain
result = tldextract.extract('http://256.1.1.1/')  # Invalid IPv4
print(result.ipv4)    # ''
print(result.domain)  # '1' (subdomain is '256.1.1')

result = tldextract.extract('http://127.0.0.1.1/')  # Invalid format
print(result.ipv4)    # ''
print(result.domain)  # '1' (subdomain is '127.0.0.1')
```
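
To show what "valid" means in the examples above, the same checks can be sketched with the standard library's `ipaddress` module. The helpers `as_ipv4` and `as_ipv6` are hypothetical names, and this is an assumption about equivalent behaviour: tldextract performs its own validation, which may treat some edge cases differently.

```python
import ipaddress


def as_ipv4(host: str) -> str:
    """Hypothetical helper: return host if it parses as IPv4, else ''."""
    try:
        ipaddress.IPv4Address(host)
    except ValueError:  # AddressValueError is a ValueError subclass
        return ""
    return host


def as_ipv6(host: str) -> str:
    """Hypothetical helper: return the address inside [brackets] if it parses as IPv6, else ''."""
    if not (host.startswith("[") and host.endswith("]")):
        return ""
    inner = host[1:-1]
    try:
        ipaddress.IPv6Address(inner)
    except ValueError:
        return ""
    return inner


print(as_ipv4("192.168.1.1"))          # 192.168.1.1
print(repr(as_ipv4("256.1.1.1")))      # ''
print(as_ipv6("[2001:db8::1]"))        # 2001:db8::1
```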

### Domain Name Formatting

The `reverse_domain_name` property converts the parsed name to reverse domain name notation, which is commonly used for package names and namespaces.

```python { .api }
@property
def reverse_domain_name(self) -> str:
    """
    Domain name in reverse DNS notation.

    Joins components as: suffix.domain.reversed_subdomain_parts

    Returns:
        Reverse domain name string
    """
```

**Usage Examples:**

```python
import tldextract

# Simple domain
result = tldextract.extract('login.example.com')
print(result.reverse_domain_name)  # 'com.example.login'

# Complex subdomain
result = tldextract.extract('api.v2.auth.example.com')
print(result.reverse_domain_name)  # 'com.example.auth.v2.api'

# Country code TLD
result = tldextract.extract('login.example.co.uk')
print(result.reverse_domain_name)  # 'co.uk.example.login'

# No subdomain
result = tldextract.extract('example.com')
print(result.reverse_domain_name)  # 'com.example'
```
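
The joining rule from the docstring (suffix, then domain, then the subdomain labels in reverse) can be sketched as a standalone function. `reverse_notation` is a hypothetical name used only for illustration; note that the suffix is kept intact, so `co.uk` is not reversed.

```python
def reverse_notation(subdomain: str, domain: str, suffix: str) -> str:
    """Hypothetical helper: suffix, domain, then subdomain labels reversed."""
    parts = [suffix, domain]
    if subdomain:
        # 'api.v2.auth' becomes ['auth', 'v2', 'api']
        parts.extend(reversed(subdomain.split(".")))
    return ".".join(part for part in parts if part)


print(reverse_notation("login", "example", "com"))        # com.example.login
print(reverse_notation("api.v2.auth", "example", "com"))  # com.example.auth.v2.api
print(reverse_notation("login", "example", "co.uk"))      # co.uk.example.login
```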

## Private Domain Handling

PSL private domains change both the parsed fields and the derived properties, depending on the `include_psl_private_domains` setting.

### Default Behavior (include_psl_private_domains=False)

```python
import tldextract

# Default: private suffixes are ignored, so 'blogspot' is parsed as the domain
result = tldextract.extract('waiterrant.blogspot.com')
print(result.subdomain)        # 'waiterrant'
print(result.domain)           # 'blogspot'
print(result.suffix)           # 'com'
print(result.is_private)       # False
print(result.registry_suffix)  # 'com'
print(result.top_domain_under_public_suffix)    # 'blogspot.com'
print(result.top_domain_under_registry_suffix)  # 'blogspot.com'
```

### Private Domains Enabled (include_psl_private_domains=True)

```python
import tldextract

# Private domains included in suffix
result = tldextract.extract('waiterrant.blogspot.com', include_psl_private_domains=True)
print(result.subdomain)        # ''
print(result.domain)           # 'waiterrant'
print(result.suffix)           # 'blogspot.com'
print(result.is_private)       # True
print(result.registry_suffix)  # 'com'
print(result.top_domain_under_public_suffix)    # 'waiterrant.blogspot.com'
print(result.top_domain_under_registry_suffix)  # 'blogspot.com'
```
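
The relationship between the two properties in these examples can be sketched from the fields alone. This is an illustrative assumption about how the values relate, not the library's code, and `registry_domain` is a hypothetical name: when the suffix is private, the result keeps only the registry suffix plus one label to its left.

```python
def registry_domain(domain: str, suffix: str, registry_suffix: str, is_private: bool) -> str:
    """Hypothetical helper: top domain under the registry suffix, or '' if invalid."""
    if not (domain and suffix):
        return ""
    public = f"{domain}.{suffix}"
    if not is_private:
        # Public and registry suffix are the same, so nothing needs trimming
        return public
    # Keep the registry suffix labels plus the one label directly beneath them
    keep = registry_suffix.count(".") + 2
    return ".".join(public.split(".")[-keep:])


# Private domains enabled: suffix is 'blogspot.com', registry suffix is 'com'
print(registry_domain("waiterrant", "blogspot.com", "com", True))  # blogspot.com
# Default behaviour: suffix and registry suffix are both 'com'
print(registry_domain("blogspot", "com", "com", False))            # blogspot.com
```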

## Edge Cases and Special Handling

### Invalid Suffixes

When the input domain doesn't have a recognized public suffix:

```python
import tldextract

result = tldextract.extract('google.notavalidsuffix')
print(result.subdomain)  # 'google'
print(result.domain)     # 'notavalidsuffix'
print(result.suffix)     # ''
print(result.fqdn)       # ''
```

### Localhost and Private Networks

```python
import tldextract

result = tldextract.extract('http://localhost:8080')
print(result.subdomain)  # ''
print(result.domain)     # 'localhost'
print(result.suffix)     # ''
print(result.fqdn)       # ''

result = tldextract.extract('http://intranet.corp')
print(result.subdomain)  # 'intranet'
print(result.domain)     # 'corp'
print(result.suffix)     # ''
```

### Punycode/IDN Domains

Internationalized domain names (IDNs) are accepted in both Punycode and Unicode form:

```python
import tldextract

# Punycode input is decoded internally for the suffix lookup
result = tldextract.extract('http://xn--n3h.com')  # ☃.com
print(result.domain)  # 'xn--n3h' (returned in the form it was given)

# Unicode domains work directly
result = tldextract.extract('http://münchen.de')
print(result.domain)  # 'münchen'
print(result.suffix)  # 'de'
```

## Comparison and Sorting

`ExtractResult` objects support comparison and sorting operations:

```python
import tldextract

results = [
    tldextract.extract('b.example.com'),
    tldextract.extract('a.example.com'),
    tldextract.extract('c.example.org')
]

# Results are sortable (order=True in dataclass)
sorted_results = sorted(results)
for result in sorted_results:
    print(result.fqdn)
# Fields are compared in declaration order (subdomain first), so this prints
# a.example.com, b.example.com, c.example.org

# Equality comparison
result1 = tldextract.extract('example.com')
result2 = tldextract.extract('http://example.com/')
print(result1 == result2)  # True - same parsed components
```
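
Because `order=True` compares fields as a tuple in declaration order, the subdomain dominates the sort. The sketch below uses a hypothetical stand-in class, `Parts`, with the same leading fields as `ExtractResult`, to show the default order and how an explicit `key` groups by domain instead.

```python
from dataclasses import dataclass


@dataclass(order=True)
class Parts:
    """Hypothetical stand-in with the same leading fields as ExtractResult."""
    subdomain: str
    domain: str
    suffix: str


def name(p: Parts) -> str:
    return f"{p.subdomain}.{p.domain}.{p.suffix}"


items = [
    Parts("b", "example", "com"),
    Parts("a", "example", "org"),
    Parts("a", "example", "com"),
]

# Default ordering: subdomain is compared first, then domain, then suffix
default_order = [name(p) for p in sorted(items)]
print(default_order)  # ['a.example.com', 'a.example.org', 'b.example.com']

# Explicit key: group by domain and suffix before looking at the subdomain
by_domain = [name(p) for p in sorted(items, key=lambda p: (p.domain, p.suffix, p.subdomain))]
print(by_domain)      # ['a.example.com', 'b.example.com', 'a.example.org']
```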

## String Representation

`ExtractResult` has a readable string representation, which omits `registry_suffix`:

```python
import tldextract

result = tldextract.extract('http://forums.news.cnn.com/')
print(result)
# ExtractResult(subdomain='forums.news', domain='cnn', suffix='com', is_private=False)

print(repr(result))
# Same output: str() falls back to the dataclass repr
```
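
The missing `registry_suffix` in that output follows from `field(repr=False)` in the dataclass definition. A minimal sketch with a hypothetical stand-in class, `Parts`, shows the effect: the field is still stored and readable, but it is left out of the generated `repr`.

```python
from dataclasses import dataclass, field


@dataclass
class Parts:
    """Hypothetical stand-in: one field is hidden from the generated repr."""
    domain: str
    suffix: str
    registry_suffix: str = field(repr=False)


p = Parts("cnn", "com", "com")
print(p)                  # Parts(domain='cnn', suffix='com')
print(p.registry_suffix)  # com
```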