# YARL - Yet Another URL Library

A Python library for parsing, manipulating, and constructing URLs through a clean, immutable API. YARL offers performance-optimized URL handling with support for all URL components, query parameters, path operations, and standards-based encoding and decoding.

## Package Information

- **Package Name**: yarl
- **Language**: Python
- **Installation**: `pip install yarl`
- **Python Requirements**: >=3.9
- **Dependencies**: idna >= 2.0, multidict >= 4.0, propcache >= 0.2.1

## Core Imports

```python
from yarl import URL
```

For query handling:

```python
from yarl import URL, Query, SimpleQuery, QueryVariable
```

For accessing query results:

```python
from multidict import MultiDictProxy  # URL.query returns MultiDictProxy[str]
```

For cache management:

```python
from yarl import URL, cache_clear, cache_configure, cache_info
```

## Basic Usage

```python
from yarl import URL

# Create URLs from strings
url = URL('https://user:password@example.com:8080/path/to/resource?query=value&multi=1&multi=2#fragment')

# Access URL components
print(url.scheme)    # 'https'
print(url.host)      # 'example.com'
print(url.port)      # 8080
print(url.path)      # '/path/to/resource'
print(url.query)     # MultiDictProxy with query parameters
print(url.fragment)  # 'fragment'

# URLs are immutable - modifications return new instances
new_url = url.with_host('api.example.com').with_port(443)
# '/' appends path segments (the query and fragment are dropped)
path_url = url / 'subpath' / 'file.txt'
# '%' updates the query parameters
query_url = url % {'new_param': 'value'}

# Build URLs from components
built_url = URL.build(
    scheme='https',
    host='api.example.com',
    port=8080,
    path='/v1/users',
    query={'limit': 10, 'offset': 0}
)

# Path operations
parent = url.parent
filename = url.name
extension = url.suffix

# Human-readable representation
print(url.human_repr())  # URL with decoded non-ASCII characters
```

## Architecture

YARL provides an immutable `URL` class with component access and manipulation capabilities:

- **Immutable Design**: All URL operations return new instances, so a URL can be shared safely across threads
- **Component Access**: Properties for all URL parts (scheme, host, port, path, query, fragment)
- **Encoding Handling**: Automatic encoding/decoding with proper URL quoting rules
- **Performance Optimization**: LRU caching for encoding/decoding operations with optional Cython extensions
- **Standards Compliance**: Parsing and encoding follow the URL standards
- **MultiDict Integration**: Query parameters handled via multidict for multiple values per key
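
The immutability and multidict points above can be illustrated with a minimal sketch. The URL and the `hits` counter are made up for illustration; only `with_host`, `query`, and the multidict `getall` method are taken from the libraries themselves.

```python
from yarl import URL

original = URL("https://example.com/search?tag=a&tag=b")

# with_host() builds a new URL; the original object is left unchanged.
moved = original.with_host("api.example.com")

# URLs are hashable and compare by value, so they can serve as dict keys.
hits = {original: 1}  # hypothetical counter, for illustration only
hits[URL("https://example.com/search?tag=a&tag=b")] += 1

# The query is a read-only multidict view, so repeated keys keep every value.
first_tag = original.query["tag"]        # first value stored under "tag"
all_tags = original.query.getall("tag")  # every value stored under "tag"
```

Here `original.host` is still `'example.com'` after the call to `with_host`, and `all_tags` holds both values of the repeated `tag` parameter.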

## Capabilities

### Core URL Manipulation

URL construction, parsing, and component access. The `URL` class exposes every URL component as a property and provides methods for URL validation and comparison.

```python { .api }
class URL:
    def __new__(cls, val: str) -> "URL": ...

    @classmethod
    def build(cls, *, scheme: str = "", authority: str = "", user: str | None = None,
              password: str | None = None, host: str = "", port: int | None = None,
              path: str = "", query: Query = None, query_string: str = "",
              fragment: str = "", encoded: bool = False) -> "URL": ...

    # Properties
    scheme: str
    host: str | None
    port: int | None
    path: str
    query: MultiDictProxy[str]
    fragment: str

    # State checking
    def is_absolute(self) -> bool: ...
    def is_default_port(self) -> bool: ...
```

[Core URL Manipulation](./core-url.md)
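
As a rough sketch of how `build` and the state checks fit together, the example below constructs an absolute URL from components and compares it with a relative one. The host and path values are placeholders.

```python
from yarl import URL

# Build an absolute URL from components; integer query values are stringified.
api = URL.build(
    scheme="https",
    host="api.example.com",
    path="/v1/users",
    query={"limit": 10, "offset": 0},
)

# No explicit port was given, so the scheme default for https is reported.
default_port = api.port
uses_default = api.is_default_port()

# A path-only URL is relative: it has no host and no port.
relative = URL("/v1/users")
is_relative = not relative.is_absolute()
```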

### URL Modification Operations

Methods for creating modified versions of URLs by changing individual components. Because URLs are immutable, every method returns a new URL instance.

```python { .api }
def with_scheme(self, scheme: str) -> "URL": ...
def with_host(self, host: str) -> "URL": ...
def with_port(self, port: int | None) -> "URL": ...
def with_path(self, path: str, *, encoded: bool = False,
              keep_query: bool = False, keep_fragment: bool = False) -> "URL": ...
def with_fragment(self, fragment: str | None) -> "URL": ...
```

[URL Modification](./url-modification.md)

### Query Parameter Handling

Comprehensive query string manipulation with support for multiple values per parameter, various input formats, and MultiDict integration for robust parameter handling.

```python { .api }
def with_query(self, query: Query = None, **kwargs: QueryVariable) -> "URL": ...
def extend_query(self, query: Query = None, **kwargs: QueryVariable) -> "URL": ...
def update_query(self, query: Query = None, **kwargs: QueryVariable) -> "URL": ...
def without_query_params(self, *query_params: str) -> "URL": ...

# Type definitions
SimpleQuery = Union[str, SupportsInt, float]
QueryVariable = Union[SimpleQuery, Sequence[SimpleQuery]]
Query = Union[None, str, Mapping[str, QueryVariable], Sequence[tuple[str, QueryVariable]]]
```

[Query Parameter Handling](./query-handling.md)
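
The four methods differ in how they treat parameters that are already present. A small sketch on a placeholder URL, assuming a yarl version that ships `extend_query` and `without_query_params`:

```python
from yarl import URL

url = URL("https://example.com/items?page=1&tag=a")

# with_query() replaces the whole query string.
replaced = url.with_query({"page": 2})

# update_query() overwrites matching keys and keeps the others.
updated = url.update_query({"page": 2})

# extend_query() appends values, so an existing key gains another value.
extended = url.extend_query({"tag": "b"})

# without_query_params() removes the named parameters.
trimmed = url.without_query_params("page")

# A sequence value expands into a repeated parameter.
multi = url.with_query({"tag": ["a", "b"]})
```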

### Path Operations

Path manipulation, normalization, and joining operations for URL path components, including filename and extension handling.

```python { .api }
def joinpath(self, *other: str, encoded: bool = False) -> "URL": ...
def join(self, url: "URL") -> "URL": ...
def with_name(self, name: str, *, keep_query: bool = False,
              keep_fragment: bool = False) -> "URL": ...
def with_suffix(self, suffix: str, *, keep_query: bool = False,
                keep_fragment: bool = False) -> "URL": ...

# Path properties
parent: "URL"
name: str
suffix: str
suffixes: tuple[str, ...]
parts: tuple[str, ...]
```

[Path Operations](./path-operations.md)
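
The path properties behave much like their `pathlib` counterparts. The sketch below uses a placeholder archive URL to show the properties next to the methods that rewrite the last path segment.

```python
from yarl import URL

url = URL("https://example.com/archive/report.tar.gz")

# Read-only views of the path.
name = url.name          # last path segment
suffix = url.suffix      # final extension only
suffixes = url.suffixes  # every extension, in order
parts = url.parts        # path split into segments, with a leading "/"
parent = url.parent      # URL with the last segment removed

# Rewrite the last segment or just its final extension.
renamed = url.with_name("summary.txt")
zipped = url.with_suffix(".zip")

# joinpath() appends segments; join() resolves a relative reference.
nested = parent.joinpath("2024", "q1.csv")
sibling = url.join(URL("notes.txt"))
```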

### Cache Management

Performance optimization through configurable LRU caching for encoding/decoding operations, particularly beneficial for applications processing many URLs.

```python { .api }
def cache_clear() -> None: ...
def cache_info() -> CacheInfo: ...
def cache_configure(*, idna_encode_size: int = 256, idna_decode_size: int = 256,
                    ip_address_size: int = 256, host_validate_size: int = 256,
                    encode_host_size: int = 256) -> None: ...

class CacheInfo(TypedDict):
    idna_encode: _CacheInfo
    idna_decode: _CacheInfo
    ip_address: _CacheInfo
    host_validate: _CacheInfo
    encode_host: _CacheInfo
```

[Cache Management](./cache-management.md)
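
A minimal sketch of the cache functions in use. The size of 1024 is an arbitrary illustrative value, not a recommendation, and the statistics entries are assumed to follow the `functools.lru_cache` layout (`hits`, `misses`, `maxsize`, `currsize`).

```python
from yarl import URL, cache_clear, cache_configure, cache_info

# Enlarge the two IDNA caches; the other sizes keep their defaults.
cache_configure(idna_encode_size=1024, idna_decode_size=1024)

# Do some URL work, then inspect the statistics.
URL("https://example.com/")

stats = cache_info()
for cache_name, info in stats.items():
    print(f"{cache_name}: hits={info.hits} misses={info.misses} "
          f"size={info.currsize}/{info.maxsize}")

# Drop every cached entry, for example between test cases.
cache_clear()
stats_after_clear = cache_info()
```

Note that `cache_configure` changes process-wide state, so it is usually called once at application startup.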

## Error Handling

YARL raises standard Python exceptions for error conditions:

- **TypeError**: When arguments have incorrect types (e.g., non-string URL values)
- **ValueError**: For invalid values such as `float('inf')` or `float('nan')` in query parameters
- **UnicodeError**: During encoding/decoding of internationalized domain names (IDN)

Because these are built-in exception types, they can be handled with ordinary `try`/`except` blocks.
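
As a sketch of the first two cases listed above, the example below wraps `with_query` in a hypothetical helper, `safe_with_query`, which is not part of yarl, and captures the `TypeError` raised for a non-string constructor argument.

```python
from typing import Optional

from yarl import URL


def safe_with_query(url: URL, params) -> Optional[URL]:
    """Hypothetical helper: apply params as the query, or return None if rejected."""
    try:
        return url.with_query(params)
    except (TypeError, ValueError):
        return None


base = URL("https://example.com/search")

# Valid parameters produce a new URL.
ok = safe_with_query(base, {"q": "yarl", "page": 2})

# NaN cannot be represented in a query string, so ValueError is raised and swallowed.
rejected = safe_with_query(base, {"score": float("nan")})

# A non-string constructor argument raises TypeError.
try:
    URL(42)
    constructor_error = None
except TypeError as exc:
    constructor_error = exc
```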