Library of web-related functions for HTML manipulation, HTTP processing, URL handling, and encoding detection
HTTP header processing utilities for converting between raw header formats and dictionaries, plus HTTP Basic Authentication header generation. These functions handle the complexities of HTTP header parsing and formatting according to HTTP specifications.
Convert between raw HTTP headers (multi-line byte strings) and structured dictionaries for easy manipulation.
def headers_raw_to_dict(headers_raw):
    """
    Convert raw HTTP headers to dictionary format.

    Args:
        headers_raw (bytes|None): Raw headers as multi-line byte string

    Returns:
        HeadersDictOutput|None: Dictionary mapping header names to lists of values,
            or None if input is None
    """

def headers_dict_to_raw(headers_dict):
    """
    Convert headers dictionary to raw HTTP format.

    Args:
        headers_dict (HeadersDictInput|None): Dictionary mapping header names to values

    Returns:
        bytes|None: Raw headers formatted for HTTP transmission,
            or None if input is None
    """

Usage Examples:
from w3lib.http import headers_raw_to_dict, headers_dict_to_raw
# Parse raw headers
raw = b"Content-Type: text/html\r\nAccept: gzip\r\nAccept: deflate"
headers = headers_raw_to_dict(raw)
# Returns: {b'Content-Type': [b'text/html'], b'Accept': [b'gzip', b'deflate']}
# Convert back to raw format
headers_dict = {b'Content-Type': b'text/html', b'Accept': [b'gzip', b'deflate']}
raw_headers = headers_dict_to_raw(headers_dict)
# Returns: b'Content-Type: text/html\r\nAccept: gzip\r\nAccept: deflate'
# Handle None input gracefully
headers_raw_to_dict(None) # Returns: None
headers_dict_to_raw(None) # Returns: None
Generate HTTP Basic Authentication header values according to RFC 2617.
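As a rough illustration of what a Basic credential looks like, the sketch below builds one using only the standard library. The helper name `sketch_basic_auth` is hypothetical and is not part of w3lib; it assumes `str` inputs and the standard Base64 alphabet, so the library's own implementation may differ in such details.

```python
import base64


def sketch_basic_auth(username: str, password: str, encoding: str = "ISO-8859-1") -> bytes:
    """Hypothetical stand-in for illustration; not w3lib's implementation."""
    # Join the credentials with a single colon, as the Basic scheme requires.
    credentials = f"{username}:{password}"
    # Encode the text to bytes, then Base64-encode those bytes.
    token = base64.b64encode(credentials.encode(encoding))
    # Prefix the scheme name to form the complete Authorization header value.
    return b"Basic " + token


header_value = sketch_basic_auth("user", "password")
print(header_value)  # b'Basic dXNlcjpwYXNzd29yZA=='

# Decoding the token recovers the original "username:password" pair.
decoded = base64.b64decode(header_value.split(b" ", 1)[1]).decode("ISO-8859-1")
print(decoded)  # user:password
```

The encoding argument only changes the result for non-ASCII credentials: a character such as "é" is one byte in ISO-8859-1 and two bytes in UTF-8, so the Base64 tokens differ.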
def basic_auth_header(username, password, encoding='ISO-8859-1'):
    """
    Generate HTTP Basic Authentication header value.

    Args:
        username (str|bytes): Username for authentication
        password (str|bytes): Password for authentication
        encoding (str): Character encoding for credentials (default: 'ISO-8859-1')

    Returns:
        bytes: Authorization header value formatted as 'Basic <base64-encoded-credentials>'
    """

Usage Examples:
from w3lib.http import basic_auth_header
# Generate basic auth header
auth = basic_auth_header('user', 'password')
# Returns: b'Basic dXNlcjpwYXNzd29yZA=='
# Use in HTTP request
import requests
headers = {'Authorization': auth}
response = requests.get('https://api.example.com', headers=headers)
# Handle different encodings
auth_latin1 = basic_auth_header('user', 'password', encoding='ISO-8859-1')
auth_utf8 = basic_auth_header('user', 'password', encoding='UTF-8')
# Input type for headers dictionary - flexible value types
HeadersDictInput = Mapping[bytes, Union[Any, Sequence[bytes]]]
# Output type for headers dictionary - normalized to lists
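HeadersDictOutput = MutableMapping[bytes, list[bytes]]
Raw headers are expected as byte strings in which lines are separated by \r\n (CRLF) and each header name is separated from its value by ": " (colon followed by space).
Header dictionaries use byte-string keys; output dictionaries map each key to a list of byte-string values.
The basic_auth_header function joins the username and password with a : separator, Base64-encodes the result, and prepends Basic to create the complete header value.
The two conversion functions handle None input by returning None.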
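The raw header and dictionary conventions described above can be sketched in plain Python. The two helpers below, `parse_raw_headers` and `serialize_headers`, are hypothetical names written for illustration and are not w3lib's code; they assume well-formed input and, as a further assumption, skip any line that has no colon.

```python
from typing import Dict, List, Optional


def parse_raw_headers(raw: Optional[bytes]) -> Optional[Dict[bytes, List[bytes]]]:
    """Hypothetical sketch: split raw header lines into a name -> values dict."""
    if raw is None:
        return None
    result: Dict[bytes, List[bytes]] = {}
    for line in raw.split(b"\r\n"):
        # Split on the first colon only, so values may themselves contain colons.
        name, sep, value = line.partition(b":")
        if not sep:
            continue  # Assumption: silently skip lines that have no colon.
        # Repeated header names accumulate their values in order.
        result.setdefault(name.strip(), []).append(value.strip())
    return result


def serialize_headers(headers: Optional[Dict[bytes, List[bytes]]]) -> Optional[bytes]:
    """Hypothetical sketch: emit one 'Name: value' line per value, joined by CRLF."""
    if headers is None:
        return None
    lines = [name + b": " + value for name, values in headers.items() for value in values]
    return b"\r\n".join(lines)


raw = b"Content-Type: text/html\r\nAccept: gzip\r\nAccept: deflate"
parsed = parse_raw_headers(raw)
print(parsed)
# {b'Content-Type': [b'text/html'], b'Accept': [b'gzip', b'deflate']}
print(serialize_headers(parsed) == raw)  # True
```

Keeping every value in a list, even when a header appears once, means callers never have to branch on whether a header was repeated.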