# xxhash

Python binding for the xxHash library providing fast non-cryptographic hash algorithms. xxhash offers hashlib-compliant interfaces for xxh32, xxh64, and xxh3 hash functions with both streaming and oneshot APIs, designed for applications requiring fast checksums, data integrity verification, hash tables, and general-purpose hashing where cryptographic security is not required.

## Package Information

- **Package Name**: xxhash
- **Language**: Python
- **Installation**: `pip install xxhash` or `conda install -c conda-forge python-xxhash`

## Core Imports

```python
import xxhash
```

For specific algorithms:

```python
from xxhash import xxh32, xxh64, xxh3_64, xxh3_128
```

## Basic Usage

```python
import xxhash

# Using streaming interface (hashlib-compliant)
hasher = xxhash.xxh64()
hasher.update(b'Hello')
hasher.update(b' World')
print(hasher.hexdigest())  # 16-character lowercase hex string

# Using oneshot functions (more efficient)
hash_value = xxhash.xxh64_hexdigest(b'Hello World')
print(hash_value)  # same digest as the streaming result above

# With seed for different hash values
hash_with_seed = xxhash.xxh64_hexdigest(b'Hello World', seed=42)
print(hash_with_seed)  # differs from the unseeded digest

# Different output formats
print(xxhash.xxh64_digest(b'test'))     # bytes (8 bytes, big-endian)
print(xxhash.xxh64_hexdigest(b'test'))  # str (16 hex characters)
print(xxhash.xxh64_intdigest(b'test'))  # int (unsigned 64-bit)
```

## Capabilities

### Streaming Hash Classes

Hashlib-compliant streaming hash interfaces that allow incremental data processing.

```python { .api }
from typing_extensions import final

@final
class xxh32:
    """32-bit xxHash streaming hasher."""
    def __init__(self, input: InputType = None, seed: int = 0) -> None: ...
    def update(self, input: InputType) -> None: ...
    def digest(self) -> bytes: ...
    def hexdigest(self) -> str: ...
    def intdigest(self) -> int: ...
    def copy(self) -> xxh32: ...
    def reset(self) -> None: ...

    @property
    def digest_size(self) -> int: ...  # Returns 4
    @property
    def digestsize(self) -> int: ...  # Alias for digest_size
    @property
    def block_size(self) -> int: ...  # Returns 16
    @property
    def name(self) -> str: ...  # Returns "xxh32"
    @property
    def seed(self) -> int: ...  # Seed value used

@final
class xxh64:
    """64-bit xxHash (XXH64) streaming hasher."""
    def __init__(self, input: InputType = None, seed: int = 0) -> None: ...
    def update(self, input: InputType) -> None: ...
    def digest(self) -> bytes: ...
    def hexdigest(self) -> str: ...
    def intdigest(self) -> int: ...
    def copy(self) -> xxh64: ...
    def reset(self) -> None: ...

    @property
    def digest_size(self) -> int: ...  # Returns 8
    @property
    def digestsize(self) -> int: ...  # Alias for digest_size
    @property
    def block_size(self) -> int: ...  # Internal block size in bytes
    @property
    def name(self) -> str: ...  # Algorithm name
    @property
    def seed(self) -> int: ...  # Seed value used

@final
class xxh3_64:
    """64-bit XXH3 streaming hasher."""
    def __init__(self, input: InputType = None, seed: int = 0) -> None: ...
    def update(self, input: InputType) -> None: ...
    def digest(self) -> bytes: ...
    def hexdigest(self) -> str: ...
    def intdigest(self) -> int: ...
    def copy(self) -> xxh3_64: ...
    def reset(self) -> None: ...

    @property
    def digest_size(self) -> int: ...  # Returns 8
    @property
    def digestsize(self) -> int: ...  # Alias for digest_size
    @property
    def block_size(self) -> int: ...  # Returns 16
    @property
    def name(self) -> str: ...  # Returns "xxh3_64"
    @property
    def seed(self) -> int: ...  # Seed value used

@final
class xxh3_128:
    """128-bit XXH3 streaming hasher."""
    def __init__(self, input: InputType = None, seed: int = 0) -> None: ...
    def update(self, input: InputType) -> None: ...
    def digest(self) -> bytes: ...
    def hexdigest(self) -> str: ...
    def intdigest(self) -> int: ...
    def copy(self) -> xxh3_128: ...
    def reset(self) -> None: ...

    @property
    def digest_size(self) -> int: ...  # Returns 16
    @property
    def digestsize(self) -> int: ...  # Alias for digest_size
    @property
    def block_size(self) -> int: ...  # Returns 16
    @property
    def name(self) -> str: ...  # Returns "xxh3_128"
    @property
    def seed(self) -> int: ...  # Seed value used

# xxh64 implements the XXH64 algorithm and is not an alias for xxh3_64;
# the two produce different digests for the same input.

# Alias pointing to the main class
xxh128 = xxh3_128  # xxh128 is an alias for xxh3_128
```

### Oneshot Hash Functions

Fast oneshot functions that avoid allocating hasher state on the heap, optimized for single-use hashing.

```python { .api }
# xxh32 oneshot functions
def xxh32_digest(args: InputType, seed: int = 0) -> bytes:
    """Return xxh32 hash as bytes (big-endian)."""

def xxh32_hexdigest(args: InputType, seed: int = 0) -> str:
    """Return xxh32 hash as lowercase hex string."""

def xxh32_intdigest(args: InputType, seed: int = 0) -> int:
    """Return xxh32 hash as integer."""

# xxh64 oneshot functions (XXH64 algorithm, distinct from xxh3_64)
def xxh64_digest(args: InputType, seed: int = 0) -> bytes:
    """Return xxh64 hash as bytes (big-endian)."""

def xxh64_hexdigest(args: InputType, seed: int = 0) -> str:
    """Return xxh64 hash as lowercase hex string."""

def xxh64_intdigest(args: InputType, seed: int = 0) -> int:
    """Return xxh64 hash as integer."""

# xxh3_64 oneshot functions
def xxh3_64_digest(args: InputType, seed: int = 0) -> bytes:
    """Return xxh3_64 hash as bytes (big-endian)."""

def xxh3_64_hexdigest(args: InputType, seed: int = 0) -> str:
    """Return xxh3_64 hash as lowercase hex string."""

def xxh3_64_intdigest(args: InputType, seed: int = 0) -> int:
    """Return xxh3_64 hash as integer."""

# xxh3_128 oneshot functions
def xxh3_128_digest(args: InputType, seed: int = 0) -> bytes:
    """Return xxh3_128 hash as bytes (big-endian)."""

def xxh3_128_hexdigest(args: InputType, seed: int = 0) -> str:
    """Return xxh3_128 hash as lowercase hex string."""

def xxh3_128_intdigest(args: InputType, seed: int = 0) -> int:
    """Return xxh3_128 hash as integer."""

# xxh128 oneshot functions (aliases for xxh3_128)
def xxh128_digest(args: InputType, seed: int = 0) -> bytes:
    """Return xxh128 hash as bytes (big-endian)."""

def xxh128_hexdigest(args: InputType, seed: int = 0) -> str:
    """Return xxh128 hash as lowercase hex string."""

def xxh128_intdigest(args: InputType, seed: int = 0) -> int:
    """Return xxh128 hash as integer."""
```

### Package Metadata

Version information and available algorithms.

```python { .api }
VERSION: str  # Package version (e.g., "3.5.0")
VERSION_TUPLE: tuple[int, ...]  # Version as tuple (e.g., (3, 5, 0))
XXHASH_VERSION: str  # Underlying xxHash library version
algorithms_available: set[str]  # {"xxh32", "xxh64", "xxh3_64", "xxh128", "xxh3_128"}
```

## Types

```python { .api }
from typing import Union
import array

# Input types accepted by all hash functions
InputType = Union[str, bytes, bytearray, memoryview, array.ArrayType[int]]
```

## Usage Examples

### Streaming vs Oneshot Performance

```python
import xxhash

# Streaming interface - use when data comes in chunks
hasher = xxhash.xxh64()
with open('large_file.dat', 'rb') as f:
    for chunk in iter(lambda: f.read(8192), b""):
        hasher.update(chunk)
result = hasher.hexdigest()

# Oneshot interface - faster for single operations
with open('small_file.dat', 'rb') as f:
    result = xxhash.xxh64_hexdigest(f.read())
```

### Different Hash Algorithms

```python
import xxhash

data = b"test data"

# 32-bit hash (4 bytes output)
hash32 = xxhash.xxh32_hexdigest(data)

# 64-bit hash (8 bytes output) - recommended for general use
hash64 = xxhash.xxh64_hexdigest(data)

# 128-bit hash (16 bytes output) - for applications needing lower collision rate
hash128 = xxhash.xxh128_hexdigest(data)
```

### Seed Usage for Different Hash Families

```python
import xxhash

data = b"same input data"

# Different seeds produce different hashes
hash_a = xxhash.xxh64_hexdigest(data, seed=0)
hash_b = xxhash.xxh64_hexdigest(data, seed=42)
hash_c = xxhash.xxh64_hexdigest(data, seed=12345)

# All three hashes will be different despite same input
```

### Output Format Comparison

```python
import xxhash

data = b"example"
hasher = xxhash.xxh64(data)

# Three output formats for the same hash
bytes_result = hasher.digest()    # bytes (8 bytes for xxh64)
hex_result = hasher.hexdigest()   # str (16 hex chars for xxh64)
int_result = hasher.intdigest()   # int (unsigned 64-bit for xxh64)

# Verify they represent the same value
assert int_result.to_bytes(8, 'big') == bytes_result
assert format(int_result, '016x') == hex_result
```

## Important Notes

### Seed Constraints

- **xxh32**: Accepts unsigned 32-bit integers (0 to 2^32-1)
- **xxh64/xxh3_64**: Accepts unsigned 64-bit integers (0 to 2^64-1)
- **xxh3_128**: Accepts unsigned 64-bit integers (0 to 2^64-1)

### Output Format

All `digest()` methods return **big-endian** byte representation (changed from little-endian in version 0.3.0).

### Performance

Oneshot functions (`xxh64_hexdigest`, etc.) are faster than streaming equivalents for single-use hashing as they avoid heap allocation.

### Non-Cryptographic

xxhash algorithms are **not cryptographically secure** and should not be used for security purposes, password hashing, or HMAC applications.