# Simple Hash Functions

Direct hash computation functions that provide immediate results with various output formats and architecture optimizations. These functions are ideal for one-time hashing operations where you don't need streaming capabilities.

## Capabilities

### 32-bit Hash Function

Computes a 32-bit MurmurHash3 hash value from the input data.

```python { .api }
def hash(key: StrHashable, seed: int = 0, signed: bool = True) -> int:
    """
    Compute 32-bit MurmurHash3 hash.

    Args:
        key: Input data to hash (str, bytes, bytearray, memoryview, or array-like)
        seed: Seed value for hash computation (default: 0)
        signed: Return signed integer if True, unsigned if False (default: True)

    Returns:
        32-bit hash value as signed or unsigned integer
    """
```

Example usage:
```python
import mmh3

# Basic hashing
result = mmh3.hash("foo")  # -156908512

# With custom seed
result = mmh3.hash("foo", seed=42)  # -1322301282

# Unsigned output
result = mmh3.hash("foo", signed=False)  # 4138058784

# Hash bytes
result = mmh3.hash(b"hello")  # hash bytes directly

# Hash array-like objects
import numpy as np
arr = np.array([1, 2, 3, 4], dtype=np.int8)
result = mmh3.hash(arr)
```
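
The `signed` flag only changes how the same 32 bits are interpreted. As a minimal sketch (the helpers `to_unsigned32` and `to_signed32` are hypothetical and not part of mmh3), the two forms can be converted with plain integer arithmetic, checked here against the `"foo"` values documented in this section:

```python
MASK32 = 0xFFFFFFFF  # 2**32 - 1


def to_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as unsigned."""
    return value & MASK32


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit integer as signed (two's complement)."""
    value &= MASK32
    return value - (1 << 32) if value >= (1 << 31) else value


# Values taken from the mmh3.hash("foo") examples in this section
signed_result = -156908512
unsigned_result = 4138058784

print(to_unsigned32(signed_result) == unsigned_result)  # True
print(to_signed32(unsigned_result) == signed_result)    # True
```

This can be handy when one system stores hashes in a signed column and another expects unsigned values.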

### Buffer-based 32-bit Hash

Computes 32-bit hash without memory copying, optimized for large memory views and arrays.

```python { .api }
def hash_from_buffer(key: StrHashable, seed: int = 0, signed: bool = True) -> int:
    """
    Compute 32-bit MurmurHash3 hash without memory copying.

    Args:
        key: Input data to hash (str, bytes, bytearray, memoryview, or array-like)
        seed: Seed value for hash computation (default: 0)
        signed: Return signed integer if True, unsigned if False (default: True)

    Returns:
        32-bit hash value as signed or unsigned integer
    """
```

Example usage:
```python
import mmh3
import numpy as np

# Efficient hashing of large arrays
large_array = np.random.rand(1000000)
result = mmh3.hash_from_buffer(large_array)  # value varies with the random data

# Memory-efficient hashing
memview = memoryview(b"large data chunk" * 10000)
result = mmh3.hash_from_buffer(memview, signed=False)  # unsigned 32-bit integer
```
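
To illustrate what "without memory copying" relies on, the sketch below uses only the standard library to show that a `memoryview` shares storage with the object it wraps, so slicing it creates no new copy of the data. It does not call mmh3, and the variable names are illustrative.

```python
data = bytearray(b"large data chunk" * 4)
view = memoryview(data)

# Slicing a memoryview yields another view onto the same buffer, not a copy
window = view[6:10]
print(bytes(window))  # b'data'

# A same-length write to the underlying bytearray is visible through the view
data[6:10] = b"DATA"
print(bytes(window))  # b'DATA'

# The slice still refers to the original object
print(window.obj is data)  # True
```

Passing such a view to a buffer-based function lets it read the bytes in place instead of building an intermediate `bytes` object first.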

### 64-bit Hash Function

Computes a 128-bit MurmurHash3 hash and returns it as a tuple of two 64-bit integers.

```python { .api }
def hash64(key: StrHashable, seed: int = 0, x64arch: bool = True, signed: bool = True) -> tuple[int, int]:
    """
    Compute a 128-bit MurmurHash3 hash, returned as two 64-bit integers.

    Args:
        key: Input data to hash (str, bytes, bytearray, memoryview, or array-like)
        seed: Seed value for hash computation (default: 0)
        x64arch: Use x64 optimization if True, x86 if False (default: True)
        signed: Return signed integers if True, unsigned if False (default: True)

    Returns:
        Tuple of two 64-bit hash values as signed or unsigned integers
    """
```

Example usage:
```python
import mmh3

# Basic 64-bit hashing
result = mmh3.hash64("foo")  # (-2129773440516405919, 9128664383759220103)

# Unsigned 64-bit hash
result = mmh3.hash64("foo", signed=False)  # (16316970633193145697, 9128664383759220103)

# With x86 architecture optimization
result = mmh3.hash64("foo", x64arch=False)  # Different result optimized for x86

# With custom seed and architecture
result = mmh3.hash64("foo", seed=42, x64arch=True)  # (-840311307571801102, -6739155424061121879)
```
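
Because `hash64` returns the two halves of a 128-bit digest, the halves can be recombined into a single integer. Judging from the example values in this document, with the default `x64arch=True` the first element holds the low 64 bits and the second the high 64 bits; treat that ordering as an observation from those examples rather than a documented guarantee. The helper `combine_halves` is hypothetical and not part of mmh3.

```python
MASK64 = (1 << 64) - 1


def combine_halves(first: int, second: int) -> int:
    """Join two 64-bit halves (signed or unsigned) into one unsigned 128-bit integer.

    Assumes `first` holds the low 64 bits and `second` the high 64 bits.
    """
    return ((second & MASK64) << 64) | (first & MASK64)


# Example values from this document: mmh3.hash64("foo", seed=42)
halves = (-840311307571801102, -6739155424061121879)
combined = combine_halves(*halves)

# Matches the value this document lists for mmh3.hash128("foo", seed=42)
print(combined == 215966891540331383248189432718888555506)  # True
```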

### 128-bit Hash Function

Computes a 128-bit MurmurHash3 hash value returned as a single large integer.

```python { .api }
def hash128(key: StrHashable, seed: int = 0, x64arch: bool = True, signed: bool = False) -> int:
    """
    Compute 128-bit MurmurHash3 hash.

    Args:
        key: Input data to hash (str, bytes, bytearray, memoryview, or array-like)
        seed: Seed value for hash computation (default: 0)
        x64arch: Use x64 optimization if True, x86 if False (default: True)
        signed: Return signed integer if True, unsigned if False (default: False)

    Returns:
        128-bit hash value as signed or unsigned integer
    """
```

Example usage:
```python
import mmh3

# Basic 128-bit hashing (unsigned by default)
result = mmh3.hash128("foo")  # Large 128-bit unsigned integer

# With custom seed
result = mmh3.hash128("foo", seed=42)  # 215966891540331383248189432718888555506

# Signed 128-bit hash
result = mmh3.hash128("foo", seed=42, signed=True)  # -124315475380607080215185174712879655950

# x86 architecture optimization
result = mmh3.hash128("foo", x64arch=False)  # Optimized for x86
```
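
A 128-bit integer is often stored as a fixed-width hex string. The sketch below uses only built-in formatting on the `seed=42` value documented in this section, and also shows how the signed form relates to the unsigned one. The variable names are illustrative.

```python
# Documented result of mmh3.hash128("foo", seed=42)
value = 215966891540331383248189432718888555506

# Zero-pad to 32 hex digits so every 128-bit value has the same width
hex_digest = f"{value:032x}"
print(len(hex_digest))               # 32
print(int(hex_digest, 16) == value)  # True

# Two's-complement reinterpretation, as returned with signed=True
signed_value = value - (1 << 128) if value >= (1 << 127) else value
print(signed_value)  # -124315475380607080215185174712879655950
```

This string is written most-significant digit first. Based on the example values in this document, `hash_bytes(...).hex()` appears to list the same 16 bytes in little-endian order, so the two hex strings will generally differ.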

### Hash as Bytes

Computes 128-bit hash and returns the result as raw bytes.

```python { .api }
def hash_bytes(key: StrHashable, seed: int = 0, x64arch: bool = True) -> bytes:
    """
    Compute 128-bit MurmurHash3 hash returned as bytes.

    Args:
        key: Input data to hash (str, bytes, bytearray, memoryview, or array-like)
        seed: Seed value for hash computation (default: 0)
        x64arch: Use x64 optimization if True, x86 if False (default: True)

    Returns:
        128-bit hash value as 16-byte bytes object
    """
```

Example usage:
```python
import mmh3

# Hash as bytes
result = mmh3.hash_bytes("foo")  # b'aE\xf5\x01W\x86q\xe2\x87}\xba+\xe4\x87\xaf~'

# With custom seed
result = mmh3.hash_bytes("foo", seed=42)  # 16 bytes

# Convert to hex string if needed
hex_result = mmh3.hash_bytes("foo").hex()  # '6145f501578671e2877dba2be487af7e'

# Hash large numpy arrays efficiently
import numpy as np
large_array = np.zeros(2**20, dtype=np.int8)  # 1MB array
result = mmh3.hash_bytes(large_array)  # b'V\x8f}\xad\x8eNM\xa84\x07FU\x9c\xc4\xcc\x8e'
```
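
The 16-byte digest can be unpacked back into integers with the standard library. This sketch starts from the documented `hash_bytes("foo")` output and reads it as two little-endian unsigned 64-bit values, which reproduces the tuple this document lists for `hash64("foo", signed=False)`. The little-endian layout is inferred from those example values, not from a stated guarantee.

```python
import struct

# Documented output of mmh3.hash_bytes("foo") in this section
digest = bytes.fromhex("6145f501578671e2877dba2be487af7e")

# Read the digest as two little-endian unsigned 64-bit integers
low, high = struct.unpack("<QQ", digest)
print((low, high))  # (16316970633193145697, 9128664383759220103)

# The same 16 bytes read as one little-endian unsigned 128-bit integer
as_int = int.from_bytes(digest, "little")
print(as_int == ((high << 64) | low))  # True
```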

## Architecture Optimization

The `x64arch` parameter of `hash64`, `hash128`, and `hash_bytes` selects which 128-bit MurmurHash3 variant is used:

- **`x64arch=True`** (default): Variant optimized for 64-bit architectures
- **`x64arch=False`**: Variant optimized for 32-bit architectures

The two variants return different hash values for the same input, so the setting affects results as well as speed. Choose the one that matches your target platform, and keep it fixed wherever hashes are stored or compared.

## Input Type Support

All functions accept these input types:

- **`str`**: Unicode strings (automatically encoded to UTF-8)
- **`bytes`**: Raw byte data
- **`bytearray`**: Mutable byte arrays
- **`memoryview`**: Memory views for zero-copy operations
- **Array-like objects**: NumPy arrays and other objects that expose the buffer protocol
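
Since `str` input is encoded to UTF-8 before hashing, the bytes that get hashed can differ in length from the string itself. The standard-library sketch below shows the encoding step on its own; it does not call mmh3. Encoding explicitly is a reasonable choice when another system hashes the same text and the byte representation must match exactly.

```python
text = "h\u00e9llo"  # "héllo", with a precomposed e-acute

# UTF-8 uses two bytes for the accented character
encoded = text.encode("utf-8")
print(encoded)                   # b'h\xc3\xa9llo'
print(len(text), len(encoded))   # 5 6

# A different encoding yields different bytes, and so a different hash input
print(text.encode("latin-1") == encoded)  # False
```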

## Seed Values

- Seeds must be 32-bit integers (0 to 2^32 - 1)
- Negative seeds are automatically converted to unsigned 32-bit representation
- Seeds exceeding 32-bit range may produce unexpected results
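
One way to avoid surprises is to bring a seed into the unsigned 32-bit range before passing it on. The helper `normalize_seed` below is hypothetical, not part of mmh3, and assumes that wrap-around (rather than an error) is the behavior you want for out-of-range values.

```python
def normalize_seed(seed: int) -> int:
    """Map any integer onto the unsigned 32-bit range 0 .. 2**32 - 1."""
    return seed & 0xFFFFFFFF


print(normalize_seed(42))          # 42
print(normalize_seed(-1))          # 4294967295
print(normalize_seed(2**32 + 5))   # 5
```

If silently wrapping a too-large seed would hide a bug in your code, raising an error for values outside the range is the safer alternative.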

## Error Handling

Functions raise `TypeError` for invalid input types and handle memory allocation failures gracefully. All functions are thread-safe and can be used in concurrent environments.