# Hash Generation

The core perceptual hashing functions analyze image structure and content to produce compact fingerprints. Each algorithm has different strengths and suits particular kinds of image comparisons and transformations.

## Capabilities

### Average Hash

Computes a hash from the average pixel luminance. Fast, and effective at matching images across basic transformations such as scaling and format changes.

```python { .api }
def average_hash(image, hash_size=8, mean=numpy.mean):
    """
    Average Hash computation following hackerfactor algorithm.

    Args:
        image (PIL.Image.Image): Input image to hash
        hash_size (int): Hash size, must be >= 2 (default: 8)
        mean (callable): Function to compute average luminance (default: numpy.mean)

    Returns:
        ImageHash: Hash object representing the image

    Raises:
        ValueError: If hash_size < 2
    """
```

**Usage Example:**

```python
from PIL import Image
import imagehash
import numpy as np

image = Image.open('photo.jpg')

# Standard average hash
hash1 = imagehash.average_hash(image)

# Custom hash size for more precision
hash2 = imagehash.average_hash(image, hash_size=16)

# Using median instead of mean
hash3 = imagehash.average_hash(image, mean=np.median)
```
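
To make the thresholding step concrete, here is a minimal pure-Python sketch of the average-hash idea. It assumes the image has already been converted to grayscale and downscaled to a small grid. `average_hash_bits` is a hypothetical helper written for illustration, not part of imagehash, and it is not the library's implementation.

```python
from statistics import mean, median


def average_hash_bits(pixels, average=mean):
    """Return one bit per pixel: 1 if the pixel is brighter than the average.

    `pixels` is a list of rows of grayscale values (an already downscaled image).
    `average` plays the role of the `mean` parameter described above.
    """
    flat = [value for row in pixels for value in row]
    threshold = average(flat)
    return [[1 if value > threshold else 0 for value in row] for row in pixels]


# A 2x2 "image": bright on the left, dark on the right.
grid = [
    [200, 10],
    [180, 30],
]
bits = average_hash_bits(grid)
print(bits)  # [[1, 0], [1, 0]]
```

Swapping `average=median` changes only the threshold, which can matter when a few very bright or very dark pixels skew the mean.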

### Perceptual Hash (pHash)

Uses the Discrete Cosine Transform (DCT) to analyze the image in the frequency domain. Robust to scaling, minor modifications, and gamma adjustments.

```python { .api }
def phash(image, hash_size=8, highfreq_factor=4):
    """
    Perceptual Hash computation using DCT.

    Args:
        image (PIL.Image.Image): Input image to hash
        hash_size (int): Hash size, must be >= 2 (default: 8)
        highfreq_factor (int): High frequency scaling factor (default: 4)

    Returns:
        ImageHash: Hash object representing the image

    Raises:
        ValueError: If hash_size < 2
    """
```
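
The following is a rough pure-Python sketch of the general DCT-hash technique: transform a small grayscale grid, keep the low-frequency corner, and threshold it at its median. It is an illustration under stated assumptions, not imagehash's implementation. The helpers `dct_2d` and `dct_hash_bits` are hypothetical, the DCT is naive and unnormalized, and the input is assumed to be already grayscale, square, and downscaled.

```python
import math
from statistics import median


def dct_2d(pixels):
    """Naive, unnormalized 2D DCT-II of a square grid (O(n^4), fine for tiny inputs)."""
    n = len(pixels)
    return [
        [
            sum(
                pixels[y][x]
                * math.cos(math.pi * (2 * y + 1) * u / (2 * n))
                * math.cos(math.pi * (2 * x + 1) * v / (2 * n))
                for y in range(n)
                for x in range(n)
            )
            for v in range(n)
        ]
        for u in range(n)
    ]


def dct_hash_bits(pixels, hash_size):
    """Keep the low-frequency corner of the DCT and threshold it at its median."""
    coefficients = dct_2d(pixels)
    low = [row[:hash_size] for row in coefficients[:hash_size]]
    threshold = median(value for row in low for value in row)
    return [[1 if value > threshold else 0 for value in row] for row in low]


# An 8x8 horizontal gradient, hashed down to 2x2 = 4 bits.
gradient = [[x * 10 for x in range(8)] for _ in range(8)]
bits = dct_hash_bits(gradient, hash_size=2)
print(bits)
```

Because every coefficient and the median scale together, multiplying all pixel values by a constant leaves the bits unchanged, which is one intuition for why this family of hashes tolerates brightness and contrast scaling.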

### Simplified Perceptual Hash

A simplified version of the perceptual hash that processes the DCT output differently.

```python { .api }
def phash_simple(image, hash_size=8, highfreq_factor=4):
    """
    Simplified Perceptual Hash computation.

    Args:
        image (PIL.Image.Image): Input image to hash
        hash_size (int): Hash size (default: 8)
        highfreq_factor (int): High frequency scaling factor (default: 4)

    Returns:
        ImageHash: Hash object representing the image
    """
```

### Difference Hash (dHash)

Compares the brightness of horizontally adjacent pixels. Sensitive to rotation, but good at detecting structural changes.

```python { .api }
def dhash(image, hash_size=8):
    """
    Difference Hash computation using horizontal pixel differences.

    Args:
        image (PIL.Image.Image): Input image to hash
        hash_size (int): Hash size, must be >= 2 (default: 8)

    Returns:
        ImageHash: Hash object representing the image

    Raises:
        ValueError: If hash_size < 2
    """
```

### Vertical Difference Hash

A variant of the difference hash that compares vertically adjacent pixels instead of horizontally adjacent ones.

```python { .api }
def dhash_vertical(image, hash_size=8):
    """
    Difference Hash computation using vertical pixel differences.

    Args:
        image (PIL.Image.Image): Input image to hash
        hash_size (int): Hash size (default: 8)

    Returns:
        ImageHash: Hash object representing the image
    """
```
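
The two difference hashes differ only in the direction of comparison. The sketch below shows both on a tiny grayscale grid. It is a pure-Python illustration of the idea, assuming the image is already grayscale and downscaled; `dhash_bits` and `dhash_vertical_bits` are hypothetical helpers, not the library's functions.

```python
def dhash_bits(pixels):
    """Horizontal differences: 1 where a pixel is brighter than its left neighbor."""
    return [
        [1 if row[x + 1] > row[x] else 0 for x in range(len(row) - 1)]
        for row in pixels
    ]


def dhash_vertical_bits(pixels):
    """Vertical differences: 1 where a pixel is brighter than the one above it."""
    return [
        [1 if below[x] > above[x] else 0 for x in range(len(above))]
        for above, below in zip(pixels, pixels[1:])
    ]


grid = [
    [10, 20, 15],
    [30, 25, 40],
    [5, 50, 45],
]
print(dhash_bits(grid))           # [[1, 0], [0, 1], [1, 0]]
print(dhash_vertical_bits(grid))  # [[1, 1, 1], [0, 1, 1]]
```

Each comparison consumes two pixels, so the horizontal variant yields one fewer column than the input and the vertical variant one fewer row.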

### Wavelet Hash (wHash)

Uses a wavelet transform for frequency analysis. The wavelet mode and the image scale are configurable.

```python { .api }
def whash(image, hash_size=8, image_scale=None, mode='haar', remove_max_haar_ll=True):
    """
    Wavelet Hash computation using PyWavelets.

    Args:
        image (PIL.Image.Image): Input image to hash
        hash_size (int): Hash size, must be a power of 2 (default: 8)
        image_scale (int, optional): Image scale, must be a power of 2. Auto-calculated if None
        mode (str): Wavelet mode - 'haar' or 'db4' (default: 'haar')
        remove_max_haar_ll (bool): Remove lowest frequency using Haar wavelet (default: True)

    Returns:
        ImageHash: Hash object representing the image

    Raises:
        AssertionError: If hash_size or image_scale is not a power of 2
        AssertionError: If hash_size is out of range relative to image_scale
    """
```

**Usage Example:**

```python
from PIL import Image
import imagehash

image = Image.open('photo.jpg')

# Standard wavelet hash with Haar wavelets
hash1 = imagehash.whash(image)

# Using Daubechies wavelets
hash2 = imagehash.whash(image, mode='db4')

# Custom hash size and image scale
hash3 = imagehash.whash(image, hash_size=16, image_scale=128)

# Disable low frequency removal
hash4 = imagehash.whash(image, remove_max_haar_ll=False)
```
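
Since invalid sizes surface as an `AssertionError`, it can help to check parameters up front. Below is a small sketch of such a check. `is_power_of_two` and `check_whash_params` are hypothetical helpers, and the range rule is an assumption: the docstring above only says `hash_size` must be in range relative to `image_scale`, which is read here as `hash_size <= image_scale`.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... using the single-set-bit trick."""
    return n > 0 and (n & (n - 1)) == 0


def check_whash_params(hash_size, image_scale):
    """Return a list of problems with a (hash_size, image_scale) pair; empty means OK.

    Assumption: "out of range" is taken to mean hash_size > image_scale.
    """
    problems = []
    if not is_power_of_two(hash_size):
        problems.append("hash_size is not a power of 2")
    if not is_power_of_two(image_scale):
        problems.append("image_scale is not a power of 2")
    if hash_size > image_scale:
        problems.append("hash_size exceeds image_scale")
    return problems


print(check_whash_params(16, 128))  # []
print(check_whash_params(12, 128))  # ['hash_size is not a power of 2']
```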

### Color Hash

Analyzes the color distribution in HSV space rather than structural features. Effective for detecting color-based similarity.

```python { .api }
def colorhash(image, binbits=3):
    """
    Color Hash computation based on HSV color distribution.

    Computes fractions of image in intensity, hue and saturation bins:
    - First binbits encode black fraction of image
    - Next binbits encode gray fraction (low saturation)
    - Next 6*binbits encode highly saturated parts in 6 hue bins
    - Next 6*binbits encode mildly saturated parts in 6 hue bins

    Args:
        image (PIL.Image.Image): Input image to hash
        binbits (int): Number of bits for encoding pixel fractions (default: 3)

    Returns:
        ImageHash: Hash object representing the color distribution
    """
```

**Usage Example:**

```python
from PIL import Image
import imagehash

image = Image.open('photo.jpg')

# Standard color hash
color_hash = imagehash.colorhash(image)

# Higher precision color analysis
detailed_hash = imagehash.colorhash(image, binbits=4)

# Compare color similarity regardless of structure
# (photo1.jpg and photo2.jpg are placeholder paths)
image1 = Image.open('photo1.jpg')
image2 = Image.open('photo2.jpg')
image1_color = imagehash.colorhash(image1)
image2_color = imagehash.colorhash(image2)
color_distance = image1_color - image2_color
```
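
The bit layout described in the `colorhash` docstring can be worked out with a few lines of arithmetic. The sketch below computes where each segment starts and stops for a given `binbits`. `colorhash_layout` and the segment names are hypothetical, chosen for illustration; only the segment sizes come from the docstring.

```python
def colorhash_layout(binbits=3):
    """Map each segment of the color hash to its (start, stop) bit range."""
    # (segment name, number of bins); each bin takes `binbits` bits.
    segments = [
        ("black", 1),
        ("gray", 1),
        ("high_saturation_hues", 6),
        ("mild_saturation_hues", 6),
    ]
    layout = {}
    start = 0
    for name, bins in segments:
        stop = start + bins * binbits
        layout[name] = (start, stop)
        start = stop
    return layout


layout = colorhash_layout(binbits=3)
print(layout["gray"])                  # (3, 6)
print(layout["mild_saturation_hues"])  # (24, 42)
```

With 14 bins in total, the hash length is `14 * binbits` bits: 42 bits at the default and 56 bits with `binbits=4`.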

## Algorithm Selection Guide

- **Average Hash**: Best for basic duplicate detection and format conversions
- **Perceptual Hash**: Ideal for detecting scaled or slightly modified images
- **Difference Hash**: Good for structural changes, sensitive to rotation
- **Wavelet Hash**: Configurable frequency analysis, good for detailed comparisons
- **Color Hash**: Focuses on color similarity rather than structure
- **Crop-Resistant Hash**: For images that may be cropped or partially occluded
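
The guide above can be condensed into a simple lookup. This is only a sketch: the scenario labels and the `recommend_hash` helper are hypothetical, and falling back to `phash` is an assumption made here, not a recommendation from the library.

```python
# Hypothetical scenario labels mapped to imagehash function names.
RECOMMENDED_HASH = {
    "format_conversion": "average_hash",
    "scaled_or_slightly_modified": "phash",
    "structural_change": "dhash",
    "detailed_comparison": "whash",
    "color_similarity": "colorhash",
    "cropped_or_occluded": "crop_resistant_hash",
}


def recommend_hash(scenario, default="phash"):
    """Return the name of the hash function suggested for a scenario."""
    return RECOMMENDED_HASH.get(scenario, default)


print(recommend_hash("color_similarity"))  # colorhash
print(recommend_hash("unknown_case"))      # phash
```

With imagehash installed, `getattr(imagehash, recommend_hash(scenario))` would resolve the name to the corresponding function.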

## Performance Considerations

Approximate hash computation cost, fastest to slowest:

1. Average Hash - Simple pixel averaging
2. Difference Hash - Pixel difference computation
3. Color Hash - HSV conversion and binning
4. Perceptual Hash - DCT transformation
5. Wavelet Hash - Wavelet decomposition
6. Crop-Resistant Hash - Image segmentation + multiple hashes

Choose an algorithm by weighing the speed and the robustness your use case requires.
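
Relative speed depends on image size, parameters, and hardware, so it is worth measuring on your own data. Here is a minimal timing harness using only the standard library. `rank_by_speed` is a hypothetical helper, and the demo uses stand-in workloads rather than real hash functions.

```python
import timeit


def rank_by_speed(functions, argument, repeat=5, number=10):
    """Time each callable on `argument` and return (name, seconds) pairs, fastest first.

    `functions` maps a display name to a one-argument callable.
    The best of `repeat` runs is used to reduce noise from other processes.
    """
    timings = []
    for name, function in functions.items():
        best = min(timeit.repeat(lambda: function(argument), repeat=repeat, number=number))
        timings.append((name, best))
    return sorted(timings, key=lambda pair: pair[1])


# Stand-in workloads; with imagehash installed you might instead pass
# {"average_hash": imagehash.average_hash, "phash": imagehash.phash} and a PIL image.
demo = {
    "sum": sum,
    "sorted": sorted,
}
for name, seconds in rank_by_speed(demo, list(range(1000))):
    print(f"{name}: {seconds:.6f}s")
```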