# Crop-Resistant Hashing

Advanced hashing technique that provides resistance to image cropping by segmenting images into regions and hashing each segment individually. Based on the algorithm described in "Efficient Cropping-Resistant Robust Image Hashing" (DOI 10.1109/ARES.2014.85).

## Capabilities

### Crop-Resistant Hash Generation

Creates a multi-hash by partitioning the image into bright and dark segments using a watershed-like algorithm, then applying a hash function to each segment's bounding box.

```python { .api }
def crop_resistant_hash(
    image,
    hash_func=dhash,
    limit_segments=None,
    segment_threshold=128,
    min_segment_size=500,
    segmentation_image_size=300
):
    """
    Creates crop-resistant hash using image segmentation.

    This algorithm partitions the image into bright and dark segments using a
    watershed-like algorithm, then hashes each segment. Provides resistance to
    up to 50% cropping according to the research paper.

    Args:
        image (PIL.Image.Image): Input image to hash
        hash_func (callable): Hash function to apply to segments (default: dhash)
        limit_segments (int, optional): Limit to hashing only M largest segments
        segment_threshold (int): Brightness threshold between hills and valleys (default: 128)
        min_segment_size (int): Minimum pixels for a hashable segment (default: 500)
        segmentation_image_size (int): Size for segmentation processing (default: 300)

    Returns:
        ImageMultiHash: Multi-hash object containing segment hashes
    """
```

**Usage Example:**

```python
from PIL import Image
import imagehash

# Load images
full_image = Image.open('full_photo.jpg')
cropped_image = Image.open('cropped_photo.jpg')  # 30% cropped version

# Generate crop-resistant hashes
full_hash = imagehash.crop_resistant_hash(full_image)
crop_hash = imagehash.crop_resistant_hash(cropped_image)

# Check if images match despite cropping
matches = full_hash.matches(crop_hash, region_cutoff=1)
print(f"Images match: {matches}")

# Get detailed comparison metrics
num_matches, sum_distance = full_hash.hash_diff(crop_hash)
print(f"Matching segments: {num_matches}, Total distance: {sum_distance}")

# Calculate an overall difference score (lower means more similar)
difference_score = full_hash - crop_hash
print(f"Difference score: {difference_score}")
```

### Advanced Configuration

**Custom Hash Functions:**

```python
from PIL import Image
import imagehash

image = Image.open('full_photo.jpg')  # Any PIL image works here

# Use different hash algorithms for segments
ahash_segments = imagehash.crop_resistant_hash(image, imagehash.average_hash)
phash_segments = imagehash.crop_resistant_hash(image, imagehash.phash)

# Custom hash function with parameters
def custom_hash(img):
    return imagehash.whash(img, mode='db4')

custom_segments = imagehash.crop_resistant_hash(image, custom_hash)
```

**Segmentation Parameters:**

```python
# `image` is a PIL image, loaded as in the usage example

# Finer segmentation (tends to keep more, smaller segments)
fine_hash = imagehash.crop_resistant_hash(
    image,
    segment_threshold=64,         # Pixels brighter than 64 count as hills
    min_segment_size=200,         # Smaller minimum size
    segmentation_image_size=500   # Higher resolution processing
)

# Coarse segmentation (fewer, larger segments)
coarse_hash = imagehash.crop_resistant_hash(
    image,
    segment_threshold=200,        # Pixels brighter than 200 count as hills
    min_segment_size=1000,        # Larger minimum size
    limit_segments=5              # Only hash 5 largest segments
)
```

### Performance Optimization

**Segment Limiting:**

```python
# Limit to top 3 segments for faster processing and storage
limited_hash = imagehash.crop_resistant_hash(
    image,
    limit_segments=3,
    min_segment_size=1000
)
```
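
Conceptually, `limit_segments` keeps only the largest segments before hashing. The following is a minimal sketch of that selection, assuming each segment is a set of `(row, col)` pixel coordinates; `keep_largest_segments` is a hypothetical helper for illustration and is not part of imagehash.

```python
def keep_largest_segments(segments, limit_segments=None):
    """Return at most `limit_segments` segments, largest first.

    Hypothetical helper: segments are assumed to be sets of (row, col) tuples.
    """
    if limit_segments is None:
        # No limit requested: keep every segment in its original order
        return list(segments)
    return sorted(segments, key=len, reverse=True)[:limit_segments]


segments = [
    {(0, 0), (0, 1)},
    {(5, 5), (5, 6), (6, 5), (6, 6)},
    {(9, 9)},
]
largest = keep_largest_segments(segments, limit_segments=2)
print([len(s) for s in largest])  # [4, 2]
```

Fewer segments mean fewer hashes to compute, store, and compare, at the cost of ignoring small regions that might have survived a crop.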

**Processing Size Control:**

```python
# Balance between accuracy and speed
fast_hash = imagehash.crop_resistant_hash(
    image,
    segmentation_image_size=200  # Faster processing
)

accurate_hash = imagehash.crop_resistant_hash(
    image,
    segmentation_image_size=600  # More accurate segmentation
)
```

## Algorithm Details

The crop-resistant hashing process involves several steps:

1. **Image Preprocessing**: Convert to grayscale and resize to segmentation size
2. **Filtering**: Apply Gaussian blur and median filter to reduce noise
3. **Thresholding**: Separate pixels into "hills" (bright) and "valleys" (dark)
4. **Region Growing**: Use watershed-like algorithm to find connected regions
5. **Segment Filtering**: Remove segments smaller than minimum size
6. **Bounding Box Creation**: Create bounding boxes for each segment in original image
7. **Individual Hashing**: Apply hash function to each segment's bounding box
8. **Multi-Hash Assembly**: Combine individual hashes into ImageMultiHash object
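
Steps 3 to 5 can be illustrated on a tiny grid. This is a simplified pure-Python sketch, not the library's implementation: it assumes 4-connectivity, treats pixels strictly above the threshold as hills, and uses hypothetical function names (`threshold`, `grow_region`, `find_segments`).

```python
from collections import deque


def threshold(pixels, segment_threshold=128):
    """Mark each pixel as a hill (True) or a valley (False)."""
    return [[value > segment_threshold for value in row] for row in pixels]


def grow_region(mask, start, visited):
    """Collect the 4-connected pixels that share the class of `start`."""
    rows, cols = len(mask), len(mask[0])
    target = mask[start[0]][start[1]]
    region = {start}
    visited.add(start)
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in visited and mask[nr][nc] == target:
                visited.add((nr, nc))
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


def find_segments(pixels, segment_threshold=128, min_segment_size=1):
    """Return connected hill and valley regions with enough pixels."""
    mask = threshold(pixels, segment_threshold)
    visited = set()
    segments = []
    for r in range(len(mask)):
        for c in range(len(mask[0])):
            if (r, c) not in visited:
                region = grow_region(mask, (r, c), visited)
                # Assumption: keep regions with at least min_segment_size pixels
                if len(region) >= min_segment_size:
                    segments.append(region)
    return segments


pixels = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [10, 10, 10, 220],
]
segments = find_segments(pixels, segment_threshold=128, min_segment_size=2)
print(sorted(len(s) for s in segments))  # [4, 7]
```

The isolated bright pixel in the bottom-right corner forms a one-pixel region and is dropped by the minimum-size filter, which is the role `min_segment_size` plays at full scale.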

## Comparison and Matching

The ImageMultiHash class provides several methods for comparing crop-resistant hashes:

- **Default Matching**: `hash1 == hash2` or `hash1.matches(hash2)`, both using the default cutoffs
- **Flexible Matching**: Configure region cutoff and Hamming distance thresholds
- **Distance Scoring**: `hash1 - hash2` returns a difference score (lower means more similar)
- **Best Match**: Find closest match from a list of candidates
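
The idea behind region-based matching can be sketched with a toy model. This is an illustration under stated assumptions, not imagehash's code: segment hashes are plain integers, the Hamming cutoff of 2 is arbitrary, and the function names only mirror the methods listed above.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded hashes."""
    return bin(a ^ b).count("1")


def hash_diff(segments_a, segments_b, hamming_cutoff):
    """Count segments of A whose closest segment in B is within the cutoff.

    Returns (number_of_matching_segments, sum_of_their_distances).
    """
    distances = []
    for seg in segments_a:
        best = min(hamming(seg, other) for other in segments_b)
        if best <= hamming_cutoff:
            distances.append(best)
    return len(distances), sum(distances)


def matches(segments_a, segments_b, region_cutoff=1, hamming_cutoff=2):
    """True when at least `region_cutoff` segments match."""
    num_matches, _ = hash_diff(segments_a, segments_b, hamming_cutoff)
    return num_matches >= region_cutoff


full = [0b10110010, 0b01011100, 0b11110000]
cropped = [0b10110011, 0b00001111]

print(hash_diff(full, cropped, hamming_cutoff=2))  # (1, 1)
print(matches(full, cropped, region_cutoff=1))     # True
print(matches(full, cropped, region_cutoff=2))     # False
```

Raising `region_cutoff` demands that more segments agree, which reduces false positives but makes heavily cropped images harder to match.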

## Internal Functions

The crop-resistant hashing algorithm uses several internal functions for image segmentation:

```python { .api }
def _find_region(remaining_pixels, segmented_pixels):
    """
    Internal function to find connected regions in segmented image.

    Args:
        remaining_pixels (NDArray): Boolean array of unsegmented pixels
        segmented_pixels (set): Set of already segmented pixel coordinates

    Returns:
        set: Set of pixel coordinates forming a connected region
    """

def _find_all_segments(pixels, segment_threshold, min_segment_size):
    """
    Internal function to find all segments in an image.

    Args:
        pixels (NDArray): Grayscale pixel array
        segment_threshold (int): Brightness threshold for segmentation
        min_segment_size (int): Minimum pixels required for a segment

    Returns:
        list: List of segments, each segment is a set of pixel coordinates
    """
```
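
Because each segment is a set of pixel coordinates found on the downscaled image, it has to be turned into a bounding box in the original image before hashing. The sketch below shows one way to do that, assuming `(row, col)` coordinates and a `(left, top, right, bottom)` box as used by `PIL.Image.crop`; `segment_bounding_box` is a hypothetical helper, not a function in imagehash.

```python
def segment_bounding_box(segment, scale_w=1.0, scale_h=1.0):
    """Return (left, top, right, bottom) for a set of (row, col) pixels.

    Right and bottom edges are exclusive. The scale factors map coordinates
    from the segmentation image back to the original image.
    """
    rows = [r for r, _ in segment]
    cols = [c for _, c in segment]
    left, right = min(cols), max(cols) + 1
    top, bottom = min(rows), max(rows) + 1
    return (left * scale_w, top * scale_h, right * scale_w, bottom * scale_h)


segment = {(2, 3), (2, 4), (3, 3), (4, 5)}

# Assumed sizes: segmentation ran at 300x300, the original image is 1200x900
box = segment_bounding_box(segment, scale_w=1200 / 300, scale_h=900 / 300)
print(box)  # (12.0, 6.0, 24.0, 15.0)
```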

## Limitations and Considerations

- **Processing Time**: Significantly slower than basic hash functions due to segmentation
- **Memory Usage**: Stores multiple hash objects (one per segment)
- **Parameter Sensitivity**: Segmentation parameters affect matching performance
- **Version Compatibility**: Results may vary slightly between Pillow versions due to grayscale conversion changes

## Use Cases

Crop-resistant hashing is ideal for:

- **Reverse Image Search**: Finding images even when cropped or partially visible
- **Social Media Monitoring**: Detecting shared images with various crops/frames
- **Copyright Detection**: Identifying copyrighted content despite cropping
- **Duplicate Detection**: Finding similar images with different aspect ratios
- **Content Moderation**: Detecting prohibited content regardless of cropping