# Crop-Resistant Hashing

Crop-resistant hashing is an advanced technique that tolerates image cropping by segmenting an image into regions and hashing each segment individually. It is based on the algorithm described in "Efficient Cropping-Resistant Robust Image Hashing" (DOI 10.1109/ARES.2014.85).

## Capabilities

### Crop-Resistant Hash Generation

Creates a multi-hash by partitioning the image into bright and dark segments using a watershed-like algorithm, then applying a hash function to each segment's bounding box.

```python { .api }
def crop_resistant_hash(
    image,
    hash_func=dhash,
    limit_segments=None,
    segment_threshold=128,
    min_segment_size=500,
    segmentation_image_size=300
):
    """
    Creates a crop-resistant hash using image segmentation.

    This algorithm partitions the image into bright and dark segments using a
    watershed-like algorithm, then hashes each segment. The research paper
    claims resistance to up to 50% cropping.

    Args:
        image (PIL.Image.Image): Input image to hash
        hash_func (callable): Hash function to apply to segments (default: dhash)
        limit_segments (int, optional): Limit to hashing only M largest segments
        segment_threshold (int): Brightness threshold between hills and valleys (default: 128)
        min_segment_size (int): Minimum pixels for a hashable segment (default: 500)
        segmentation_image_size (int): Side length in pixels of the resized image used for segmentation (default: 300)

    Returns:
        ImageMultiHash: Multi-hash object containing segment hashes
    """
```

**Usage Example:**

```python
from PIL import Image
import imagehash

# Load images
full_image = Image.open('full_photo.jpg')
cropped_image = Image.open('cropped_photo.jpg')  # 30% cropped version

# Generate crop-resistant hashes
full_hash = imagehash.crop_resistant_hash(full_image)
crop_hash = imagehash.crop_resistant_hash(cropped_image)

# Check if images match despite cropping
matches = full_hash.matches(crop_hash, region_cutoff=1)
print(f"Images match: {matches}")

# Get detailed comparison metrics
num_matches, sum_distance = full_hash.hash_diff(crop_hash)
print(f"Matching segments: {num_matches}, Total distance: {sum_distance}")

# Calculate an overall difference score (lower means more similar)
difference_score = full_hash - crop_hash
print(f"Difference score: {difference_score}")
```

### Advanced Configuration

**Custom Hash Functions:**

```python
from PIL import Image
import imagehash

image = Image.open('full_photo.jpg')

# Use different hash algorithms for segments
ahash_segments = imagehash.crop_resistant_hash(image, imagehash.average_hash)
phash_segments = imagehash.crop_resistant_hash(image, imagehash.phash)

# Custom hash function with parameters
def custom_hash(img):
    return imagehash.whash(img, mode='db4')

custom_segments = imagehash.crop_resistant_hash(image, custom_hash)
```

**Segmentation Parameters:**

```python
# Finer segmentation (typically more segments)
fine_hash = imagehash.crop_resistant_hash(
    image,
    segment_threshold=64,         # Lower threshold: more pixels count as bright "hills"
    min_segment_size=200,         # Smaller minimum size keeps smaller segments
    segmentation_image_size=500   # Higher resolution processing
)

# Coarse segmentation (typically fewer, larger segments)
coarse_hash = imagehash.crop_resistant_hash(
    image,
    segment_threshold=200,   # Higher threshold: fewer pixels count as bright "hills"
    min_segment_size=1000,   # Larger minimum size discards small segments
    limit_segments=5         # Only hash the 5 largest segments
)
```

### Performance Optimization

**Segment Limiting:**

```python
# Limit to top 3 segments for faster processing and storage
limited_hash = imagehash.crop_resistant_hash(
    image,
    limit_segments=3,
    min_segment_size=1000
)
```
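
To illustrate what limiting segments means in practice, the following minimal sketch keeps only the largest regions from a list of segments, each represented as a set of pixel coordinates. The helper name `keep_largest` and the sample data are hypothetical and are not part of imagehash; the library's internal selection may differ in detail.

```python
def keep_largest(segments, limit):
    """Keep only the `limit` largest segments, largest first.

    Hypothetical helper for illustration; a falsy `limit` keeps everything.
    """
    if not limit:
        return list(segments)
    return sorted(segments, key=len, reverse=True)[:limit]


# Four made-up segments of 2, 4, 1 and 3 pixels.
segments = [
    {(0, 0), (0, 1)},
    {(5, 5), (5, 6), (6, 5), (6, 6)},
    {(9, 9)},
    {(2, 2), (2, 3), (3, 2)},
]

largest = keep_largest(segments, limit=2)
print([len(s) for s in largest])  # [4, 3]
```

Fewer segments means fewer hashes to compute, store, and compare, at the cost of having fewer regions that can survive a crop.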

**Processing Size Control:**

```python
# Balance between accuracy and speed
fast_hash = imagehash.crop_resistant_hash(
    image,
    segmentation_image_size=200  # Faster processing
)

accurate_hash = imagehash.crop_resistant_hash(
    image,
    segmentation_image_size=600  # More accurate segmentation
)
```

## Algorithm Details

The crop-resistant hashing process involves several steps:

1. **Image Preprocessing**: Convert to grayscale and resize to the segmentation size
2. **Filtering**: Apply a Gaussian blur and a median filter to reduce noise
3. **Thresholding**: Separate pixels into "hills" (bright) and "valleys" (dark)
4. **Region Growing**: Use a watershed-like algorithm to find connected regions
5. **Segment Filtering**: Remove segments smaller than the minimum size
6. **Bounding Box Creation**: Create a bounding box for each segment in the original image
7. **Individual Hashing**: Apply the hash function to each segment's bounding box
8. **Multi-Hash Assembly**: Combine the individual hashes into an `ImageMultiHash` object
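
Steps 3 to 5 above can be sketched in pure Python on a tiny grid. This is an illustration under stated assumptions, not the library's implementation: `find_segments` is a hypothetical helper, it uses a plain 4-connected flood fill in place of the watershed-like procedure, and it treats `min_size` as an inclusive bound.

```python
from collections import deque


def find_segments(pixels, threshold=128, min_size=2):
    """Split a grayscale grid into connected bright and dark regions.

    Hypothetical helper: each returned segment is a set of (row, col) tuples.
    """
    height, width = len(pixels), len(pixels[0])
    # Step 3: classify each pixel as hill (True) or valley (False).
    is_hill = [[value > threshold for value in row] for row in pixels]
    seen = set()
    segments = []
    for start_y in range(height):
        for start_x in range(width):
            if (start_y, start_x) in seen:
                continue
            # Step 4: grow a region of 4-connected pixels of the same class.
            kind = is_hill[start_y][start_x]
            region = set()
            queue = deque([(start_y, start_x)])
            seen.add((start_y, start_x))
            while queue:
                y, x = queue.popleft()
                region.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and (ny, nx) not in seen
                            and is_hill[ny][nx] == kind):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            # Step 5: drop regions that are too small to hash.
            if len(region) >= min_size:
                segments.append(region)
    return segments


# A 3x4 grid: one bright 2x2 block, one dark region, one isolated bright pixel.
grid = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [10, 10, 10, 250],
]
segments = find_segments(grid, threshold=128, min_size=2)
print(sorted(len(s) for s in segments))  # [4, 7]
```

The isolated bright pixel in the bottom-right corner forms a one-pixel region and is discarded by the size filter, which is the role `min_segment_size` plays at full scale.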

## Comparison and Matching

The ImageMultiHash class provides several methods for comparing crop-resistant hashes:

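- **Basic Matching**: `hash1 == hash2` or `hash1.matches(hash2)`; with a region cutoff of 1, a single matching segment is enough, so this is a tolerance-based match rather than a bit-for-bit equality test
- **Flexible Matching**: Configure the region cutoff and Hamming distance threshold used by `matches()`
- **Distance Scoring**: `hash1 - hash2` returns a difference score, where lower means more similar
- **Best Match**: Find the closest match from a list of candidates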
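
The matching logic can be pictured with a small sketch that stores each segment hash as an integer. The function bodies, the integer encoding, and the `hamming_cutoff` value of 2 are assumptions made for illustration; only the names `hash_diff`, `matches`, and `region_cutoff` come from the usage example above.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded hashes."""
    return bin(a ^ b).count("1")


def hash_diff(segments_a, segments_b, hamming_cutoff):
    """Return (number of matching segments, summed distance of those matches)."""
    distances = []
    for seg in segments_a:
        # Compare against every segment of the other image and keep the closest.
        best = min(hamming(seg, other) for other in segments_b)
        if best <= hamming_cutoff:
            distances.append(best)
    return len(distances), sum(distances)


def matches(segments_a, segments_b, region_cutoff=1, hamming_cutoff=2):
    """Two multi-hashes match when enough segments are close enough."""
    count, _ = hash_diff(segments_a, segments_b, hamming_cutoff)
    return count >= region_cutoff


# Made-up 8-bit segment hashes for a full image and a cropped version.
full = [0b10110010, 0b01011101, 0b11110000]
cropped = [0b10110011, 0b00001111]

print(hash_diff(full, cropped, hamming_cutoff=2))   # (1, 1)
print(matches(full, cropped, region_cutoff=1))      # True
print(matches(full, cropped, region_cutoff=2))      # False
```

Raising the region cutoff demands more agreeing segments and so reduces false positives, while a looser Hamming cutoff tolerates more distortion inside each segment.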

## Internal Functions

The crop-resistant hashing algorithm uses several internal functions for image segmentation:

```python { .api }
def _find_region(remaining_pixels, segmented_pixels):
    """
    Internal function to find connected regions in segmented image.

    Args:
        remaining_pixels (NDArray): Boolean array of unsegmented pixels
        segmented_pixels (set): Set of already segmented pixel coordinates

    Returns:
        set: Set of pixel coordinates forming a connected region
    """

def _find_all_segments(pixels, segment_threshold, min_segment_size):
    """
    Internal function to find all segments in an image.

    Args:
        pixels (NDArray): Grayscale pixel array
        segment_threshold (int): Brightness threshold for segmentation
        min_segment_size (int): Minimum pixels required for a segment

    Returns:
        list: List of segments, each segment is a set of pixel coordinates
    """
```
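
Since each segment is a set of pixel coordinates in the resized segmentation image, hashing it requires mapping its extent back onto the original image. The sketch below shows one way to do that; `segment_bounding_box` is a hypothetical helper, and the `(row, column)` coordinate order is an assumption rather than something confirmed here.

```python
def segment_bounding_box(segment, segmentation_size, original_size):
    """Map a segment to a (left, top, right, bottom) box in the original image.

    Hypothetical helper. `segment` is a set of (row, col) tuples measured on a
    square segmentation image; `original_size` is (width, height) in pixels.
    """
    orig_width, orig_height = original_size
    scale_x = orig_width / segmentation_size
    scale_y = orig_height / segmentation_size
    rows = [row for row, _ in segment]
    cols = [col for _, col in segment]
    left = min(cols) * scale_x
    top = min(rows) * scale_y
    # Add 1 so the box includes the far edge of the last pixel.
    right = (max(cols) + 1) * scale_x
    bottom = (max(rows) + 1) * scale_y
    return (left, top, right, bottom)


# A made-up segment on a 300x300 segmentation image of a 1200x600 original.
segment = {(10, 20), (10, 21), (11, 20), (30, 40)}
box = segment_bounding_box(segment, segmentation_size=300, original_size=(1200, 600))
print(box)  # (80.0, 20.0, 164.0, 62.0)
```

The returned tuple uses the same `(left, upper, right, lower)` order that Pillow's `Image.crop` expects, so the box could be passed to it directly before applying the hash function.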

## Limitations and Considerations

- **Processing Time**: Significantly slower than basic hash functions due to segmentation
- **Memory Usage**: Stores multiple hash objects (one per segment)
- **Parameter Sensitivity**: Segmentation parameters affect matching performance
- **Version Compatibility**: Results may vary slightly between Pillow versions due to grayscale conversion changes

## Use Cases

Crop-resistant hashing is ideal for:

- **Reverse Image Search**: Finding images even when cropped or partially visible
- **Social Media Monitoring**: Detecting shared images with various crops/frames
- **Copyright Detection**: Identifying copyrighted content despite cropping
- **Duplicate Detection**: Finding similar images with different aspect ratios
- **Content Moderation**: Detecting prohibited content regardless of cropping