tessl/pypi-imagehash

Python library for perceptual image hashing with multiple algorithms including average, perceptual, difference, wavelet, color, and crop-resistant hashing

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: pypipkg:pypi/imagehash@4.3.x

To install, run

npx @tessl/cli install tessl/pypi-imagehash@4.3.0

# ImageHash

A comprehensive Python library for perceptual image hashing that provides multiple hashing algorithms including average hashing, perceptual hashing, difference hashing, wavelet hashing, HSV color hashing, and crop-resistant hashing. Unlike cryptographic hashes, these perceptual hashes are designed to produce similar outputs for visually similar images, making them ideal for image deduplication, similarity detection, and reverse image search applications.

## Package Information

- **Package Name**: imagehash
- **Language**: Python
- **Installation**: `pip install imagehash`
- **Dependencies**: numpy, scipy, pillow, PyWavelets

## Core Imports

```python
import imagehash
```

Working with PIL/Pillow Image objects:

```python
from PIL import Image
import imagehash
```

## Basic Usage

```python
from PIL import Image
import imagehash

# Load images
image1 = Image.open('image1.jpg')
image2 = Image.open('image2.jpg')

# Generate hashes using different algorithms
ahash = imagehash.average_hash(image1)
phash = imagehash.phash(image1)
dhash = imagehash.dhash(image1)

# Compare images by calculating Hamming distance
distance = ahash - imagehash.average_hash(image2)
print(f"Hamming distance: {distance}")

# Check if images are similar (distance of 0 means identical hashes)
similar = distance < 10  # threshold depends on your needs

# Convert hash to string for storage
hash_string = str(ahash)
print(f"Hash: {hash_string}")

# Restore hash from string
restored_hash = imagehash.hex_to_hash(hash_string)
assert restored_hash == ahash
```
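
Subtracting two hashes yields a Hamming distance: the number of bit positions at which they differ. As a rough illustration that does not use the library, the same count can be computed from two equal-length hex strings with the standard library alone. The helper name `hamming_distance_hex` and the sample strings are made up for this sketch.

```python
def hamming_distance_hex(hex_a: str, hex_b: str) -> int:
    """Count differing bits between two equal-length hex strings."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two values disagree.
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Hypothetical 64-bit hashes written as 16 hex digits.
first = "ffd8e0c0c0e0f8ff"
second = "ffd8e0c0c0e0f0fe"

distance = hamming_distance_hex(first, second)
bit_error_rate = distance / (len(first) * 4)  # 4 bits per hex digit
print(distance, bit_error_rate)
```

Dividing by the total number of bits gives a size-independent error rate, which can make thresholds easier to carry over when `hash_size` changes.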

## Architecture

ImageHash provides two main classes for hash representation:

- **ImageHash**: Encapsulates single perceptual hashes with comparison operations
- **ImageMultiHash**: Container for multiple hashes used in crop-resistant hashing

The library supports multiple perceptual hashing algorithms, each with different strengths:
- **Average Hash**: Fast, good for detecting basic transformations
- **Perceptual Hash**: Uses DCT, robust to scaling and minor modifications
- **Difference Hash**: Tracks gradient changes, sensitive to rotation
- **Wavelet Hash**: Uses wavelets, configurable frequency analysis
- **Color Hash**: Analyzes color distribution rather than structure
- **Crop-Resistant Hash**: Segments image for crop tolerance

All hash functions accept PIL/Pillow Image objects. The single-hash functions return ImageHash objects, which support comparison operations and string serialization; `crop_resistant_hash` returns an ImageMultiHash instead.
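
To make the Average Hash and Difference Hash entries above more concrete, here is a dependency-free sketch of the thresholding idea behind them, applied to a tiny grayscale grid. It is only an illustration under simplifying assumptions: the function names are hypothetical, and the library itself converts and resizes the image with Pillow before producing any bits.

```python
from statistics import mean


def average_bits(pixels: list[list[int]]) -> list[list[int]]:
    """1 where a pixel is brighter than the mean of the whole grid."""
    avg = mean(value for row in pixels for value in row)
    return [[int(value > avg) for value in row] for row in pixels]


def difference_bits(pixels: list[list[int]]) -> list[list[int]]:
    """1 where a pixel is brighter than its left-hand neighbour."""
    return [
        [int(right > left) for left, right in zip(row, row[1:])]
        for row in pixels
    ]


# Hypothetical 3x4 grayscale grid (values 0-255).
grid = [
    [10, 20, 200, 220],
    [15, 25, 210, 205],
    [12, 240, 30, 35],
]

print(average_bits(grid))
print(difference_bits(grid))
```

Note that the difference variant produces one fewer column than its input, since each bit compares a pair of horizontally adjacent pixels.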

## Capabilities

### Hash Generation

Core perceptual hashing functions including average, perceptual, difference, wavelet, and color hashing algorithms. Each algorithm has different strengths for various image comparison scenarios.

```python { .api }
def average_hash(image, hash_size=8, mean=numpy.mean): ...
def phash(image, hash_size=8, highfreq_factor=4): ...
def phash_simple(image, hash_size=8, highfreq_factor=4): ...
def dhash(image, hash_size=8): ...
def dhash_vertical(image, hash_size=8): ...
def whash(image, hash_size=8, image_scale=None, mode='haar', remove_max_haar_ll=True): ...
def colorhash(image, binbits=3): ...
```

[Hash Generation](./hash-generation.md)

### Crop-Resistant Hashing

An advanced hashing technique that segments images into regions to provide resistance to cropping. It uses a watershed-like algorithm to partition the image into bright and dark segments, then hashes each segment individually.

```python { .api }
def crop_resistant_hash(
    image,
    hash_func=dhash,
    limit_segments=None,
    segment_threshold=128,
    min_segment_size=500,
    segmentation_image_size=300
): ...
```

[Crop-Resistant Hashing](./crop-resistant-hashing.md)
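
Because a crop-resistant hash is a collection of per-segment hashes, comparing two of them comes down to counting how many segments line up. The sketch below shows that idea with plain integers standing in for segment hashes. `count_matching_segments`, the sample values, and the cutoffs are assumptions chosen for illustration; the code does not reproduce the library's internals.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer-encoded hashes."""
    return bin(a ^ b).count("1")


def count_matching_segments(ours: list[int], theirs: list[int],
                            hamming_cutoff: int) -> int:
    """Count segments in `ours` whose closest segment in `theirs` is within the cutoff."""
    matches = 0
    for segment in ours:
        closest = min(hamming(segment, other) for other in theirs)
        if closest <= hamming_cutoff:
            matches += 1
    return matches


# Hypothetical 16-bit segment hashes for an original and a cropped copy.
original = [0b1111000011110000, 0b1010101010101010, 0b0000111100001111]
cropped = [0b1111000011110001, 0b0011001100110011]

matched = count_matching_segments(cropped, original, hamming_cutoff=2)
region_cutoff = 1  # at least this many segments must match
print(matched, matched >= region_cutoff)
```

Requiring only a minimum number of matching segments is what lets a cropped copy still match: segments that were cut away simply fail to contribute.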

### Hash Conversion and Serialization

Functions for converting between hash objects and string representations, supporting both single hashes and multi-hashes. Includes compatibility functions for older hash formats.

```python { .api }
def hex_to_hash(hexstr): ...
def hex_to_flathash(hexstr, hashsize): ...
def hex_to_multihash(hexstr): ...
def old_hex_to_hash(hexstr, hash_size=8): ...
```

[Hash Conversion](./hash-conversion.md)
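
The string form of a hash is essentially its bit grid written in hexadecimal. The following standard-library sketch shows one way such a round trip can work for a square grid. The helper names `bits_to_hex` and `hex_to_bits` are hypothetical, and the real functions work with numpy arrays wrapped in `ImageHash` rather than nested lists.

```python
def bits_to_hex(grid: list[list[int]]) -> str:
    """Flatten a grid of 0/1 values row by row and render it as hex."""
    flat = "".join(str(bit) for row in grid for bit in row)
    width = -(-len(flat) // 4)  # ceil(len / 4) hex digits
    return f"{int(flat, 2):0{width}x}"


def hex_to_bits(hexstr: str, size: int) -> list[list[int]]:
    """Rebuild a size x size grid from its hex string."""
    flat = f"{int(hexstr, 16):0{size * size}b}"
    return [[int(bit) for bit in flat[r * size:(r + 1) * size]]
            for r in range(size)]


grid = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
]

encoded = bits_to_hex(grid)
print(encoded)
assert hex_to_bits(encoded, 4) == grid  # round trip is lossless
```

Zero-padding on both sides matters here: without it, a grid whose first rows are all zeros would lose bits when converted through an integer.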

### Core Classes

Hash container classes that provide comparison operations, string conversion, and mathematical operations for computing similarity between images.

```python { .api }
class ImageHash:
    def __init__(self, binary_array): ...
    def __sub__(self, other): ...  # Hamming distance
    def __eq__(self, other): ...  # Equality comparison
    # ... other methods

class ImageMultiHash:
    def __init__(self, hashes): ...
    def matches(self, other_hash, region_cutoff=1, hamming_cutoff=None, bit_error_rate=None): ...
    def best_match(self, other_hashes, hamming_cutoff=None, bit_error_rate=None): ...
    # ... other methods
```

[Core Classes](./core-classes.md)

## Types

```python { .api }
# Type aliases for better type hints
NDArray = numpy.typing.NDArray[numpy.bool_]  # Boolean numpy array
WhashMode = Literal['haar', 'db4']  # Wavelet modes
MeanFunc = Callable[[NDArray], float]  # Mean function type
HashFunc = Callable[[Image.Image], ImageHash]  # Hash function type
```

## Constants

```python { .api }
__version__ = '4.3.2'  # Library version
ANTIALIAS = Image.Resampling.LANCZOS  # PIL resampling method
```

## Utilities

### Command-Line Image Similarity Tool

The package includes a command-line utility script `find_similar_images.py` for finding similar images in directories.

```python { .api }
def find_similar_images(userpaths, hashfunc=imagehash.average_hash):
    """
    Find similar images in specified directories using various hashing algorithms.

    Args:
        userpaths: List of directory paths to scan for images
        hashfunc: Hash function to use (default: average_hash)
    """
```

**Command-line usage:**

```bash
# Find similar images using average hash
python find_similar_images.py ahash /path/to/images

# Available algorithms:
# ahash - Average hash
# phash - Perceptual hash
# dhash - Difference hash
# whash-haar - Haar wavelet hash
# whash-db4 - Daubechies wavelet hash
# colorhash - HSV color hash
# crop-resistant - Crop-resistant hash
```
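
One straightforward way to find files with identical hashes is to group paths by hash value and keep the groups with more than one member. The sketch below shows that grouping step with only the standard library. `group_by_hash` is a hypothetical helper, and `toy_hash` is a stand-in for a real perceptual hash such as `imagehash.average_hash` applied to an opened image; neither is claimed to mirror the bundled script.

```python
from collections import defaultdict
from typing import Callable, Iterable


def group_by_hash(paths: Iterable[str],
                  hashfunc: Callable[[str], str]) -> dict[str, list[str]]:
    """Map each hash value to the paths that produced it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for path in sorted(paths):
        groups[hashfunc(path)].append(path)
    # Keep only hashes shared by two or more files.
    return {key: files for key, files in groups.items() if len(files) > 1}


def toy_hash(path: str) -> str:
    """Placeholder: pretends files with the same stem look alike."""
    return path.rsplit("/", 1)[-1].split(".")[0].lower()


paths = ["a/cat.jpg", "b/CAT.png", "a/dog.jpg"]
print(group_by_hash(paths, toy_hash))
```

Grouping by exact hash value only catches identical hashes; finding near-duplicates would additionally require comparing Hamming distances between groups.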