tessl/pypi-imagehash

Python library for perceptual image hashing with multiple algorithms including average, perceptual, difference, wavelet, color, and crop-resistant hashing

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/imagehash@4.3.x

To install, run

npx @tessl/cli install tessl/pypi-imagehash@4.3.0

ImageHash

ImageHash is a Python library for perceptual image hashing. It provides average, perceptual (DCT-based), difference, wavelet, HSV color, and crop-resistant hashing. Unlike cryptographic hashes, perceptual hashes are designed to produce similar outputs for visually similar images, which makes them well suited to image deduplication, similarity detection, and reverse image search.

Package Information

  • Package Name: imagehash
  • Language: Python
  • Installation: pip install imagehash
  • Dependencies: numpy, scipy, pillow, PyWavelets

Core Imports

import imagehash

Working with PIL/Pillow Image objects:

from PIL import Image
import imagehash

Basic Usage

from PIL import Image
import imagehash

# Load images
image1 = Image.open('image1.jpg')
image2 = Image.open('image2.jpg')

# Generate hashes using different algorithms
ahash = imagehash.average_hash(image1)
phash = imagehash.phash(image1)
dhash = imagehash.dhash(image1)

# Compare images by calculating Hamming distance
distance = ahash - imagehash.average_hash(image2)
print(f"Hamming distance: {distance}")

# Check if images are similar (distance of 0 means identical hashes)
similar = distance < 10  # threshold depends on your needs

# Convert hash to string for storage
hash_string = str(ahash)
print(f"Hash: {hash_string}")

# Restore hash from string
restored_hash = imagehash.hex_to_hash(hash_string)
assert restored_hash == ahash

Architecture

ImageHash provides two main classes for hash representation:

  • ImageHash: Encapsulates single perceptual hashes with comparison operations
  • ImageMultiHash: Container for multiple hashes used in crop-resistant hashing

The library supports multiple perceptual hashing algorithms, each with different strengths:

  • Average Hash: Fast, good for detecting basic transformations
  • Perceptual Hash: Uses DCT, robust to scaling and minor modifications
  • Difference Hash: Tracks gradient changes, sensitive to rotation
  • Wavelet Hash: Uses wavelets, configurable frequency analysis
  • Color Hash: Analyzes color distribution rather than structure
  • Crop-Resistant Hash: Segments image for crop tolerance

All hash functions accept PIL/Pillow Image objects. The single-hash functions return ImageHash objects, while crop_resistant_hash returns an ImageMultiHash. Both classes support comparison operations and string serialization.

Capabilities

Hash Generation

Core perceptual hashing functions including average, perceptual, difference, wavelet, and color hashing algorithms. Each algorithm has different strengths for various image comparison scenarios.

def average_hash(image, hash_size=8, mean=numpy.mean): ...
def phash(image, hash_size=8, highfreq_factor=4): ...
def phash_simple(image, hash_size=8, highfreq_factor=4): ...
def dhash(image, hash_size=8): ...
def dhash_vertical(image, hash_size=8): ...
def whash(image, hash_size=8, image_scale=None, mode='haar', remove_max_haar_ll=True): ...
def colorhash(image, binbits=3): ...

Full reference: hash-generation.md

Crop-Resistant Hashing

An advanced hashing technique that segments an image into regions to provide resistance to cropping. It uses a watershed-like algorithm to partition the image into bright and dark segments, then hashes each segment individually.

def crop_resistant_hash(
    image,
    hash_func=dhash,
    limit_segments=None,
    segment_threshold=128,
    min_segment_size=500,
    segmentation_image_size=300
): ...

Full reference: crop-resistant-hashing.md

Hash Conversion and Serialization

Functions for converting between hash objects and string representations, supporting both single hashes and multi-hashes. Includes compatibility functions for older hash formats.

def hex_to_hash(hexstr): ...
def hex_to_flathash(hexstr, hashsize): ...
def hex_to_multihash(hexstr): ...
def old_hex_to_hash(hexstr, hash_size=8): ...

Full reference: hash-conversion.md

Core Classes

Hash container classes that provide comparison operations, string conversion, and mathematical operations for computing similarity between images.

class ImageHash:
    def __init__(self, binary_array): ...
    def __sub__(self, other): ...  # Hamming distance
    def __eq__(self, other): ...   # Equality comparison
    # ... other methods

class ImageMultiHash:
    def __init__(self, hashes): ...
    def matches(self, other_hash, region_cutoff=1, hamming_cutoff=None, bit_error_rate=None): ...
    def best_match(self, other_hashes, hamming_cutoff=None, bit_error_rate=None): ...
    # ... other methods

Full reference: core-classes.md

Types

# Type aliases for better type hints
NDArray = numpy.typing.NDArray[numpy.bool_]  # Boolean numpy array
WhashMode = Literal['haar', 'db4']           # Wavelet modes
MeanFunc = Callable[[NDArray], float]       # Mean function type
HashFunc = Callable[[Image.Image], ImageHash]  # Hash function type

Constants

__version__ = '4.3.2'  # Library version
ANTIALIAS = Image.Resampling.LANCZOS  # PIL resampling filter (Image.ANTIALIAS on older Pillow releases)

Utilities

Command-Line Image Similarity Tool

The package includes a command-line utility script find_similar_images.py for finding similar images in directories.

def find_similar_images(userpaths, hashfunc=imagehash.average_hash):
    """
    Find similar images in specified directories using various hashing algorithms.
    
    Args:
        userpaths: List of directory paths to scan for images
        hashfunc: Hash function to use (default: average_hash)
    """

Command-line usage:

# Find similar images using average hash
python find_similar_images.py ahash /path/to/images

# Available algorithms:
# ahash          - Average hash
# phash          - Perceptual hash  
# dhash          - Difference hash
# whash-haar     - Haar wavelet hash
# whash-db4      - Daubechies wavelet hash
# colorhash      - HSV color hash
# crop-resistant - Crop-resistant hash