Python library for perceptual image hashing with multiple algorithms including average, perceptual, difference, wavelet, color, and crop-resistant hashing
Advanced hashing technique that provides resistance to image cropping by segmenting images into regions and hashing each segment individually. Based on the algorithm described in "Efficient Cropping-Resistant Robust Image Hashing" (DOI 10.1109/ARES.2014.85).
Creates a multi-hash by partitioning the image into bright and dark segments using a watershed-like algorithm, then applying a hash function to each segment's bounding box.
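To make the segmentation idea concrete, here is a minimal pure-Python sketch on a tiny grayscale grid. It is not the library's implementation: `find_segments` and `bounding_box` are hypothetical helpers, and a plain 4-connected flood fill stands in for the watershed-like algorithm. Pixels above the threshold are treated as bright, the rest as dark.

```python
from collections import deque


def find_segments(pixels, threshold=128, min_size=1):
    """Group pixels into 4-connected bright and dark regions (illustrative only)."""
    height, width = len(pixels), len(pixels[0])
    seen = set()
    segments = []
    for row in range(height):
        for col in range(width):
            if (row, col) in seen:
                continue
            # Every pixel in this region must fall on the same side of the threshold
            is_bright = pixels[row][col] > threshold
            region = {(row, col)}
            seen.add((row, col))
            queue = deque([(row, col)])
            while queue:
                r, c = queue.popleft()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and (nr, nc) not in seen
                            and (pixels[nr][nc] > threshold) == is_bright):
                        seen.add((nr, nc))
                        region.add((nr, nc))
                        queue.append((nr, nc))
            # Regions below the minimum size are not worth hashing
            if len(region) >= min_size:
                segments.append(region)
    return segments


def bounding_box(segment):
    """Return (left, upper, right, lower) for a set of (row, col) coordinates."""
    rows = [r for r, _ in segment]
    cols = [c for _, c in segment]
    return min(cols), min(rows), max(cols) + 1, max(rows) + 1


grid = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [10, 10, 10, 220],
    [10, 10, 10, 220],
]
segments = find_segments(grid, threshold=128, min_size=2)
boxes = sorted(bounding_box(s) for s in segments)
print(boxes)
```

Each bounding box is the region that a hash function such as `dhash` would then be applied to.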
def crop_resistant_hash(
    image,
    hash_func=dhash,
    limit_segments=None,
    segment_threshold=128,
    min_segment_size=500,
    segmentation_image_size=300
):
    """
    Creates crop-resistant hash using image segmentation.

    This algorithm partitions the image into bright and dark segments using a
    watershed-like algorithm, then hashes each segment. Provides resistance to
    up to 50% cropping according to the research paper.

    Args:
        image (PIL.Image.Image): Input image to hash
        hash_func (callable): Hash function to apply to segments (default: dhash)
        limit_segments (int, optional): Limit to hashing only M largest segments
        segment_threshold (int): Brightness threshold between hills and valleys (default: 128)
        min_segment_size (int): Minimum pixels for a hashable segment (default: 500)
        segmentation_image_size (int): Size for segmentation processing (default: 300)

    Returns:
        ImageMultiHash: Multi-hash object containing segment hashes
    """

Usage Example:
from PIL import Image
import imagehash

# Load images
full_image = Image.open('full_photo.jpg')
cropped_image = Image.open('cropped_photo.jpg')  # 30% cropped version

# Generate crop-resistant hashes
full_hash = imagehash.crop_resistant_hash(full_image)
crop_hash = imagehash.crop_resistant_hash(cropped_image)

# Check if images match despite cropping
matches = full_hash.matches(crop_hash, region_cutoff=1)
print(f"Images match: {matches}")

# Get detailed comparison metrics
num_matches, sum_distance = full_hash.hash_diff(crop_hash)
print(f"Matching segments: {num_matches}, Total distance: {sum_distance}")

# Calculate an overall difference score (lower means more similar)
difference_score = full_hash - crop_hash
print(f"Difference score: {difference_score}")

Custom Hash Functions:
# 'image' is a PIL.Image.Image, loaded as in the usage example

# Use different hash algorithms for segments
ahash_segments = imagehash.crop_resistant_hash(image, imagehash.average_hash)
phash_segments = imagehash.crop_resistant_hash(image, imagehash.phash)

# Custom hash function with parameters
def custom_hash(img):
    return imagehash.whash(img, mode='db4')

custom_segments = imagehash.crop_resistant_hash(image, custom_hash)

Segmentation Parameters:
# Finer segmentation (typically more segments)
fine_hash = imagehash.crop_resistant_hash(
    image,
    segment_threshold=64,  # Lower brightness cutoff between hills and valleys
    min_segment_size=200,  # Smaller minimum size
    segmentation_image_size=500  # Higher resolution processing
)

# Coarse segmentation (fewer, larger segments)
coarse_hash = imagehash.crop_resistant_hash(
    image,
    segment_threshold=200,  # Higher brightness cutoff between hills and valleys
    min_segment_size=1000,  # Larger minimum size
    limit_segments=5  # Only hash 5 largest segments
)

Segment Limiting:
# Limit to top 3 segments for faster processing and storage
limited_hash = imagehash.crop_resistant_hash(
    image,
    limit_segments=3,
    min_segment_size=1000
)

Processing Size Control:
# Balance between accuracy and speed
fast_hash = imagehash.crop_resistant_hash(
    image,
    segmentation_image_size=200  # Faster processing
)
accurate_hash = imagehash.crop_resistant_hash(
    image,
    segmentation_image_size=600  # More accurate segmentation
)

The crop-resistant hashing process involves several steps:
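Judging from the parameters documented above, the stages that follow segmentation are roughly: keep the largest segments, take each segment's bounding box, map it back onto the original image, and hash that crop. The sketch below covers the selection and mapping stages only. It assumes segmentation runs on a square copy of side `segmentation_image_size`, and `scale_box` and `select_boxes` are hypothetical names, not part of imagehash.

```python
def scale_box(box, seg_size, orig_width, orig_height):
    """Map a (left, upper, right, lower) box from the seg_size grid to the original image."""
    left, upper, right, lower = box
    scale_x = orig_width / seg_size
    scale_y = orig_height / seg_size
    return (int(left * scale_x), int(upper * scale_y),
            int(right * scale_x), int(lower * scale_y))


def select_boxes(segments, seg_size, orig_size, limit_segments=None):
    """Pick the largest segments and return their boxes in original-image coordinates."""
    ordered = sorted(segments, key=len, reverse=True)
    if limit_segments:
        ordered = ordered[:limit_segments]
    boxes = []
    for segment in ordered:
        rows = [r for r, _ in segment]
        cols = [c for _, c in segment]
        box = (min(cols), min(rows), max(cols) + 1, max(rows) + 1)
        boxes.append(scale_box(box, seg_size, *orig_size))
    return boxes


# Segments are sets of (row, col) coordinates on a 4x4 segmentation grid
segments = [
    {(0, 0), (0, 1), (1, 0), (1, 1)},  # 4 pixels
    {(2, 3), (3, 3)},                  # 2 pixels
    {(3, 0)},                          # 1 pixel
]
boxes = select_boxes(segments, seg_size=4, orig_size=(400, 200), limit_segments=2)
print(boxes)  # [(0, 0, 200, 100), (300, 100, 400, 200)]
```

Each resulting box has the `(left, upper, right, lower)` layout that Pillow's `Image.crop` expects, so the final stage would amount to calling `hash_func(image.crop(box))` for every box.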
The ImageMultiHash class provides several methods for comparing crop-resistant hashes:
- hash1 == hash2 or hash1.matches(hash2): tests whether two multi-hashes match
- hash1 - hash2: returns a difference score (lower means more similar)

The crop-resistant hashing algorithm uses several internal functions for image segmentation:
def _find_region(remaining_pixels, segmented_pixels):
    """
    Internal function to find connected regions in segmented image.

    Args:
        remaining_pixels (NDArray): Boolean array of unsegmented pixels
        segmented_pixels (set): Set of already segmented pixel coordinates

    Returns:
        set: Set of pixel coordinates forming a connected region
    """

def _find_all_segments(pixels, segment_threshold, min_segment_size):
    """
    Internal function to find all segments in an image.

    Args:
        pixels (NDArray): Grayscale pixel array
        segment_threshold (int): Brightness threshold for segmentation
        min_segment_size (int): Minimum pixels required for a segment

    Returns:
        list: List of segments, each segment is a set of pixel coordinates
    """

Crop-resistant hashing is ideal for: