tessl/pypi-imagehash

Python library for perceptual image hashing with multiple algorithms including average, perceptual, difference, wavelet, color, and crop-resistant hashing

Hash Conversion and Serialization

Functions for converting hash objects to hexadecimal strings and back, so that hashes can be stored, transmitted, and restored. Single hashes, flat color hashes, and multi-hashes are supported, along with the legacy string format produced by ImageHash 3.7 and earlier.

Capabilities

Single Hash Conversion

Convert ImageHash objects to and from hexadecimal string representations for storage and transmission.

def hex_to_hash(hexstr):
    """
    Convert a stored hash (hex string) back to ImageHash object.
    
    Assumes hashes are two-dimensional arrays with dimensions hash_size * hash_size,
    or one-dimensional arrays with dimensions binbits * 14.
    
    Args:
        hexstr (str): Hexadecimal string representation of hash
    
    Returns:
        ImageHash: Restored hash object
    
    Note:
        Does not work for hash_size < 2
    """

Usage Example:

from PIL import Image
import imagehash

# Generate a hash
original_hash = imagehash.phash(Image.open('image.jpg'))
print(f"Original: {original_hash}")

# Convert to string for storage
hash_string = str(original_hash)
print(f"String: {hash_string}")

# Restore from string
restored_hash = imagehash.hex_to_hash(hash_string)
print(f"Restored: {restored_hash}")

# Verify they're identical
assert restored_hash == original_hash
assert str(restored_hash) == hash_string
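
To make the string form concrete, the sketch below expands a hex hash string into its grid of bits using only the standard library. It assumes the bits are laid out row by row with the first bit as the most significant one; `hex_to_bit_rows` is a hypothetical helper written for illustration and is not part of imagehash.

```python
def hex_to_bit_rows(hexstr, hash_size):
    """Expand a hex hash string into hash_size rows of hash_size bits (0 or 1)."""
    total_bits = hash_size * hash_size
    # Each hex character carries 4 bits; zero-pad on the left to the full width.
    bit_string = format(int(hexstr, 16), "0{}b".format(total_bits))
    if len(bit_string) != total_bits:
        raise ValueError("hex string does not fit a {0}x{0} hash".format(hash_size))
    return [
        [int(bit) for bit in bit_string[row * hash_size:(row + 1) * hash_size]]
        for row in range(hash_size)
    ]


rows = hex_to_bit_rows("ffd7918181c9ffff", 8)
for row in rows:
    print("".join(str(bit) for bit in row))
```

Each printed line corresponds to one row of the 8x8 hash, so two hex characters of the string map to one row.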

Flat Hash Conversion

Specialized conversion for hashes produced by colorhash, which use a flat array representation instead of a square hash_size * hash_size grid.

def hex_to_flathash(hexstr, hashsize):
    """
    Convert hex string to ImageHash for flat array hashes (colorhash).
    
    Args:
        hexstr (str): Hexadecimal string representation
        hashsize (int): Hash size parameter used in original colorhash
    
    Returns:
        ImageHash: Restored hash object for colorhash
    """

Usage Example:

from PIL import Image
import imagehash

# Generate a color hash
original_hash = imagehash.colorhash(Image.open('image.jpg'), binbits=3)
hash_string = str(original_hash)

# Restore colorhash (note: need to specify binbits * 14 as hashsize)
hashsize = 3 * 14  # binbits * (2 + 12) - colorhash produces 14 segments
restored_hash = imagehash.hex_to_flathash(hash_string, hashsize)

assert restored_hash == original_hash
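
The arithmetic behind the string length can be sketched without the library. The helper below is hypothetical (`colorhash_hex_layout` is not an imagehash function) and assumes the 14-segment layout mentioned in the comment above, with the bit count rounded up to whole hex characters.

```python
import math


def colorhash_hex_layout(binbits, segments=14):
    """Return (total_bits, hex_chars, padding_bits) for a color hash string."""
    total_bits = segments * binbits
    # One hex character holds 4 bits, so round up to whole characters.
    hex_chars = math.ceil(total_bits / 4)
    padding_bits = hex_chars * 4 - total_bits
    return total_bits, hex_chars, padding_bits


for binbits in (2, 3, 4):
    print(binbits, colorhash_hex_layout(binbits))
```

A check like this can be used to reject stored strings whose length does not match the binbits value recorded alongside them.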

Multi-Hash Conversion

Convert ImageMultiHash objects (used in crop-resistant hashing) to and from string representations.

def hex_to_multihash(hexstr):
    """
    Convert a stored multihash (hex string) back to ImageMultiHash object.
    
    Based on hex_to_hash with same limitations:
    - Assumes two-dimensional arrays with hash_size * hash_size dimensions
    - Does not work for hash_size < 2
    
    Args:
        hexstr (str): Comma-separated hex string representation
    
    Returns:
        ImageMultiHash: Restored multi-hash object
    """

Usage Example:

from PIL import Image
import imagehash

# Generate crop-resistant hash
original_hash = imagehash.crop_resistant_hash(
    Image.open('image.jpg'),
    min_segment_size=500,
    segmentation_image_size=1000
)

# Convert to string
hash_string = str(original_hash)
print(f"Multi-hash: {hash_string}")  # Comma-separated hashes

# Restore from string
restored_hash = imagehash.hex_to_multihash(hash_string)

assert restored_hash == original_hash
assert str(restored_hash) == hash_string
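
Because the multi-hash string is just comma-separated hex segments, it can be inspected before it is handed to hex_to_multihash. The sketch below uses only the standard library; `split_multihash` is a hypothetical helper for validation and does not build hash objects.

```python
import string


def split_multihash(multihash_str):
    """Split a comma-separated multi-hash string and check each segment is hex."""
    segments = multihash_str.split(",")
    for segment in segments:
        if not segment or any(ch not in string.hexdigits for ch in segment):
            raise ValueError("not a hex segment: {!r}".format(segment))
    return segments


segments = split_multihash("ffd7918181c9ffff,7f7f7f7f7f7f7f7f,0f0f0f0f0f0f0f0f")
print(len(segments))                    # number of segment hashes
print([len(s) * 4 for s in segments])   # bits carried by each segment
```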

Legacy Format Support

Support for hash strings generated by ImageHash versions 3.7 and earlier.

def old_hex_to_hash(hexstr, hash_size=8):
    """
    Convert old format hash string to ImageHash object.
    
    For hashes generated by ImageHash up to version 3.7.
    For newer versions, use hex_to_hash instead.
    
    Args:
        hexstr (str): Hexadecimal string in old format
        hash_size (int): Hash size used when generating (default: 8)
    
    Returns:
        ImageHash: Restored hash object
    
    Raises:
        ValueError: If hex string size doesn't match expected count
    """

String Format Details

Single Hash Format

ImageHash objects convert to hexadecimal strings representing the binary hash array:

# Example hash string for 8x8 hash
"ffd7918181c9ffff"  # 16 hex characters = 64 bits = 8x8 hash

Multi-Hash Format

ImageMultiHash objects use comma-separated hash strings:

# Example multi-hash string
"ffd7918181c9ffff,7f7f7f7f7f7f7f7f,0f0f0f0f0f0f0f0f"  # 3 segment hashes

Color Hash Format

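Color hashes use flat arrays whose bit count (14 * binbits) is not always a multiple of 4, so the string may carry padding bits:

# Color hash with binbits=3 has 14 * 3 = 42 bits
# Represented as 11 hex characters (44 bits); the 2 padding bits are leading
# zeros, so the first hex digit is at most 3
"3ff3e1c8764"  # illustrative value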

Practical Storage Examples

Database Storage

import sqlite3
from PIL import Image
import imagehash

# Setup database
conn = sqlite3.connect('image_hashes.db')
cursor = conn.cursor()
cursor.execute('''
    CREATE TABLE IF NOT EXISTS images (
        id INTEGER PRIMARY KEY,
        filename TEXT,
        ahash TEXT,
        phash TEXT,
        dhash TEXT
    )
''')

# Store hashes
image = Image.open('photo.jpg')
cursor.execute('''
    INSERT INTO images (filename, ahash, phash, dhash) 
    VALUES (?, ?, ?, ?)
''', (
    'photo.jpg',
    str(imagehash.average_hash(image)),
    str(imagehash.phash(image)),
    str(imagehash.dhash(image))
))
conn.commit()  # persist the inserted row

# Retrieve and restore hashes
cursor.execute('SELECT ahash, phash, dhash FROM images WHERE filename = ?', ('photo.jpg',))
ahash_str, phash_str, dhash_str = cursor.fetchone()

restored_ahash = imagehash.hex_to_hash(ahash_str)
restored_phash = imagehash.hex_to_hash(phash_str)
restored_dhash = imagehash.hex_to_hash(dhash_str)

conn.close()
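
Stored hex strings can also be compared inside SQLite without restoring hash objects first. The following is a minimal sketch under stated assumptions: all stored strings come from the same algorithm and hash size (so they have equal length), `hex_hamming` is a hypothetical helper registered as a SQL function, and the table and rows are made up for the demonstration.

```python
import sqlite3


def hex_hamming(hex_a, hex_b):
    """Count differing bits between two equal-length hex hash strings."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hash strings must have the same length")
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the same name, taking 2 arguments.
conn.create_function("hex_hamming", 2, hex_hamming)
conn.execute("CREATE TABLE images (filename TEXT, phash TEXT)")
conn.executemany(
    "INSERT INTO images (filename, phash) VALUES (?, ?)",
    [
        ("a.jpg", "ffd7918181c9ffff"),
        ("b.jpg", "ffd7918181c9fffe"),  # differs from a.jpg in one bit
        ("c.jpg", "0000000000000000"),
    ],
)
conn.commit()

query_hash = "ffd7918181c9ffff"
matches = conn.execute(
    "SELECT filename, hex_hamming(phash, ?) AS distance "
    "FROM images WHERE hex_hamming(phash, ?) <= 4 "
    "ORDER BY distance, filename",
    (query_hash, query_hash),
).fetchall()
print(matches)
conn.close()
```

This scans every row, which is fine for small tables; larger collections would need an index structure suited to Hamming-distance search.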

JSON Serialization

import json
import imagehash
from PIL import Image

# Create hash data structure
image = Image.open('photo.jpg')
hash_data = {
    'filename': 'photo.jpg',
    'hashes': {
        'average': str(imagehash.average_hash(image)),
        'perceptual': str(imagehash.phash(image)),
        'difference': str(imagehash.dhash(image)),
        'crop_resistant': str(imagehash.crop_resistant_hash(image))
    }
}

# Save to JSON
with open('hashes.json', 'w') as f:
    json.dump(hash_data, f)

# Load and restore hashes
with open('hashes.json', 'r') as f:
    loaded_data = json.load(f)

restored_hashes = {
    'average': imagehash.hex_to_hash(loaded_data['hashes']['average']),
    'perceptual': imagehash.hex_to_hash(loaded_data['hashes']['perceptual']),
    'difference': imagehash.hex_to_hash(loaded_data['hashes']['difference']),
    'crop_resistant': imagehash.hex_to_multihash(loaded_data['hashes']['crop_resistant'])
}

Internal Functions

Binary Array to Hex Conversion

Internal utility function used by ImageHash classes for string conversion.

def _binary_array_to_hex(arr):
    """
    Internal function to convert binary array to hexadecimal string.
    
    Args:
        arr (NDArray): Boolean numpy array to convert
    
    Returns:
        str: Hexadecimal string representation
    
    Note:
        This is an internal function not intended for direct use.
        Use str(hash_object) instead for hash-to-string conversion.
    """
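
For intuition only, the conversion can be approximated in pure Python. This sketch is not the library's implementation: `bits_to_hex` is a hypothetical stand-in that takes a flat sequence of booleans, and it assumes the first bit is the most significant and that the result is left-padded with zeros to a whole number of hex characters.

```python
import math


def bits_to_hex(bits):
    """Encode a non-empty flat sequence of truthy/falsy bits as a hex string."""
    bit_string = "".join("1" if bit else "0" for bit in bits)
    # Round the bit count up to whole hex characters (4 bits each).
    width = math.ceil(len(bit_string) / 4)
    return format(int(bit_string, 2), "0{}x".format(width))


print(bits_to_hex([1, 1, 1, 1, 0, 0, 0, 0]))
print(len(bits_to_hex([False] * 42)))
```

Under these assumptions a 64-bit hash yields 16 hex characters and a 42-bit color hash yields 11, matching the string lengths described in this document.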

Version Compatibility

  • Current Format (4.0+): Use hex_to_hash(), hex_to_multihash(), hex_to_flathash()
  • Legacy Format (≤3.7): Use old_hex_to_hash() for compatibility
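  • Breaking Change: Version 4.0 changed the binary-to-hex implementation to fix bugs in earlier versions, so strings written by 3.7 and earlier must be restored with old_hex_to_hash() rather than hex_to_hash()

Record which ImageHash version generated each stored hash so that the matching restore function can be chosen later.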
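
One way to keep that information is to store the version next to the hash string and pick the restore function from it. The sketch below is an assumption about how such a record could look: the field names, `make_record`, and `decoder_name` are all hypothetical, and the function only returns the name of the imagehash function to call.

```python
import json


def make_record(hash_string, library_version, kind="single"):
    """Bundle a hash string with the ImageHash version that produced it."""
    return {"hash": hash_string, "imagehash_version": library_version, "kind": kind}


def decoder_name(record):
    """Return the name of the restore function that matches the record."""
    major = int(record["imagehash_version"].split(".")[0])
    if major < 4:
        # Strings written by 3.7 and earlier use the legacy encoding.
        return "old_hex_to_hash"
    return {
        "single": "hex_to_hash",
        "multi": "hex_to_multihash",
        "flat": "hex_to_flathash",  # also needs the hashsize argument
    }[record["kind"]]


record = make_record("ffd7918181c9ffff", "4.3.1")
stored = json.dumps(record)
print(decoder_name(json.loads(stored)))
```

With imagehash installed, the returned name could be resolved with getattr(imagehash, name) and called on record["hash"].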
