tessl/pypi-mmh3

Python extension for MurmurHash (MurmurHash3), a set of fast and robust hash functions.

Workspace: tessl
Visibility: Public
Describes: pkg:pypi/mmh3@4.1.x

To install, run

npx @tessl/cli install tessl/pypi-mmh3@4.1.0

mmh3

mmh3 is a Python extension providing MurmurHash3, a family of fast, robust, non-cryptographic hash functions. It offers 32-bit and 128-bit hashes (the 128-bit result is also available as a pair of 64-bit integers or as bytes), with signed or unsigned output and variants optimized for x86 and x64 platforms.

Package Information

  • Package Name: mmh3
  • Language: Python
  • Installation: pip install mmh3

Core Imports

import mmh3

All functions and classes are available directly from the mmh3 module.

Basic Usage

import mmh3

# Basic 32-bit hashing
hash_value = mmh3.hash("foo")  # -156908512
hash_with_seed = mmh3.hash("foo", seed=42)  # -1322301282
unsigned_hash = mmh3.hash("foo", signed=False)  # 4138058784

# 128-bit hash returned as a tuple of two signed 64-bit integers
hash64_result = mmh3.hash64("foo")  # (-2129773440516405919, 9128664383759220103)

# 128-bit hash returned as a single integer (unsigned by default)
hash128_result = mmh3.hash128("foo", seed=42)  # 215966891540331383248189432718888555506

# Hash as bytes
hash_bytes = mmh3.hash_bytes("foo")  # b'aE\xf5\x01W\x86q\xe2\x87}\xba+\xe4\x87\xaf~'

# Streaming hasher for large data
hasher = mmh3.mmh3_32(seed=42)
hasher.update(b"foo")
hasher.update(b"bar")
digest = hasher.digest()  # bytes
sint_digest = hasher.sintdigest()  # signed int
uint_digest = hasher.uintdigest()  # unsigned int

Architecture

mmh3 provides two complementary interfaces:

  • Simple Functions: Direct hash computation for immediate results
  • Hasher Classes: Streaming interface for incremental hashing of large datasets

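The 128-bit functions come in two variants, selected with the x64arch parameter. The variants produce different hash values for the same input, so use the same one wherever hashes are compared:

  • x64 variant: Optimized for 64-bit architectures (default)
  • x86 variant: Optimized for 32-bit architectures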

All hash functions accept a configurable seed, and those that return integers offer signed or unsigned output.
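
Signed and unsigned outputs are two readings of the same 32 bits. The following is a minimal sketch of that conversion using only the standard library. The helper names are hypothetical and not part of mmh3, and the sample values are the ones shown for mmh3.hash("foo") under Basic Usage.

```python
def to_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as unsigned (two's complement)."""
    return value & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit integer as signed (two's complement)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


signed_foo = -156908512    # documented result of mmh3.hash("foo")
unsigned_foo = 4138058784  # documented result of mmh3.hash("foo", signed=False)

print(to_unsigned32(signed_foo) == unsigned_foo)
print(to_signed32(unsigned_foo) == signed_foo)
```

This kind of conversion can help when comparing values against a system that stores the other representation.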

Capabilities

Simple Hash Functions

Direct hash computation functions for immediate results with various output formats and architecture optimizations.

def hash(key: StrHashable, seed: int = 0, signed: bool = True) -> int: ...
def hash_from_buffer(key: StrHashable, seed: int = 0, signed: bool = True) -> int: ...
def hash64(key: StrHashable, seed: int = 0, x64arch: bool = True, signed: bool = True) -> tuple[int, int]: ...
def hash128(key: StrHashable, seed: int = 0, x64arch: bool = True, signed: bool = False) -> int: ...
def hash_bytes(key: StrHashable, seed: int = 0, x64arch: bool = True) -> bytes: ...
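
The multi-value outputs appear to be different views of one 128-bit digest. The sketch below uses only the standard library and the sample values from Basic Usage to check that the hash64 tuple for "foo" matches the two little-endian signed 64-bit halves of the hash_bytes result. That this holds for every input is an assumption drawn from these samples, and the helper name is hypothetical.

```python
import struct

# Sample outputs for "foo" with the default seed, copied from Basic Usage.
FOO_BYTES = b"aE\xf5\x01W\x86q\xe2\x87}\xba+\xe4\x87\xaf~"
FOO_HASH64 = (-2129773440516405919, 9128664383759220103)


def split_digest(digest: bytes) -> tuple[int, int]:
    """Split a 16-byte digest into two little-endian signed 64-bit integers."""
    if len(digest) != 16:
        raise ValueError("expected a 16-byte digest")
    low, high = struct.unpack("<qq", digest)
    return low, high


print(split_digest(FOO_BYTES) == FOO_HASH64)

# The same 16 bytes read as one unsigned little-endian integer.
as_uint128 = int.from_bytes(FOO_BYTES, "little")
print(as_uint128.bit_length() <= 128)
```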

Streaming Hashers

hashlib-compatible hasher classes for incremental hashing of large datasets and streaming operations.

class Hasher:
    def __init__(self, seed: int = 0) -> None: ...
    def update(self, input: Hashable) -> None: ...
    def digest(self) -> bytes: ...
    def sintdigest(self) -> int: ...
    def uintdigest(self) -> int: ...
    def copy(self) -> Hasher: ...
    @property
    def digest_size(self) -> int: ...
    @property
    def block_size(self) -> int: ...
    @property
    def name(self) -> str: ...

class mmh3_32(Hasher): ...

class mmh3_x64_128(Hasher):
    def stupledigest(self) -> tuple[int, int]: ...
    def utupledigest(self) -> tuple[int, int]: ...

class mmh3_x86_128(Hasher):
    def stupledigest(self) -> tuple[int, int]: ...
    def utupledigest(self) -> tuple[int, int]: ...
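
Because the hashers follow the hashlib interface, a chunk-feeding helper can be written once against update() and digest(). The following is a minimal sketch. The helper name and chunk size are hypothetical, and hashlib.sha256 stands in for an mmh3 hasher so the block runs with the standard library alone. Passing mmh3.mmh3_32(seed=42) instead should work the same way, assuming mmh3 is installed.

```python
import hashlib
import io
from typing import BinaryIO


def hash_stream(hasher, stream: BinaryIO, chunk_size: int = 64 * 1024) -> bytes:
    """Feed a binary stream to a hashlib-style hasher in fixed-size chunks."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hasher.update(chunk)  # hashers take bytes-like input, not str
    return hasher.digest()


data = b"foo" * 100_000
streamed = hash_stream(hashlib.sha256(), io.BytesIO(data), chunk_size=4096)
one_shot = hashlib.sha256(data).digest()
print(streamed == one_shot)
```

Reading in chunks keeps memory use bounded by chunk_size regardless of input size.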

Types

from typing import Protocol, Union

class IntArrayLike(Protocol):
    def __getitem__(self, index) -> int: ...

Hashable = Union[bytes, bytearray, memoryview, IntArrayLike]
StrHashable = Union[str, Hashable]
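
The aliases above mean the simple functions take text or bytes-like input, while the hashers take bytes-like input only. The following is a minimal sketch of normalizing keys before hashing. The helper name is hypothetical, and encoding str as UTF-8 is an assumption about how text is treated rather than something stated here.

```python
from typing import Union

BytesLike = Union[bytes, bytearray, memoryview]


def to_bytes(key: Union[str, BytesLike]) -> bytes:
    """Return key as immutable bytes, encoding text as UTF-8 (assumed)."""
    if isinstance(key, str):
        return key.encode("utf-8")
    return bytes(key)


print(to_bytes("foo"))
print(to_bytes(bytearray(b"foo")))
print(to_bytes(memoryview(b"foo")))
```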

Common Use Cases

  • Data Mining & Machine Learning: Feature hashing and dimensionality reduction
  • Bloom Filters: Fast set membership testing with probabilistic data structures
  • MinHash Algorithms: Document similarity and near-duplicate detection
  • Natural Language Processing: Text fingerprinting and similarity matching
  • IoT Security Research: Shodan favicon hash calculations
  • Distributed Systems: Consistent hashing and data partitioning
  • Caching: Cache key generation with a low collision rate (MurmurHash3 is non-cryptographic, so it does not protect against deliberately crafted collisions)
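
As an illustration of the Bloom filter use case, here is a minimal sketch. The class and its parameters are hypothetical. It takes any seeded hash function of the form hash_fn(data, seed) -> int. zlib.crc32 is used as a stand-in so the block runs with the standard library alone; a function such as lambda data, seed: mmh3.hash(data, seed, signed=False) could be passed instead, assuming mmh3 is installed.

```python
import zlib
from typing import Callable, Iterator


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int, num_hashes: int,
                 hash_fn: Callable[[bytes, int], int] = zlib.crc32) -> None:
        # zlib.crc32 is only a stand-in; it mixes seeds less well than MurmurHash3.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.hash_fn = hash_fn
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes) -> Iterator[int]:
        # One bit position per seed; each seed acts as a separate hash function.
        for seed in range(self.num_hashes):
            yield self.hash_fn(item, seed) % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


bloom = BloomFilter(num_bits=1024, num_hashes=3)
for word in (b"foo", b"bar"):
    bloom.add(word)

print(b"foo" in bloom)  # True: added items are always reported present
```

Varying the seed is what lets a single hash function supply the several independent bit positions a Bloom filter needs.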