Static memory-efficient and fast Trie-like structures for Python based on MARISA-trie C++ library.
npx @tessl/cli install tessl/pypi-marisa-trie@1.3.0Static memory-efficient and fast Trie-like structures for Python based on MARISA-trie C++ library. String data in a MARISA-trie may take up to 50x-100x less memory than in a standard Python dict while maintaining comparable lookup speed. The library provides fast advanced methods like prefix search, making it ideal for applications requiring efficient string storage, autocomplete functionality, and text processing.
pip install marisa-trieimport marisa_trieCommon imports for specific trie types:
from marisa_trie import Trie, BinaryTrie, BytesTrie, RecordTrieimport marisa_trie
# Create a Trie with unicode strings -> auto-generated IDs
trie = marisa_trie.Trie(['apple', 'application', 'apply', 'banana'])
# Check if keys exist
print('apple' in trie) # True
print('app' in trie) # False
# Get unique ID for a key
apple_id = trie.key_id('apple')
print(f"ID for 'apple': {apple_id}")
# Restore key from ID
restored_key = trie.restore_key(apple_id)
print(f"Key for ID {apple_id}: {restored_key}")
# Find all keys with prefix
prefix_keys = trie.keys(prefix='app')
print(f"Keys starting with 'app': {prefix_keys}")
# Find all prefixes of a key
prefixes = trie.prefixes('application')
print(f"Prefixes of 'application': {prefixes}")
# Save and load trie
trie.save('my_trie.dat')
new_trie = marisa_trie.Trie()
new_trie.load('my_trie.dat')MARISA-trie implements static, memory-efficient Trie data structures optimized for read-heavy workloads:
The library provides four specialized trie classes for different use cases:
Primary trie implementations for mapping keys to auto-generated unique IDs, supporting both unicode strings and binary data with comprehensive lookup and prefix search capabilities.
class Trie:
def __init__(self, arg=None, num_tries=DEFAULT_NUM_TRIES, binary=False,
cache_size=DEFAULT_CACHE, order=DEFAULT_ORDER, weights=None): ...
def key_id(self, key: str) -> int: ...
def restore_key(self, index: int) -> str: ...
def keys(self, prefix=None) -> list: ...
def prefixes(self, key: str) -> list: ...
class BinaryTrie:
def __init__(self, arg=None, **options): ...
def key_id(self, key: bytes) -> int: ...
def restore_key(self, index: int) -> bytes: ...
def prefixes(self, key: bytes) -> list: ...Advanced trie implementations that map keys to lists of custom data payloads, supporting bytes objects and structured data with automatic serialization.
class BytesTrie:
def __init__(self, arg=None, value_separator=b'\xff', **options): ...
def get(self, key, default=None) -> list: ...
def get_value(self, key: str) -> list: ...
def items(self, prefix="") -> list: ...
class RecordTrie:
def __init__(self, fmt: str, arg=None, **options): ...
def get(self, key, default=None) -> list: ...
def items(self, prefix="") -> list: ...Configuration constants for optimizing trie performance, memory usage, and behavior including cache sizes, node ordering, and storage options.
# Cache size constants
DEFAULT_CACHE: int
HUGE_CACHE: int
LARGE_CACHE: int
NORMAL_CACHE: int
SMALL_CACHE: int
TINY_CACHE: int
# Node ordering constants
LABEL_ORDER: int
WEIGHT_ORDER: int
DEFAULT_ORDER: int