tessl/pypi-plover

Open Source Stenography Software providing real-time stenographic typing, machine support, and plugin architecture.

Dictionary System

Plover's dictionary system maps stenographic stroke sequences to translations. It supports multiple file formats, ordered precedence across dictionaries, lookup filters, and real-time updates. Dictionaries can be used individually or combined in a collection, where list order decides which translation wins when several dictionaries define the same strokes.

Capabilities

Individual Dictionary Management

Base dictionary class providing core translation storage and lookup capabilities with support for various file formats.

class StenoDictionary:
    """Base class for stenographic dictionaries."""
    
    readonly: bool = False
    """Class attribute indicating if dictionary format is read-only."""
    
    def __init__(self):
        """
        Initialize empty dictionary.
        
        Creates dictionary instance ready for loading or manual population.
        """
    
    @classmethod
    def create(cls, resource: str) -> 'StenoDictionary':
        """
        Create new dictionary at specified location.
        
        Args:
            resource: Path or resource identifier for new dictionary
            
        Returns:
            New StenoDictionary instance
            
        Creates empty dictionary file and returns dictionary instance.
        """
    
    @classmethod
    def load(cls, resource: str) -> 'StenoDictionary':
        """
        Load existing dictionary from file.
        
        Args:
            resource: Path or resource identifier for dictionary
            
        Returns:
            Loaded StenoDictionary instance
            
        Raises:
            FileNotFoundError: If dictionary file doesn't exist
            ValueError: If dictionary format is invalid
        """
    
    def save(self) -> None:
        """
        Save dictionary to file.
        
        Writes current dictionary contents to associated file,
        creating backup if necessary.
        
        Raises:
            PermissionError: If dictionary is readonly
            IOError: If file cannot be written
        """
    
    @property
    def longest_key(self) -> int:
        """
        Get length of longest stroke sequence.
        
        Returns:
            Maximum number of strokes in any dictionary entry
            
        Used for optimizing translation processing.
        """
    
    def clear(self) -> None:
        """
        Remove all entries from dictionary.
        
        Clears dictionary contents but retains file association.
        """
    
    def items(self) -> list:
        """
        Get all dictionary entries.
        
        Returns:
            List of (strokes_tuple, translation) pairs
            
        Provides access to complete dictionary contents.
        """
    
    def update(self, *args, **kwargs) -> None:
        """
        Update dictionary with new entries.
        
        Args:
            *args: Dictionary or iterable of (key, value) pairs
            **kwargs: Keyword arguments as key-value pairs
            
        Adds or updates multiple entries efficiently.
        """
    
    def get(self, key: tuple, fallback=None):
        """
        Get translation with fallback value.
        
        Args:
            key: Tuple of stroke strings to look up
            fallback: Value to return if key not found
            
        Returns:
            Translation string or fallback value
        """
    
    def reverse_lookup(self, value: str) -> list:
        """
        Find stroke sequences for translation.
        
        Args:
            value: Translation text to search for
            
        Returns:
            List of stroke tuples that produce the translation
            
        Searches all entries to find matching translations.
        """
    
    def casereverse_lookup(self, value: str) -> list:
        """
        Case-insensitive reverse lookup.
        
        Args:
            value: Translation text to search for (case-insensitive)
            
        Returns:
            List of stroke tuples for case-insensitive matches
        """

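The reverse-lookup methods above can be served from an inverted index instead of a scan over every entry. The following is a minimal plain-Python sketch of that idea, not Plover's implementation; `MiniStenoDict` and its private attributes are hypothetical names, and the return shapes follow the descriptions in this document.

```python
from collections import defaultdict


class MiniStenoDict:
    """Toy stand-in for a steno dictionary (hypothetical, not Plover's class)."""

    def __init__(self):
        self._entries = {}                    # stroke tuple -> translation
        self._reverse = defaultdict(set)      # translation -> stroke tuples
        self._casereverse = defaultdict(set)  # lowercased translation -> spellings

    def __setitem__(self, key, value):
        if key in self._entries:
            del self[key]                     # drop stale index entries first
        self._entries[key] = value
        self._reverse[value].add(key)
        self._casereverse[value.lower()].add(value)

    def __delitem__(self, key):
        value = self._entries.pop(key)        # raises KeyError when missing
        self._reverse[value].discard(key)
        if not self._reverse[value]:
            del self._reverse[value]
            self._casereverse[value.lower()].discard(value)

    @property
    def longest_key(self):
        # Recomputed on demand here; a real implementation would likely cache it.
        return max(map(len, self._entries), default=0)

    def reverse_lookup(self, value):
        return sorted(self._reverse.get(value, ()))

    def casereverse_lookup(self, value):
        # Collect strokes for every stored spelling that matches ignoring case.
        keys = set()
        for spelling in self._casereverse.get(value.lower(), ()):
            keys |= self._reverse[spelling]
        return sorted(keys)


d = MiniStenoDict()
d[("HEL", "HRO")] = "hello"
d[("H*EL",)] = "Hello"
d[("TEFT",)] = "test"

print(d.reverse_lookup("hello"))       # only the exact spelling
print(d.casereverse_lookup("HELLO"))   # both spellings
print(d.longest_key)
```

Keeping the index in step with every write costs a little on insertion and deletion, but makes reverse lookups proportional to the number of matches rather than the size of the dictionary.
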
Dictionary Interface Methods

Standard dictionary-like interface for direct access to translations.

def __len__(self) -> int:
    """
    Get number of entries in dictionary.
    
    Returns:
        Count of translation entries
    """

def __iter__(self):
    """
    Iterate over stroke sequences.
    
    Yields:
        Stroke tuples for all entries
    """

def __getitem__(self, key: tuple) -> str:
    """
    Get translation for stroke sequence.
    
    Args:
        key: Tuple of stroke strings
        
    Returns:
        Translation string
        
    Raises:
        KeyError: If stroke sequence not found
    """

def __setitem__(self, key: tuple, value: str) -> None:
    """
    Set translation for stroke sequence.
    
    Args:
        key: Tuple of stroke strings
        value: Translation text
        
    Adds or updates dictionary entry.
    """

def __delitem__(self, key: tuple) -> None:
    """
    Delete translation entry.
    
    Args:
        key: Tuple of stroke strings to remove
        
    Raises:
        KeyError: If stroke sequence not found
    """

def __contains__(self, key: tuple) -> bool:
    """
    Check if stroke sequence exists.
    
    Args:
        key: Tuple of stroke strings to check
        
    Returns:
        True if stroke sequence has translation
    """

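As an illustration of how this interface fits together, Python's `collections.abc.MutableMapping` derives `get`, `update`, `items`, `clear`, and `__contains__` from the five core methods. The sketch below uses a hypothetical `TupleKeyDict` class and makes no claim about how Plover structures its own base class.

```python
from collections.abc import MutableMapping


class TupleKeyDict(MutableMapping):
    """Hypothetical dict-like container keyed by stroke tuples."""

    def __init__(self):
        self._entries = {}

    def __getitem__(self, key):
        return self._entries[key]           # raises KeyError when missing

    def __setitem__(self, key, value):
        if not isinstance(key, tuple):
            raise TypeError("key must be a tuple of stroke strings")
        self._entries[key] = value

    def __delitem__(self, key):
        del self._entries[key]              # raises KeyError when missing

    def __iter__(self):
        return iter(self._entries)

    def __len__(self):
        return len(self._entries)


entries = TupleKeyDict()
entries[("TEFT",)] = "test"
entries.update({("HEL", "HRO"): "hello"})   # update() is provided by the mixin

print(len(entries))
print(("TEFT",) in entries)
print(entries.get(("TPHO",), "missing"))
```

Note that a one-stroke key still needs the trailing comma: `("TEFT",)` is a tuple, while `("TEFT")` is just a string.
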
Dictionary Collection Management

Collection class managing multiple dictionaries with precedence rules and filtering capabilities.

class StenoDictionaryCollection:
    """Collection of dictionaries with precedence and filtering."""
    
    def __init__(self, dicts: list = []):
        """
        Initialize dictionary collection.
        
        Args:
            dicts: List of StenoDictionary instances in precedence order
            
        Higher precedence dictionaries appear earlier in list.
        """
    
    @property
    def longest_key(self) -> int:
        """
        Get longest key across all dictionaries.
        
        Returns:
            Maximum stroke sequence length across collection
        """
    
    def set_dicts(self, dicts: list) -> None:
        """
        Set dictionary list with precedence order.
        
        Args:
            dicts: List of StenoDictionary instances
            
        Replaces current dictionary collection.
        """
    
    def lookup(self, key: tuple) -> str:
        """
        Look up translation with precedence and filters.
        
        Args:
            key: Tuple of stroke strings
            
        Returns:
            Translation from highest precedence dictionary, or None if no dictionary matches
            
        Searches dictionaries in order, applying filters.
        """
    
    def raw_lookup(self, key: tuple) -> str:
        """
        Look up translation without filters.
        
        Args:
            key: Tuple of stroke strings
            
        Returns:
            Raw translation from highest precedence dictionary, or None if no dictionary matches
            
        Bypasses all dictionary filters.
        """
    
    def lookup_from_all(self, key: tuple) -> list:
        """
        Look up from all dictionaries.
        
        Args:
            key: Tuple of stroke strings
            
        Returns:
            List of (dictionary_path, translation) tuples
            
        Returns matches from all dictionaries regardless of precedence.
        """
    
    def raw_lookup_from_all(self, key: tuple) -> list:
        """
        Raw lookup from all dictionaries.
        
        Args:
            key: Tuple of stroke strings
            
        Returns:
            List of (dictionary_path, translation) tuples
            
        Bypasses filters and returns all matches.
        """
    
    def reverse_lookup(self, value: str) -> list:
        """
        Reverse lookup across all dictionaries.
        
        Args:
            value: Translation text to find strokes for
            
        Returns:
            List of stroke tuples from all dictionaries
        """
    
    def casereverse_lookup(self, value: str) -> list:
        """
        Case-insensitive reverse lookup across all dictionaries.
        
        Args:
            value: Translation text (case-insensitive)
            
        Returns:
            List of stroke tuples for case-insensitive matches
        """
    
    def first_writable(self) -> StenoDictionary:
        """
        Get first writable dictionary in collection.
        
        Returns:
            First dictionary that is not readonly
            
        Raises:
            ValueError: If no writable dictionaries available
            
        Used for adding new translations.
        """
    
    def set(self, key: tuple, value: str, path: str = None) -> None:
        """
        Set translation in specified or first writable dictionary.
        
        Args:
            key: Tuple of stroke strings
            value: Translation text
            path: Specific dictionary path, uses first writable if None
            
        Adds translation to specified dictionary or first writable.
        """
    
    def save(self, path_list: list = None) -> None:
        """
        Save dictionaries to files.
        
        Args:
            path_list: List of paths to save, saves all if None
            
        Saves specified dictionaries or all writable dictionaries.
        """
    
    def get(self, path: str) -> StenoDictionary:
        """
        Get dictionary by file path.
        
        Args:
            path: File path of dictionary to retrieve
            
        Returns:
            StenoDictionary instance for specified path
            
        Raises:
            KeyError: If dictionary not found
        """
    
    def __getitem__(self, path: str) -> StenoDictionary:
        """
        Get dictionary by path using subscript notation.
        
        Args:
            path: File path of dictionary
            
        Returns:
            StenoDictionary instance
        """
    
    def __iter__(self):
        """
        Iterate over all dictionaries.
        
        Yields:
            StenoDictionary instances in precedence order
        """

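The collection behavior described above (ordered lookup, lookup across all dictionaries, and writes routed to the first writable dictionary) can be modeled in a few lines. This is a sketch under the assumptions stated in this document, not Plover's code; `FakeDict` and `MiniCollection` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDict:
    """Hypothetical stand-in for a loaded dictionary."""
    path: str
    readonly: bool = False
    entries: dict = field(default_factory=dict)


class MiniCollection:
    def __init__(self, dicts=()):
        self.dicts = list(dicts)            # earlier entries take precedence

    def lookup(self, key):
        for d in self.dicts:
            if key in d.entries:
                return d.entries[key]       # first match wins
        return None

    def lookup_from_all(self, key):
        return [(d.path, d.entries[key]) for d in self.dicts if key in d.entries]

    def first_writable(self):
        for d in self.dicts:
            if not d.readonly:
                return d
        raise ValueError("no writable dictionary")

    def get(self, path):
        for d in self.dicts:
            if d.path == path:
                return d
        raise KeyError(path)

    def set(self, key, value, path=None):
        target = self.first_writable() if path is None else self.get(path)
        target.entries[key] = value


main = FakeDict("main.json", readonly=True, entries={("TEFT",): "test"})
user = FakeDict("user.json", entries={("TEFT",): "examination"})
collection = MiniCollection([main, user])

collection.set(("KUS", "TOPL"), "custom")   # main is readonly, so this lands in user
print(collection.lookup(("TEFT",)))
print(collection.lookup_from_all(("TEFT",)))
```
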
Dictionary Filtering

Filter system for modifying dictionary lookup behavior with custom logic.

def add_filter(self, f) -> None:
    """
    Add dictionary filter function.
    
    Args:
        f: Filter function called as f(strokes, translation) and returning a bool
        
    In upstream Plover, an entry is skipped during lookup when any filter
    returns a truthy value. Filters reject entries; they do not rewrite them.
    """

def remove_filter(self, f) -> None:
    """
    Remove dictionary filter function.
    
    Args:
        f: Filter function to remove
        
    Removes previously added filter from the processing chain.
    """

Dictionary Loading Functions

Utility functions for creating and loading dictionaries with format detection.

def create_dictionary(resource: str, threaded_save: bool = True) -> StenoDictionary:
    """
    Create new dictionary with format detection.
    
    Args:
        resource: Path or resource identifier for dictionary
        threaded_save: Whether to use threaded saving for performance
        
    Returns:
        New StenoDictionary instance of appropriate format
        
    Detects format from file extension and creates appropriate dictionary type.
    """

def load_dictionary(resource: str, threaded_save: bool = True) -> StenoDictionary:
    """
    Load dictionary with automatic format detection.
    
    Args:
        resource: Path or resource identifier for dictionary
        threaded_save: Whether to use threaded saving for performance
        
    Returns:
        Loaded StenoDictionary instance of detected format
        
    Automatically determines format and creates appropriate dictionary instance.
    """

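Format detection from the file extension can be sketched as a lookup table. Plover itself resolves dictionary formats through its plugin registry, so the `FORMATS` table and `detect_format` helper below are hypothetical stand-ins, and the string values are placeholders for the classes a real registry would return.

```python
import os

# Hypothetical table: extension (without the dot) -> format name.
FORMATS = {"json": "JsonDictionary", "rtf": "RtfDictionary"}


def detect_format(resource):
    """Map a dictionary path to a format name using its file extension."""
    extension = os.path.splitext(resource)[1][1:].lower()
    try:
        return FORMATS[extension]
    except KeyError:
        raise ValueError(f"unsupported dictionary format: {resource!r}") from None


print(detect_format("/path/to/main.json"))
print(detect_format("legacy.RTF"))
```
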
Supported Dictionary Formats

JSON Dictionary Format

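A JSON object whose keys are stroke sequences written as slash-separated strings and whose values are translations. In memory, each key becomes a tuple with one string per stroke.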

File Extension: .json
Format: {"STROKE/SEQUENCE": "translation"}
Characteristics: Human-readable, easily editable, full Unicode support
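
The mapping between the on-disk keys and the in-memory stroke tuples can be sketched with the standard `json` module. The sample entries are illustrative only, and this sketch skips any normalization a real loader might apply to the strokes.

```python
import json

raw = '{"TEFT": "test", "HEL/HRO": "hello"}'

# Reading: split each "STROKE/SEQUENCE" key into a tuple of strokes.
entries = {tuple(key.split("/")): value for key, value in json.loads(raw).items()}

# Writing: join the tuples back into slash-separated keys.
serialized = json.dumps(
    {"/".join(strokes): value for strokes, value in entries.items()},
    ensure_ascii=False,   # keep non-ASCII translations readable in the file
    sort_keys=True,
)

print(entries)
print(serialized)
```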

RTF/CRE Dictionary Format

Rich Text Format adapted for stenographic dictionaries.

File Extension: .rtf
Format: RTF document with embedded stenographic data
Characteristics: Compatible with commercial stenography software
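
In RTF/CRE files, each entry stores its strokes in a `{\*\cxs ...}` group followed by the translation text. The sketch below pulls plain-text entries out of a small hand-written sample with a regular expression. It is a deliberately simplified illustration that ignores RTF escapes, formatting control words, and nested groups, so it should not be read as a complete parser.

```python
import re

sample = r"""{\rtf1\ansi{\*\cxrev100}\cxdict
{\*\cxs TEFT}test
{\*\cxs HEL/HRO}hello
}"""

# Group 1: the strokes inside {\*\cxs ...}; group 2: plain text up to the
# next brace, backslash, or line break.
ENTRY = re.compile(r"\{\\\*\\cxs ([^}]+)\}([^{}\\\r\n]*)")

entries = {
    tuple(strokes.split("/")): text
    for strokes, text in ENTRY.findall(sample)
}

print(entries)
```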

Usage Examples

from plover.steno_dictionary import StenoDictionary, StenoDictionaryCollection
from plover.dictionary.base import create_dictionary, load_dictionary

# Create new dictionary
# Keys are tuples with one string per stroke: the JSON key "HEL/HRO" is ('HEL', 'HRO')
new_dict = create_dictionary('/path/to/new_dict.json')
new_dict[('HEL', 'HRO')] = 'hello'
new_dict.save()

# Load existing dictionary
existing_dict = load_dictionary('/path/to/existing_dict.json')
translation = existing_dict[('WORLD',)]  # Raises KeyError if the entry is missing

# Work with dictionary collection
dict1 = load_dictionary('/path/to/main.json')
dict2 = load_dictionary('/path/to/user.json') 
collection = StenoDictionaryCollection([dict1, dict2])

# Look up translations
translation = collection.lookup(('TEFT',))
all_matches = collection.lookup_from_all(('TEFT',))

# Reverse lookup
strokes = collection.reverse_lookup('hello')
# Possible result, depending on the loaded dictionaries: ('HEL', 'HRO') and ('H*EL',)

# Add new translation
collection.set(('KUS', 'TOPL'), 'custom')

# Add dictionary filter
# Assumption: as in upstream Plover, a truthy return value rejects the entry
def reject_short_translations(strokes, translation):
    return len(translation) <= 2

collection.add_filter(reject_short_translations)

# Dictionary operations
print(f"Dictionary has {len(dict1)} entries")
print(f"Longest stroke sequence: {dict1.longest_key}")

# Iterate over entries
for strokes, translation in dict1.items():
    print(f"{'/'.join(strokes)} -> {translation}")

# Check for entries
if ('TEFT',) in dict1:
    print("Test entry exists")

# Update multiple entries
dict1.update({
    ('WUPB',): 'one',
    ('TWO',): 'two',
    ('THRAOE',): 'three'
})

# Save changes
dict1.save()
collection.save()  # Saves all writable dictionaries

Dictionary Precedence

In dictionary collections, precedence determines which translation is returned when multiple dictionaries contain the same stroke sequence:

  1. First Match Wins: The first dictionary in the collection list that contains a translation wins
  2. User Dictionaries First: Typically user dictionaries are placed before system dictionaries
  3. Specific Before General: More specific dictionaries should precede general ones

# Precedence example
# Keys are stroke tuples: the single stroke TEFT is written ('TEFT',)
main_dict = load_dictionary('main.json')      # Contains: TEFT -> "test"
user_dict = load_dictionary('user.json')      # Contains: TEFT -> "examination"

# User dictionary first = user translation wins
collection = StenoDictionaryCollection([user_dict, main_dict])
result = collection.lookup(('TEFT',))  # Returns "examination"

# Main dictionary first = main translation wins
collection = StenoDictionaryCollection([main_dict, user_dict])
result = collection.lookup(('TEFT',))  # Returns "test"
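
The first-match-wins rule has a close analogue in the standard library: `collections.ChainMap` searches its mappings in order and returns the first hit. The sketch below uses plain dicts as stand-ins for loaded dictionaries. It illustrates the ordering rule only and is not how Plover implements its collection.

```python
from collections import ChainMap

main_entries = {("TEFT",): "test", ("KAT",): "cat"}
user_entries = {("TEFT",): "examination"}

user_first = ChainMap(user_entries, main_entries)
main_first = ChainMap(main_entries, user_entries)

print(user_first[("TEFT",)])   # user entry shadows the main entry
print(main_first[("TEFT",)])   # main entry shadows the user entry
print(user_first[("KAT",)])    # only defined in main, so order does not matter
```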

Dictionary Filtering

Filters change lookup results without changing dictionary files. Each filter is called with the stroke tuple and the candidate translation. In upstream Plover, a truthy return value rejects the entry and lookup continues with the next dictionary in precedence order; a filter cannot rewrite a translation.

def command_filter(strokes, translation):
    """Reject entries that contain Plover commands."""
    return '{PLOVER:' in translation

def length_filter(strokes, translation):
    """Reject translations of 3 characters or fewer."""
    return len(translation) <= 3

def stroke_count_filter(strokes, translation):
    """Reject multi-stroke entries."""
    return len(strokes) != 1

collection.add_filter(command_filter)
collection.add_filter(length_filter)
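
How a filter chain interacts with precedence can be sketched with plain dicts. The sketch assumes the convention used by upstream Plover's collection, where a truthy filter result rejects the entry; treat that as an assumption if your version differs. The `lookup` helper and `reject_commands` filter are hypothetical names.

```python
def lookup(dicts, key, filters=()):
    """Return the first translation that no filter rejects, else None."""
    for entries in dicts:                   # dicts are in precedence order
        value = entries.get(key)
        if value is None:
            continue
        if not any(f(key, value) for f in filters):
            return value
    return None


def reject_commands(strokes, translation):
    # Hypothetical filter: hide entries that contain Plover commands.
    return "{PLOVER:" in translation


user = {("TEFT",): "{PLOVER:TOGGLE}"}
main = {("TEFT",): "test"}

filtered = lookup([user, main], ("TEFT",), filters=[reject_commands])
raw = lookup([user, main], ("TEFT",))

print(filtered)   # the user entry is rejected, so the main entry is returned
print(raw)        # without filters the user entry wins
```

A rejected entry does not end the search: lookup falls through to lower-precedence dictionaries, which is why the filtered call still finds a translation.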

Types

from typing import Dict, List, Tuple, Optional, Union, Callable, Any
from pathlib import Path

StrokeSequence = Tuple[str, ...]
Translation = str
DictionaryEntry = Tuple[StrokeSequence, Translation]
DictionaryItems = List[DictionaryEntry]

DictionaryPath = Union[str, Path]
DictionaryResource = Union[str, Path]

FilterFunction = Callable[[StrokeSequence, Translation], bool]
FilterList = List[FilterFunction]

LookupResult = Optional[Translation]
LookupResults = List[Tuple[DictionaryPath, Translation]]
ReverseLookupResults = List[StrokeSequence]

DictionaryList = List[StenoDictionary]
DictionaryDict = Dict[DictionaryPath, StenoDictionary]
