
tessl/pypi-plover

Open Source Stenography Software providing real-time stenographic typing, machine support, and plugin architecture.


Stenographic Data Models

Plover's stenographic data system provides the core data structures for representing strokes, translations, and formatting. It covers stroke normalization, validation, conversion between notation formats, and configuration for different stenotype systems.

Capabilities

Stroke Representation

Primary data structure for representing individual stenographic strokes with support for various input formats and normalization.

class Stroke:
    """Primary stenographic stroke representation."""
    
    PREFIX_STROKE: 'Stroke' = None
    """Special prefix stroke for system initialization."""
    
    UNDO_STROKE: 'Stroke' = None  
    """Special undo stroke for correction operations."""
    
    @classmethod
    def setup(cls, keys: tuple, implicit_hyphen_keys: frozenset, 
              number_key: str, numbers: dict, feral_number_key: str, 
              undo_stroke: str) -> None:
        """
        Setup stroke system with stenotype system parameters.
        
        Args:
            keys: Available stenographic keys in order
            implicit_hyphen_keys: Keys that imply hyphen placement
            number_key: Key used for number mode
            numbers: Mapping of keys to numbers
            feral_number_key: Alternative number key
            undo_stroke: Stroke pattern for undo operation
            
        Configures global stroke processing for specific stenotype system.
        """
    
    @classmethod
    def from_steno(cls, steno: str) -> 'Stroke':
        """
        Create stroke from steno notation string.
        
        Args:
            steno: Stenographic notation (e.g., 'STKPW')
            
        Returns:
            Stroke instance representing the notation
            
        Parses various steno notation formats into normalized stroke.
        """
    
    @classmethod  
    def from_keys(cls, keys: set) -> 'Stroke':
        """
        Create stroke from set of pressed keys.
        
        Args:
            keys: Set of key strings that were pressed
            
        Returns:
            Stroke instance for the key combination
            
        Converts raw key presses into stenographic stroke.
        """
    
    @classmethod
    def from_integer(cls, integer: int) -> 'Stroke':
        """  
        Create stroke from integer representation.
        
        Args:
            integer: Integer with bits representing pressed keys
            
        Returns:
            Stroke instance for the bit pattern
            
        Converts bit-packed stroke data into stroke object.
        """
    
    @classmethod
    def normalize_stroke(cls, steno: str, strict: bool = True) -> str:
        """
        Normalize stroke notation to standard format.
        
        Args:
            steno: Stroke notation to normalize
            strict: Whether to enforce strict validation
            
        Returns:
            Normalized stroke notation string
            
        Raises:
            ValueError: If stroke notation is invalid and strict=True
        """
    
    @classmethod
    def normalize_steno(cls, steno: str, strict: bool = True) -> str:
        """
        Normalize complete steno notation (multiple strokes).
        
        Args:
            steno: Multi-stroke notation to normalize
            strict: Whether to enforce strict validation
            
        Returns:
            Normalized steno notation string
            
        Processes stroke sequences separated by delimiters.
        """
    
    @classmethod
    def steno_to_sort_key(cls, steno: str, strict: bool = True) -> tuple:
        """
        Create sort key for steno notation.
        
        Args:
            steno: Steno notation to create sort key for
            strict: Whether to enforce strict validation
            
        Returns:
            Tuple suitable for sorting steno notations
            
        Enables consistent alphabetical sorting of stenographic notations.
        """
    
    @property
    def steno_keys(self) -> tuple:
        """
        Get stenographic keys in this stroke.
        
        Returns:
            Tuple of key strings in stenographic order
            
        Provides access to the constituent keys of the stroke.
        """
    
    @property  
    def rtfcre(self) -> str:
        """
        Get RTF/CRE format representation.
        
        Returns:
            Stroke in RTF/CRE dictionary format
            
        Converts stroke to format used in RTF stenographic dictionaries.
        """
    
    @property
    def is_correction(self) -> bool:
        """
        Check if stroke is a correction stroke.
        
        Returns:
            True if stroke represents correction/undo operation
            
        Identifies strokes used for undoing previous translations.
        """

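The integer form accepted by from_integer can be pictured as one bit per key. The sketch below is a standalone illustration, not Plover's implementation: the key tuple is the English layout with the number key first, and assigning bit i to the i-th key is an assumption. The helper names keys_to_int and int_to_keys are hypothetical.

```python
# Assumed layout: the English Stenotype keys, number key first.
KEYS = ('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
        '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')

# Assumption: bit i stands for the i-th key of the layout.
KEY_BIT = {key: 1 << index for index, key in enumerate(KEYS)}


def keys_to_int(keys):
    """Pack key names into one integer, one bit per pressed key."""
    value = 0
    for key in keys:
        value |= KEY_BIT[key]  # raises KeyError for keys outside the layout
    return value


def int_to_keys(value):
    """Unpack an integer into key names, listed in layout (steno) order."""
    return tuple(key for key in KEYS if value & KEY_BIT[key])


packed = keys_to_int({'-B', 'S-', '-P', 'T-'})
print(bin(packed), int_to_keys(packed))
```

Because the keys are read back in layout order, the round trip also puts an unordered set of key presses into steno order.
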
Utility Functions

Standalone functions for stenographic data processing and manipulation.

def normalize_stroke(steno: str, strict: bool = True) -> str:
    """
    Normalize individual stroke notation.
    
    Args:
        steno: Stroke notation to normalize  
        strict: Whether to enforce strict validation
        
    Returns:
        Normalized stroke notation
        
    Standalone function for stroke normalization without class context.
    """

def normalize_steno(steno: str, strict: bool = True) -> str:
    """
    Normalize multi-stroke steno notation.
    
    Args:
        steno: Multi-stroke notation to normalize
        strict: Whether to enforce strict validation
        
    Returns:
        Normalized steno notation
        
    Processes complete stenographic phrases with multiple strokes.
    """

def steno_to_sort_key(steno: str, strict: bool = True) -> tuple:
    """
    Create sort key for steno notation.
    
    Args:
        steno: Steno notation to create sort key for
        strict: Whether to enforce strict validation
        
    Returns:
        Tuple for consistent sorting
        
    Enables alphabetical sorting of stenographic entries.
    """

def sort_steno_strokes(strokes_list: list) -> list:
    """
    Sort list of steno strokes alphabetically.
    
    Args:
        strokes_list: List of steno notation strings
        
    Returns:
        Sorted list of steno notations
        
    Uses stenographic sort order rather than ASCII order.
    """

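One way to get an ordering that follows the steno layout rather than ASCII is to replace each key with its position in the layout. The sketch below works on outlines that are already split into key names. The helper outline_sort_key is hypothetical, and the exact key produced by Plover's steno_to_sort_key may differ.

```python
# Assumed layout: the English Stenotype keys, number key first.
KEYS = ('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
        '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')
KEY_INDEX = {key: index for index, key in enumerate(KEYS)}


def outline_sort_key(outline):
    """Sort key for an outline given as a sequence of strokes,
    each stroke being a sequence of key names."""
    return tuple(
        tuple(sorted(KEY_INDEX[key] for key in stroke))
        for stroke in outline
    )


outlines = [
    [('T-',)],                 # T
    [('K-', 'A-', '-T')],      # KAT
    [('S-',)],                 # S
    [('S-',), ('T-',)],        # S/T
]
ordered = sorted(outlines, key=outline_sort_key)
print(ordered)
```

With this key, S sorts before T and both sort before KAT, because S- and T- come before K- in the layout. Plain string sorting would put KAT first.
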
Stenographic Notation Formats

Standard Steno Notation

Basic stenographic notation using key letters.

Format: STKPWHRAO*EUFRPBLGTSDZ

Examples:

  • KAT - Simple stroke (K-, A-, -T)
  • STKPW - Multiple consonants
  • AO - Vowel combination
  • * - Asterisk, used for corrections

Hyphenated Notation

Explicit hyphen notation separating initial and final consonants.

Format: S-T (initial-final)

Examples:

  • ST-PB - Initial ST, final PB
  • STKPW-R - Initial STKPW, final R
  • -T - Final consonant only
  • S- - Initial consonant only
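
The role of the hyphen can be shown with a small in-order parser. This is a simplified sketch for the English layout only (no number key), not Plover's parser, and parse_stroke is a hypothetical name. Each character is matched against the next available slot in the layout, and a hyphen skips straight to the final consonants.

```python
LEFT = 'STKPWHR'
CENTER = 'AO*EU'
RIGHT = 'FRPBLGTSDZ'

# (notation character, key name) pairs in steno order.
LAYOUT = (
    [(c, c + '-') for c in LEFT]
    + [('A', 'A-'), ('O', 'O-'), ('*', '*'), ('E', '-E'), ('U', '-U')]
    + [(c, '-' + c) for c in RIGHT]
)
FIRST_RIGHT = len(LEFT) + len(CENTER)  # layout index of the first final consonant


def parse_stroke(steno):
    """Return the key names for one stroke, or raise ValueError."""
    keys = []
    position = 0          # first layout slot that is still available
    seen_hyphen = False
    for char in steno:
        if char == '-':
            if seen_hyphen or position > FIRST_RIGHT:
                raise ValueError(f'misplaced hyphen in {steno!r}')
            seen_hyphen = True
            position = FIRST_RIGHT  # everything after the hyphen is final
            continue
        for index in range(position, len(LAYOUT)):
            if LAYOUT[index][0] == char:
                keys.append(LAYOUT[index][1])
                position = index + 1
                break
        else:
            raise ValueError(f'{char!r} is unknown or out of order in {steno!r}')
    return tuple(keys)


print(parse_stroke('ST-PB'))  # P and B read as final consonants
print(parse_stroke('STPB'))   # without the hyphen, P reads as an initial consonant
```

The two calls at the end differ only in the hyphen, which is why notation without a vowel or asterisk needs it to be unambiguous.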

RTF/CRE Format

Format used in RTF stenographic dictionaries.

Format: Special escaping and formatting for RTF compatibility

Examples:

  • Standard strokes maintain basic format
  • Special characters are escaped
  • Number mode indicated with #

Number Mode

Special notation for numeric input.

Format: # prefix indicates number mode

Examples:

  • #S - Number 1
  • #T - Number 2
  • #STPHA - Number 12345
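
How a number-mode stroke turns into digits can be sketched with the number mapping from the Stroke System Setup example (S- is 1, T- is 2, P- is 3, H- is 4, A- is 5, and so on). The function digits_for is hypothetical, and returning None for mixed strokes is an assumption made to keep the sketch small.

```python
KEYS = ('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
        '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')
NUMBER_KEY = '#'
NUMBERS = {'S-': '1', 'T-': '2', 'P-': '3', 'H-': '4', 'A-': '5',
           'O-': '0', '-F': '6', '-P': '7', '-L': '8', '-T': '9'}


def digits_for(keys):
    """Return the digits spelled by a number-mode stroke, or None if the
    stroke is not purely numeric."""
    pressed = set(keys)
    if NUMBER_KEY not in pressed:
        return None
    # Read the remaining keys in steno order, not in the order they arrived.
    others = [key for key in KEYS if key in pressed and key != NUMBER_KEY]
    if not others or any(key not in NUMBERS for key in others):
        return None  # assumption: mixed strokes are left to the dictionary
    return ''.join(NUMBERS[key] for key in others)


print(digits_for({'#', 'S-', 'T-', 'P-', 'H-', 'A-'}))
```

Keys without a digit, such as K- and W-, make the stroke non-numeric in this sketch.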

Usage Examples

from plover.steno import Stroke, normalize_stroke, sort_steno_strokes

# Create strokes from different formats
stroke1 = Stroke.from_steno('KAT')
stroke2 = Stroke.from_steno('ST-PB')
stroke3 = Stroke.from_keys({'S-', 'T-', '-P', '-B'})  # Same keys as stroke2

# Access stroke properties
keys = stroke1.steno_keys        # Keys K-, A-, -T in steno order
rtf_format = stroke1.rtfcre      # RTF/CRE representation
is_undo = stroke1.is_correction  # False for regular strokes

# Normalize steno notation
normalized = normalize_stroke('S-')      # Trailing hyphen is redundant
normalized = normalize_stroke('-EU')     # Vowels make the hyphen implicit
normalized = normalize_stroke('ST-PB')   # Hyphen is needed to mark -P and -B as final

# Handle multi-stroke notation (strokes separated by '/')
multi = Stroke.normalize_steno('KAT/HROG')

# Create sort keys for ordering
sort_key1 = Stroke.steno_to_sort_key('S')
sort_key2 = Stroke.steno_to_sort_key('T')
sort_key1 < sort_key2  # Expected True: S- precedes T- in the key order

# Sort stroke lists
strokes = ['TEFT', 'KAT', 'HROG', 'ST-PB']
sorted_strokes = sort_steno_strokes(strokes)  # Returns a sorted list

# Work with correction strokes
undo_stroke = Stroke.from_steno('*')
if undo_stroke.is_correction:
    print("This is an undo stroke")

# Convert between formats
stroke = Stroke.from_steno('STKPW')
keys_set = set(stroke.steno_keys)    # {'S-', 'T-', 'K-', 'P-', 'W-'}
rtf_representation = stroke.rtfcre   # RTF/CRE format string

# Handle number mode
number_stroke = Stroke.from_steno('#STPHA')  # Number key plus the keys for 1-5
number_keys = number_stroke.steno_keys

# Error handling with strict mode
try:
    invalid = normalize_stroke('INVALID_KEYS', strict=True)
except ValueError as e:
    print(f"Invalid stroke: {e}")

# Lenient mode for parsing  
maybe_valid = normalize_stroke('MAYBE_VALID', strict=False)

Stroke System Setup

The stroke system must be configured for the specific stenotype system in use:

from plover.steno import Stroke

# Example setup for the English Stenotype system
Stroke.setup(
    keys=('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
          '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z'),
    implicit_hyphen_keys=frozenset(['A-', 'O-', '-E', '-U', '*']),
    number_key='#',  # The number key is part of the 23-key layout
    numbers={'S-': '1', 'T-': '2', 'P-': '3', 'H-': '4', 'A-': '5', 
             'O-': '0', '-F': '6', '-P': '7', '-L': '8', '-T': '9'},
    feral_number_key=None,
    undo_stroke='*'
)
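
Before passing a custom system to the setup call, it can help to check that its parameters agree with each other. The helper below is a hypothetical pre-flight check written for this page. Whether Stroke.setup performs similar checks itself is not stated here.

```python
def check_system(keys, implicit_hyphen_keys, number_key, numbers):
    """Return a list of human-readable problems; an empty list means the
    parameters are mutually consistent."""
    problems = []
    known = set(keys)
    if len(known) != len(keys):
        problems.append('duplicate keys in layout')
    stray = sorted(set(implicit_hyphen_keys) - known)
    if stray:
        problems.append(f'implicit hyphen keys not in layout: {stray}')
    if number_key is not None and number_key not in known:
        problems.append(f'number key {number_key!r} not in layout')
    stray = sorted(set(numbers) - known)
    if stray:
        problems.append(f'number mapping uses unknown keys: {stray}')
    return problems


# A small made-up layout used only for this illustration.
layout = ('#', 'S-', 'T-', 'A-', '*', '-E', '-T', '-S')
print(check_system(layout, {'A-', '*', '-E'}, '#', {'S-': '1', 'T-': '2'}))
print(check_system(layout[1:], {'A-', '*', '-E'}, '#', {'S-': '1', 'T-': '2'}))
```

The second call drops the number key from the layout, so the check reports it.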

Stroke Validation and Normalization

Validation Rules

  • Keys must exist in the configured stenotype system
  • Key order must follow stenographic conventions
  • Implicit hyphens are inserted automatically
  • Invalid key combinations are rejected in strict mode

Normalization Process

  1. Case Normalization: Convert to uppercase
  2. Key Ordering: Arrange keys in stenographic order
  3. Hyphen Insertion: Add implicit hyphens where needed
  4. Validation: Check against system constraints
  5. Format Standardization: Apply consistent formatting
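
The steps above can be sketched as a function that turns a set of key names into canonical notation. This is an illustration under stated assumptions, not Plover's code: the layout and implicit hyphen keys are the English ones from the setup example, and keys_to_steno is a hypothetical name.

```python
KEYS = ('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
        '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')
IMPLICIT_HYPHEN_KEYS = frozenset({'A-', 'O-', '*', '-E', '-U'})


def keys_to_steno(keys):
    """Build canonical notation for one stroke from its key names."""
    pressed = {key.upper() for key in keys}             # 1. case normalization
    unknown = sorted(pressed - set(KEYS))
    if unknown:                                         # 4. validation
        raise ValueError(f'keys not in layout: {unknown}')
    ordered = [key for key in KEYS if key in pressed]   # 2. key ordering
    left = ''.join(key.rstrip('-') for key in ordered if not key.startswith('-'))
    right = ''.join(key[1:] for key in ordered if key.startswith('-'))
    # 3. hyphen: written only when no vowel or asterisk already marks
    #    the boundary between initial and final keys.
    if right and not pressed & IMPLICIT_HYPHEN_KEYS:
        return left + '-' + right
    return left + right                                 # 5. consistent format


print(keys_to_steno(['-b', 's-', '-p', 't-']))
print(keys_to_steno({'K-', 'A-', '-T'}))
```

The first call needs the hyphen because nothing else separates the two banks, while the second does not because A- already marks the boundary.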

Error Handling

# Strict mode - raises exceptions for invalid input
try:
    steno = Stroke.normalize_stroke('INVALID', strict=True)
except ValueError as e:
    print(f"Invalid stroke: {e}")

# Lenient mode - attempts best-effort parsing instead of raising
# (from_steno takes no strict flag, so normalize_stroke is used here)
steno = Stroke.normalize_stroke('maybe_valid', strict=False)
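
The strict and lenient modes follow a common pattern: run the same validation, then either re-raise or fall back. The sketch below shows that pattern with a deliberately crude validator. Both function names are hypothetical, and returning the input unchanged in lenient mode is an assumption about what "best effort" means.

```python
import logging

log = logging.getLogger(__name__)

# Characters that can appear in English steno notation.
VALID_CHARS = set('#STKPWHRAO*EUFRPBLGTSDZ-')


def check_chars(steno):
    """Crude stand-in validator: reject characters outside the layout."""
    bad = sorted(set(steno) - VALID_CHARS)
    if bad:
        raise ValueError(f'invalid characters {bad} in {steno!r}')
    return steno


def normalize_with_fallback(steno, strict=True):
    """Validate steno; in lenient mode log the problem and keep the input."""
    try:
        return check_chars(steno)
    except ValueError:
        if strict:
            raise
        log.warning('keeping unvalidated steno %r', steno)
        return steno


print(normalize_with_fallback('maybe_valid', strict=False))
```

Lenient mode suits bulk loading of dictionaries, where one bad entry should not stop the whole load, while strict mode suits user input that should be rejected immediately.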

Integration with Stenotype Systems

System Configuration

Different stenotype systems have different key layouts and rules:

  • English Stenotype: Standard 23-key layout
  • Grandjean: Alternative key arrangement
  • Ireland: Modified key layout
  • Michela: Italian stenotype system
  • Custom Systems: User-defined layouts

Key Layout Variations

# English Stenotype standard layout (23 keys, including the number key)
ENGLISH_KEYS = ('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
                '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')

# Custom system example
CUSTOM_KEYS = ('Q-', 'W-', 'E-', 'R-', 'T-', 'A-', 'S-', 
               '*', '-D', '-F', '-G', '-H', '-J', '-K', '-L')

Types

from typing import Set, Tuple, List, Dict, Optional, Union, FrozenSet

StenoKey = str
StenoKeys = Tuple[StenoKey, ...]
StenoKeysSet = Set[StenoKey]
StenoNotation = str
StenoSequence = str

StrokeList = List[Stroke]
StenoList = List[StenoNotation]

KeyLayout = Tuple[StenoKey, ...]
ImplicitHyphenKeys = FrozenSet[StenoKey]
NumberMapping = Dict[StenoKey, str]

SortKey = Tuple[int, ...]
StrokeInteger = int

ValidationResult = Union[StenoNotation, None]
NormalizationResult = StenoNotation

Translation Processing

Core classes for handling stenographic translation from strokes to text output.

class Translation:
    """Data model for mapping between stroke sequences and text strings."""
    
    strokes: List[Stroke]
    rtfcre: Tuple[str, ...]  
    english: str
    replaced: List['Translation']
    formatting: List
    is_retrospective_command: bool
    
    def __init__(self, outline: List[Stroke], translation: str) -> None:
        """
        Create translation from stroke outline and text.
        
        Args:
            outline: List of Stroke objects forming the translation
            translation: Text string result of the translation
            
        Creates translation mapping with formatting state and undo support.
        """
    
    def has_undo(self) -> bool:
        """
        Check if translation can be undone.
        
        Returns:
            True if translation supports undo operation
            
        Determines if translation has formatting state allowing reversal.
        """

class Translator:
    """State machine converting stenographic strokes to translation stream."""
    
    def __init__(self) -> None:
        """Initialize translator with empty state and default dictionary."""
    
    def translate(self, stroke: Stroke) -> List[Translation]:
        """
        Process stroke and return resulting translations.
        
        Args:
            stroke: Stenographic stroke to process
            
        Returns:
            List of translation objects (corrections and new translations)
            
        Maintains translation state and applies greedy matching algorithm.
        """
    
    def set_dictionary(self, dictionary) -> None:
        """
        Set stenographic dictionary for translation lookups.
        
        Args:
            dictionary: StenoDictionaryCollection for translations
            
        Updates translation source and resets internal state.
        """
    
    def add_listener(self, callback) -> None:
        """
        Add callback for translation events.
        
        Args:
            callback: Function receiving translation updates
            
        Registers listener for translation state changes.
        """
    
    def remove_listener(self, callback) -> None:
        """Remove previously added translation listener."""
    
    def set_min_undo_length(self, min_undo_length: int) -> None:
        """
        Set minimum number of strokes kept for undo operations.
        
        Args:
            min_undo_length: Minimum strokes to retain in history
        """

class Formatter:
    """Converts translations into formatted output with proper spacing and capitalization."""
    
    def __init__(self) -> None:
        """Initialize formatter with default output settings."""
    
    def format(self, undo: List[Translation], do: List[Translation], prev: List[Translation]) -> None:
        """
        Format translation sequence with undo and new translations.
        
        Args:
            undo: Translations to undo (backspace operations)
            do: New translations to format and output  
            prev: Previous translation context for formatting state
            
        Processes translation formatting including spacing, capitalization,
        and special formatting commands.
        """
    
    def set_output(self, output) -> None:
        """
        Set output interface for formatted text delivery.
        
        Args:
            output: Output object with send_string, send_backspaces methods
            
        Configures destination for formatted stenographic output.
        """
    
    def add_listener(self, callback) -> None:
        """
        Add listener for formatting events.
        
        Args:
            callback: Function receiving formatting updates
        """
    
    def remove_listener(self, callback) -> None:
        """Remove formatting event listener."""
    
    def set_space_placement(self, placement: str) -> None:
        """
        Configure space placement relative to words.
        
        Args:
            placement: 'Before Output' or 'After Output'
            
        Controls whether spaces appear before or after stenographic output.
        """

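The greedy matching mentioned for Translator.translate can be illustrated with a toy translator that, on every stroke, tries to extend the most recent translations into the longest outline found in the dictionary. This is a minimal sketch, not Plover's algorithm: the dictionary entries and the MiniTranslator class are made up for the example, and affixes, undo, and formatting are ignored.

```python
# Hypothetical dictionary: outline (tuple of strokes) -> text.
DICTIONARY = {
    ('KAT',): 'cat',
    ('HROG',): 'log',
    ('KAT', 'HROG'): 'catalog',
}
LONGEST_KEY = max(len(outline) for outline in DICTIONARY)


class MiniTranslator:
    def __init__(self):
        self.translations = []  # list of (outline, text) pairs

    def translate(self, stroke):
        """Process one stroke; return (replaced translations, new translation)."""
        count = len(self.translations)
        # Try to absorb as many earlier translations as possible first.
        for take in range(count, -1, -1):
            earlier = self.translations[count - take:]
            outline = tuple(s for strokes, _ in earlier for s in strokes) + (stroke,)
            if len(outline) > LONGEST_KEY:
                continue  # no dictionary entry can be this long
            if outline in DICTIONARY:
                del self.translations[count - take:]
                new = (outline, DICTIONARY[outline])
                self.translations.append(new)
                return earlier, new
        # No match: fall back to the raw steno.
        new = ((stroke,), stroke)
        self.translations.append(new)
        return [], new

    def text(self):
        return ' '.join(text for _, text in self.translations)


translator = MiniTranslator()
for stroke in ('KAT', 'HROG', 'KAT'):
    translator.translate(stroke)
print(translator.text())  # catalog cat
```

The second stroke replaces "cat" with "catalog", which is the kind of correction the returned list of replaced translations is meant to carry.
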