# Stenographic Data Models

Plover's stenographic data system provides the core data structures for representing stenographic strokes, translations, and formatting. It supports stroke normalization, validation, conversion between notation formats, and integration with different stenotype systems.

## Capabilities

### Stroke Representation

The primary data structure for representing individual stenographic strokes, with support for several input formats and normalization.

```python { .api }
class Stroke:
    """Primary stenographic stroke representation."""

    PREFIX_STROKE: 'Stroke' = None
    """Special prefix stroke for system initialization."""

    UNDO_STROKE: 'Stroke' = None
    """Special undo stroke for correction operations."""

    @classmethod
    def setup(cls, keys: tuple, implicit_hyphen_keys: frozenset,
              number_key: str, numbers: dict, feral_number_key: bool,
              undo_stroke: str) -> None:
        """
        Set up the stroke system with stenotype system parameters.

        Args:
            keys: Available stenographic keys, in steno order
            implicit_hyphen_keys: Keys that imply hyphen placement
            number_key: Key used for number mode
            numbers: Mapping of keys to numbers
            feral_number_key: Whether the number key may appear anywhere in the notation
            undo_stroke: Stroke pattern for the undo operation

        Configures global stroke processing for a specific stenotype system.
        """

    @classmethod
    def from_steno(cls, steno: str) -> 'Stroke':
        """
        Create a stroke from a steno notation string.

        Args:
            steno: Stenographic notation (e.g., 'STKPW')

        Returns:
            Stroke instance representing the notation

        Parses the supported steno notation formats into a normalized stroke.
        """

    @classmethod
    def from_keys(cls, keys: set) -> 'Stroke':
        """
        Create a stroke from a set of pressed keys.

        Args:
            keys: Set of key strings that were pressed

        Returns:
            Stroke instance for the key combination

        Converts raw key presses into a stenographic stroke.
        """

    @classmethod
    def from_integer(cls, integer: int) -> 'Stroke':
        """
        Create a stroke from its integer representation.

        Args:
            integer: Integer with bits representing pressed keys

        Returns:
            Stroke instance for the bit pattern

        Converts bit-packed stroke data into a stroke object.
        """

    @classmethod
    def normalize_stroke(cls, steno: str, strict: bool = True) -> str:
        """
        Normalize stroke notation to the standard format.

        Args:
            steno: Stroke notation to normalize
            strict: Whether to enforce strict validation

        Returns:
            Normalized stroke notation string

        Raises:
            ValueError: If stroke notation is invalid and strict=True
        """

    @classmethod
    def normalize_steno(cls, steno: str, strict: bool = True) -> tuple:
        """
        Normalize complete steno notation (multiple strokes).

        Args:
            steno: Multi-stroke notation to normalize
            strict: Whether to enforce strict validation

        Returns:
            Tuple of normalized stroke notations, one per stroke

        Processes stroke sequences separated by '/'.
        """

    @classmethod
    def steno_to_sort_key(cls, steno: str, strict: bool = True) -> tuple:
        """
        Create a sort key for steno notation.

        Args:
            steno: Steno notation to create a sort key for
            strict: Whether to enforce strict validation

        Returns:
            Sort key suitable for ordering steno notations

        Enables consistent steno-order sorting of stenographic notations.
        """

    @property
    def steno_keys(self) -> tuple:
        """
        Get the stenographic keys in this stroke.

        Returns:
            Key strings in stenographic order

        Provides access to the constituent keys of the stroke.
        """

    @property
    def rtfcre(self) -> str:
        """
        Get the RTF/CRE format representation.

        Returns:
            Stroke in RTF/CRE dictionary format

        Converts the stroke to the format used in RTF stenographic dictionaries.
        """

    @property
    def is_correction(self) -> bool:
        """
        Check whether the stroke is a correction stroke.

        Returns:
            True if the stroke represents a correction/undo operation

        Identifies strokes used for undoing previous translations.
        """
```
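
To make the "integer with bits representing pressed keys" idea behind `from_integer` concrete, here is a minimal sketch of bit-packing a stroke. It assumes bit *i* corresponds to the *i*-th key of the layout; the `KEYS` tuple and the helpers `keys_to_integer` and `integer_to_keys` are hypothetical names for illustration, not part of Plover's API, and Plover's actual bit layout may differ.

```python
# Assumed key order, modeled on the English Stenotype layout (illustrative only).
KEYS = ('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
        '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')

# One bit per key: the i-th key in the layout owns bit i.
KEY_BIT = {key: 1 << i for i, key in enumerate(KEYS)}


def keys_to_integer(keys):
    """Pack an iterable of key names into a single integer bit mask."""
    mask = 0
    for key in keys:
        mask |= KEY_BIT[key]  # raises KeyError for keys outside the layout
    return mask


def integer_to_keys(mask):
    """Unpack a bit mask into key names, returned in layout (steno) order."""
    return tuple(key for key in KEYS if mask & KEY_BIT[key])


mask = keys_to_integer({'S-', 'T-', '-P', '-B'})
print(integer_to_keys(mask))  # ('S-', 'T-', '-P', '-B')
```

Because the unpacking step walks the layout in order, the recovered keys always come back in steno order regardless of the order in which they were pressed.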

### Utility Functions

Standalone functions for stenographic data processing and manipulation.

```python { .api }
def normalize_stroke(steno: str, strict: bool = True) -> str:
    """
    Normalize individual stroke notation.

    Args:
        steno: Stroke notation to normalize
        strict: Whether to enforce strict validation

    Returns:
        Normalized stroke notation

    Standalone function for stroke normalization without class context.
    """

def normalize_steno(steno: str, strict: bool = True) -> tuple:
    """
    Normalize multi-stroke steno notation.

    Args:
        steno: Multi-stroke notation to normalize
        strict: Whether to enforce strict validation

    Returns:
        Tuple of normalized stroke notations, one per stroke

    Processes complete stenographic outlines with multiple strokes.
    """

def steno_to_sort_key(steno: str, strict: bool = True) -> tuple:
    """
    Create a sort key for steno notation.

    Args:
        steno: Steno notation to create a sort key for
        strict: Whether to enforce strict validation

    Returns:
        Sort key for consistent ordering

    Enables steno-order sorting of stenographic entries.
    """

def sort_steno_strokes(strokes_list: list) -> list:
    """
    Sort a list of steno outlines, shortest first.

    Args:
        strokes_list: List of outlines, each a sequence of stroke notation strings

    Returns:
        Sorted list of outlines

    Orders outlines by number of strokes, then by total number of keys.
    """
```
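
As an illustration of why a dedicated sort key is useful, the sketch below orders outlines by the position of each key in the layout rather than by character code. It is a simplified stand-in, not Plover's implementation: the `KEYS` tuple and the `outline_sort_key` helper are hypothetical, and strokes are given as tuples of key names to sidestep parsing.

```python
# Assumed key order, modeled on the English Stenotype layout (illustrative only).
KEYS = ('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
        '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')
KEY_INDEX = {key: i for i, key in enumerate(KEYS)}


def outline_sort_key(outline):
    """Map an outline (sequence of strokes, each a sequence of key names)
    to a tuple of per-stroke key positions that sorts in steno order."""
    return tuple(
        tuple(sorted(KEY_INDEX[key] for key in stroke))
        for stroke in outline
    )


outlines = [
    [('T-', 'A-')],   # TA
    [('S-', '-T')],   # S-T
    [('S-', 'A-')],   # SA
]
ordered = sorted(outlines, key=outline_sort_key)
print(ordered)  # [[('S-', 'A-')], [('S-', '-T')], [('T-', 'A-')]]
```

Plain string sorting would place `S-T` before `SA` because `-` precedes `A` in ASCII, whereas the layout-based key places `SA` first because `A-` comes before `-T` in steno order.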

## Stenographic Notation Formats

### Standard Steno Notation
Basic stenographic notation using key letters, written in steno order.

**Format**: `STKPWHRAO*EUFRPBLGTSDZ`
**Examples**:
- `HEL` - Simple stroke (initial H, vowel E, final L)
- `STKPW` - Multiple consonants
- `AO` - Vowel combination
- `*` - Asterisk for corrections

### Hyphenated Notation
Explicit hyphen notation separating initial and final consonants. The hyphen is needed when a stroke contains final consonants but no vowel or asterisk to mark the boundary.

**Format**: `S-T` (initial-final)
**Examples**:
- `ST-PB` - Initial ST, final PB
- `STKPW-R` - Initial STKPW, final R
- `-T` - Final consonant only
- `S-` - Initial consonant only
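
The role of the hyphen can be sketched with a small parser that walks the layout in steno order and assigns each letter to the first key that can still match. This is a simplified illustration under stated assumptions, not Plover's parser: `KEYS`, `FIRST_RIGHT`, and `steno_to_keys` are hypothetical names, and number mode is ignored.

```python
# Assumed English Stenotype key order without the number key (illustrative only).
KEYS = ('S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
        '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')

# Index of the first right-hand key; an explicit hyphen jumps here.
FIRST_RIGHT = next(i for i, key in enumerate(KEYS) if key.startswith('-'))


def steno_to_keys(steno):
    """Split single-stroke notation into key names, enforcing steno order."""
    keys = []
    pos = 0  # index of the next layout key that may still be matched
    for char in steno:
        if char == '-':
            pos = max(pos, FIRST_RIGHT)  # skip the remaining left-hand keys
            continue
        for i in range(pos, len(KEYS)):
            if KEYS[i].strip('-') == char:
                keys.append(KEYS[i])
                pos = i + 1
                break
        else:
            raise ValueError(f'{char!r} is out of order or not in the layout: {steno!r}')
    return tuple(keys)


print(steno_to_keys('ST-PB'))  # ('S-', 'T-', '-P', '-B')
print(steno_to_keys('STPB'))   # ('S-', 'T-', 'P-', '-B')
print(steno_to_keys('HEL'))    # ('H-', '-E', '-L')
```

Without the hyphen, `P` is read as the initial `P-`, so `ST-PB` and `STPB` describe different strokes. In `HEL` the vowel already marks the boundary, so no hyphen is required.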
242
243
### RTF/CRE Format
244
Format used in RTF stenographic dictionaries.
245
246
**Format**: Special escaping and formatting for RTF compatibility
247
**Examples**:
248
- Standard strokes maintain basic format
249
- Special characters are escaped
250
- Number mode indicated with `#`
251
252
### Number Mode
253
Special notation for numeric input.
254
255
**Format**: `#` prefix indicates number mode
256
**Examples**:
257
- `#S` - Number 1
258
- `#T` - Number 2
259
- `#STKPW` - Number 12345
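
The mapping from a number-mode stroke to digits can be sketched as a simple lookup. The example below is a simplified illustration, not Plover's implementation: the `NUMBERS` table is an assumed key-to-digit mapping for English Stenotype, `number_mode_text` is a hypothetical helper, and keys without an assigned digit are simply skipped here.

```python
# Assumed key-to-digit mapping for English Stenotype (illustrative only).
NUMBERS = {'S-': '1', 'T-': '2', 'P-': '3', 'H-': '4', 'A-': '5',
           'O-': '0', '-F': '6', '-P': '7', '-L': '8', '-T': '9'}


def number_mode_text(keys):
    """Return the digits typed by a stroke, given its key names in steno order.

    Returns None when the number key '#' is absent, i.e. the stroke is not
    in number mode.
    """
    if '#' not in keys:
        return None
    return ''.join(NUMBERS[key] for key in keys if key in NUMBERS)


print(number_mode_text(('#', 'S-', 'T-', 'P-', 'H-', 'A-')))  # 12345
print(number_mode_text(('S-', 'T-')))                         # None
```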
260
261
## Usage Examples
262
263
```python
264
from plover.steno import Stroke, normalize_stroke, sort_steno_strokes
265
266
# Create strokes from different formats
267
stroke1 = Stroke.from_steno('HELLO')
268
stroke2 = Stroke.from_steno('ST-PB')
269
stroke3 = Stroke.from_keys({'S', 'T', 'P', 'B'})
270
271
# Access stroke properties
272
keys = stroke1.steno_keys # ('H', 'E', 'L', 'L', 'O')
273
rtf_format = stroke1.rtfcre # RTF representation
274
is_undo = stroke1.is_correction # False for regular strokes
275
276
# Normalize steno notation
277
normalized = normalize_stroke('hello') # 'HELLO'
278
normalized = normalize_stroke('St-pB') # 'STPB'
279
normalized = normalize_stroke('S T P B') # 'STPB'
280
281
# Handle multi-stroke notation
282
multi = Stroke.normalize_steno('HELLO/WORLD') # 'HELLO/WORLD'
283
284
# Create sort keys for alphabetical ordering
285
sort_key1 = Stroke.steno_to_sort_key('APPLE')
286
sort_key2 = Stroke.steno_to_sort_key('BANANA')
287
sort_key1 < sort_key2 # True - Apple comes before Banana
288
289
# Sort stroke lists
290
strokes = ['WORLD', 'HELLO', 'APPLE', 'BANANA']
291
sorted_strokes = sort_steno_strokes(strokes)
292
# Result: ['APPLE', 'BANANA', 'HELLO', 'WORLD']
293
294
# Work with correction strokes
295
undo_stroke = Stroke.from_steno('*')
296
if undo_stroke.is_correction:
297
print("This is an undo stroke")
298
299
# Convert between formats
300
stroke = Stroke.from_steno('STKPW')
301
keys_set = set(stroke.steno_keys) # {'S', 'T', 'K', 'P', 'W'}
302
rtf_representation = stroke.rtfcre # RTF format string
303
304
# Handle number mode
305
number_stroke = Stroke.from_steno('#STKPW') # Numbers 12345
306
number_keys = number_stroke.steno_keys
307
308
# Error handling with strict mode
309
try:
310
invalid = normalize_stroke('INVALID_KEYS', strict=True)
311
except ValueError as e:
312
print(f"Invalid stroke: {e}")
313
314
# Lenient mode for parsing
315
maybe_valid = normalize_stroke('MAYBE_VALID', strict=False)
316
```

## Stroke System Setup

The stroke system must be configured for the specific stenotype system in use:

```python
from plover.steno import Stroke

# Example setup for the English Stenotype system.
# The number key '#' is part of the key layout.
Stroke.setup(
    keys=('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
          '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z'),
    implicit_hyphen_keys=frozenset(['A-', 'O-', '-E', '-U', '*']),
    number_key='#',
    # Digits keep the hyphen of the key they replace
    numbers={'S-': '1-', 'T-': '2-', 'P-': '3-', 'H-': '4-', 'A-': '5-',
             'O-': '0-', '-F': '-6', '-P': '-7', '-L': '-8', '-T': '-9'},
    feral_number_key=False,
    undo_stroke='*'
)
```

## Stroke Validation and Normalization

### Validation Rules
- Keys must exist in the configured stenotype system
- Key order must follow stenographic conventions
- Implicit hyphens are inserted automatically
- Invalid key combinations are rejected in strict mode

### Normalization Process
1. **Case Normalization**: Convert to uppercase
2. **Key Ordering**: Arrange keys in stenographic order
3. **Hyphen Insertion**: Add implicit hyphens where needed
4. **Validation**: Check against system constraints
5. **Format Standardization**: Apply consistent formatting
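
The ordering, hyphen, and validation steps can be sketched on already-parsed key names as follows. This is a minimal illustration rather than Plover's normalizer: `KEYS`, `IMPLICIT_HYPHEN_KEYS`, and `keys_to_steno` are hypothetical names, and the sketch assumes that an explicit hyphen is written only when the stroke has final keys but no implicit-hyphen key.

```python
# Assumed English Stenotype layout without the number key (illustrative only).
KEYS = ('S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
        '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')
IMPLICIT_HYPHEN_KEYS = frozenset(['A-', 'O-', '-E', '-U', '*'])
ORDER = {key: i for i, key in enumerate(KEYS)}


def keys_to_steno(keys):
    """Validate a set of key names and render it as normalized notation."""
    keys = set(keys)
    unknown = keys - set(KEYS)
    if unknown:
        # Validation: every key must exist in the configured layout.
        raise ValueError(f'unknown keys: {sorted(unknown)}')
    # Key ordering: arrange keys in steno order.
    ordered = sorted(keys, key=ORDER.__getitem__)
    # Hyphen handling: only needed when no vowel or asterisk marks the middle.
    needs_hyphen = not (keys & IMPLICIT_HYPHEN_KEYS)
    out = []
    for key in ordered:
        if key.startswith('-') and needs_hyphen:
            out.append('-')
            needs_hyphen = False
        out.append(key.strip('-'))
    return ''.join(out)


print(keys_to_steno({'-B', 'T-', 'S-', '-P'}))  # ST-PB
print(keys_to_steno({'-L', 'H-', '-E'}))        # HEL
print(keys_to_steno({'S-'}))                    # S
```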

### Error Handling
```python
from plover.steno import Stroke

# Strict mode - raises ValueError for invalid input
try:
    steno = Stroke.normalize_stroke('INVALID', strict=True)
except ValueError as e:
    print(f"Invalid stroke: {e}")

# Lenient mode - attempts best-effort parsing and does not raise
steno = Stroke.normalize_stroke('maybe_valid', strict=False)
print(f"Best-effort result: {steno!r}")
```

## Integration with Stenotype Systems

### System Configuration
Different stenotype systems have different key layouts and rules:

- **English Stenotype**: Standard 23-key layout
- **Grandjean**: Alternative key arrangement
- **Ireland**: Modified key layout
- **Michela**: Italian stenotype system
- **Custom Systems**: User-defined layouts

### Key Layout Variations
```python
# English Stenotype standard layout (23 keys, including the number key '#')
ENGLISH_KEYS = ('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
                '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')

# Custom system example
CUSTOM_KEYS = ('Q-', 'W-', 'E-', 'R-', 'T-', 'A-', 'S-',
               '*', '-D', '-F', '-G', '-H', '-J', '-K', '-L')
```
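
When defining a custom layout, it can help to check that the key names follow the left/center/right naming convention before passing them to the stroke system. The sketch below shows one way to do that; `split_layout` is a hypothetical helper, not a Plover function, and the rule it enforces (left-hand keys first, then center keys, then right-hand keys) is an assumption based on the layouts shown above.

```python
CUSTOM_KEYS = ('Q-', 'W-', 'E-', 'R-', 'T-', 'A-', 'S-',
               '*', '-D', '-F', '-G', '-H', '-J', '-K', '-L')


def split_layout(keys):
    """Partition a layout into left-hand, center, and right-hand keys.

    Raises ValueError if a key is duplicated or the three groups are
    interleaved instead of appearing in left, center, right order.
    """
    if len(set(keys)) != len(keys):
        raise ValueError('duplicate keys in layout')
    left = [k for k in keys if k.endswith('-')]
    right = [k for k in keys if k.startswith('-')]
    center = [k for k in keys if not k.startswith('-') and not k.endswith('-')]
    if tuple(left + center + right) != tuple(keys):
        raise ValueError('expected left-hand keys, then center keys, then right-hand keys')
    return left, center, right


left, center, right = split_layout(CUSTOM_KEYS)
print(len(left), center, len(right))  # 7 ['*'] 7
```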

## Types

```python { .api }
from typing import Set, Tuple, List, Dict, Optional, Union, FrozenSet

from plover.steno import Stroke

StenoKey = str
StenoKeys = Tuple[StenoKey, ...]
StenoKeysSet = Set[StenoKey]
StenoNotation = str
StenoSequence = str

StrokeList = List[Stroke]
StenoList = List[StenoNotation]

KeyLayout = Tuple[StenoKey, ...]
ImplicitHyphenKeys = FrozenSet[StenoKey]
NumberMapping = Dict[StenoKey, str]

SortKey = Tuple[int, ...]
StrokeInteger = int

ValidationResult = Union[StenoNotation, None]
NormalizationResult = StenoNotation
```

### Translation Processing

Core classes for handling stenographic translation from strokes to text output.

```python { .api }
class Translation:
    """Data model for mapping between stroke sequences and text strings."""

    strokes: List[Stroke]
    rtfcre: Tuple[str, ...]
    english: str
    replaced: List['Translation']
    formatting: List
    is_retrospective_command: bool

    def __init__(self, outline: List[Stroke], translation: str) -> None:
        """
        Create translation from stroke outline and text.

        Args:
            outline: List of Stroke objects forming the translation
            translation: Text string result of the translation

        Creates translation mapping with formatting state and undo support.
        """

    def has_undo(self) -> bool:
        """
        Check if translation can be undone.

        Returns:
            True if translation supports undo operation

        Determines if translation has formatting state allowing reversal.
        """

class Translator:
    """State machine converting stenographic strokes to translation stream."""

    def __init__(self) -> None:
        """Initialize translator with empty state and default dictionary."""

    def translate(self, stroke: Stroke) -> List[Translation]:
        """
        Process stroke and return resulting translations.

        Args:
            stroke: Stenographic stroke to process

        Returns:
            List of translation objects (corrections and new translations)

        Maintains translation state and applies greedy matching algorithm.
        """

    def set_dictionary(self, dictionary) -> None:
        """
        Set stenographic dictionary for translation lookups.

        Args:
            dictionary: StenoDictionaryCollection for translations

        Updates translation source and resets internal state.
        """

    def add_listener(self, callback) -> None:
        """
        Add callback for translation events.

        Args:
            callback: Function receiving translation updates

        Registers listener for translation state changes.
        """

    def remove_listener(self, callback) -> None:
        """Remove previously added translation listener."""

    def set_min_undo_length(self, min_undo_length: int) -> None:
        """
        Set minimum number of strokes kept for undo operations.

        Args:
            min_undo_length: Minimum strokes to retain in history
        """

class Formatter:
    """Converts translations into formatted output with proper spacing and capitalization."""

    def __init__(self) -> None:
        """Initialize formatter with default output settings."""

    def format(self, undo: List[Translation], do: List[Translation], prev: List[Translation]) -> None:
        """
        Format translation sequence with undo and new translations.

        Args:
            undo: Translations to undo (backspace operations)
            do: New translations to format and output
            prev: Previous translation context for formatting state

        Processes translation formatting including spacing, capitalization,
        and special formatting commands.
        """

    def set_output(self, output) -> None:
        """
        Set output interface for formatted text delivery.

        Args:
            output: Output object with send_string, send_backspaces methods

        Configures destination for formatted stenographic output.
        """

    def add_listener(self, callback) -> None:
        """
        Add listener for formatting events.

        Args:
            callback: Function receiving formatting updates
        """

    def remove_listener(self, callback) -> None:
        """Remove formatting event listener."""

    def set_space_placement(self, placement: str) -> None:
        """
        Configure space placement relative to words.

        Args:
            placement: 'Before Output' or 'After Output'

        Controls whether spaces appear before or after stenographic output.
        """
```
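
The greedy matching mentioned for `Translator.translate` can be illustrated with a simplified batch version of longest-match lookup. This sketch is not Plover's translator, which works incrementally and tracks undo state; `greedy_translate` and the sample dictionary are hypothetical, and the dictionary is assumed to map tuples of stroke notations to text.

```python
def greedy_translate(strokes, dictionary):
    """Translate a list of stroke notations, preferring the longest outline.

    `dictionary` maps tuples of stroke notations to text. Strokes with no
    matching entry are emitted as raw steno.
    """
    longest = max((len(outline) for outline in dictionary), default=1)
    output = []
    i = 0
    while i < len(strokes):
        # Try the longest candidate outline first, then shrink it.
        for size in range(min(longest, len(strokes) - i), 0, -1):
            outline = tuple(strokes[i:i + size])
            if outline in dictionary:
                output.append(dictionary[outline])
                i += size
                break
        else:
            output.append(strokes[i])  # untranslated stroke
            i += 1
    return output


sample_dictionary = {
    ('HEL',): 'hell',
    ('HEL', 'HRO'): 'hello',
    ('WORLD',): 'world',
}
print(greedy_translate(['HEL', 'HRO', 'WORLD', 'KPA'], sample_dictionary))
# ['hello', 'world', 'KPA']
```

Trying the two-stroke outline before the one-stroke entry is what makes `HEL/HRO` come out as "hello" rather than "hell" followed by an untranslated `HRO`.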