# Dictionary System

Plover's dictionary system stores stenographic translations and looks them up by stroke sequence. It supports multiple file formats, ordered precedence across dictionaries, lookup filters, and real-time updates, and it works with both individual dictionaries and collections of dictionaries.

## Capabilities

### Individual Dictionary Management

Base dictionary class providing core translation storage and lookup, with support for various file formats.

```python { .api }
class StenoDictionary:
    """Base class for stenographic dictionaries."""

    readonly: bool = False
    """Class attribute indicating if dictionary format is read-only."""

    def __init__(self):
        """
        Initialize empty dictionary.

        Creates dictionary instance ready for loading or manual population.
        """

    @classmethod
    def create(cls, resource: str) -> 'StenoDictionary':
        """
        Create new dictionary at specified location.

        Args:
            resource: Path or resource identifier for new dictionary

        Returns:
            New StenoDictionary instance

        Creates empty dictionary file and returns dictionary instance.
        """

    @classmethod
    def load(cls, resource: str) -> 'StenoDictionary':
        """
        Load existing dictionary from file.

        Args:
            resource: Path or resource identifier for dictionary

        Returns:
            Loaded StenoDictionary instance

        Raises:
            FileNotFoundError: If dictionary file doesn't exist
            ValueError: If dictionary format is invalid
        """

    def save(self) -> None:
        """
        Save dictionary to file.

        Writes current dictionary contents to associated file,
        creating backup if necessary.

        Raises:
            PermissionError: If dictionary is readonly
            IOError: If file cannot be written
        """

    @property
    def longest_key(self) -> int:
        """
        Get length of longest stroke sequence.

        Returns:
            Maximum number of strokes in any dictionary entry

        Used for optimizing translation processing.
        """

    def clear(self) -> None:
        """
        Remove all entries from dictionary.

        Clears dictionary contents but retains file association.
        """

    def items(self) -> list:
        """
        Get all dictionary entries.

        Returns:
            List of (strokes_tuple, translation) pairs

        Provides access to complete dictionary contents.
        """

    def update(self, *args, **kwargs) -> None:
        """
        Update dictionary with new entries.

        Args:
            *args: Dictionary or iterable of (key, value) pairs
            **kwargs: Keyword arguments as key-value pairs

        Adds or updates multiple entries efficiently.
        """

    def get(self, key: tuple, fallback=None):
        """
        Get translation with fallback value.

        Args:
            key: Tuple of stroke strings to look up
            fallback: Value to return if key not found

        Returns:
            Translation string or fallback value
        """

    def reverse_lookup(self, value: str) -> list:
        """
        Find stroke sequences for translation.

        Args:
            value: Translation text to search for

        Returns:
            List of stroke tuples that produce the translation

        Searches all entries to find matching translations.
        """

    def casereverse_lookup(self, value: str) -> list:
        """
        Case-insensitive reverse lookup.

        Args:
            value: Translation text to search for (case-insensitive)

        Returns:
            List of stroke tuples for case-insensitive matches
        """
```
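
As a rough illustration of how `longest_key`, `reverse_lookup`, and `casereverse_lookup` relate to the stored entries, here is a minimal in-memory sketch that follows the descriptions above. `ToyStenoDictionary` and the outlines used are hypothetical; this is not Plover's implementation, only one way the documented behavior could be realized.

```python
from collections import defaultdict


class ToyStenoDictionary:
    """Hypothetical in-memory stand-in; not Plover's implementation."""

    def __init__(self):
        self._entries = {}                 # strokes tuple -> translation
        self._reverse = defaultdict(list)  # translation -> list of strokes tuples

    @property
    def longest_key(self):
        # Number of strokes in the longest outline, 0 when empty.
        return max((len(key) for key in self._entries), default=0)

    def __setitem__(self, key, value):
        if key in self._entries:
            # Drop the stale reverse mapping before overwriting.
            self._reverse[self._entries[key]].remove(key)
        self._entries[key] = value
        self._reverse[value].append(key)

    def get(self, key, fallback=None):
        return self._entries.get(key, fallback)

    def reverse_lookup(self, value):
        return list(self._reverse.get(value, []))

    def casereverse_lookup(self, value):
        wanted = value.lower()
        return [key for text, keys in self._reverse.items()
                if text.lower() == wanted for key in keys]


# Illustrative outlines only; each tuple element is one stroke.
toy = ToyStenoDictionary()
toy[('HEL', 'HRO')] = 'hello'
toy[('H-L',)] = 'hello'
toy[('HEL',)] = 'Hello'
```

Keeping a reverse index alongside the forward mapping trades a little memory for reverse lookups that do not scan every entry.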

### Dictionary Interface Methods

Standard dictionary-like interface for direct access to translations.

```python { .api }
def __len__(self) -> int:
    """
    Get number of entries in dictionary.

    Returns:
        Count of translation entries
    """

def __iter__(self):
    """
    Iterate over stroke sequences.

    Yields:
        Stroke tuples for all entries
    """

def __getitem__(self, key: tuple) -> str:
    """
    Get translation for stroke sequence.

    Args:
        key: Tuple of stroke strings

    Returns:
        Translation string

    Raises:
        KeyError: If stroke sequence not found
    """

def __setitem__(self, key: tuple, value: str) -> None:
    """
    Set translation for stroke sequence.

    Args:
        key: Tuple of stroke strings
        value: Translation text

    Adds or updates dictionary entry.
    """

def __delitem__(self, key: tuple) -> None:
    """
    Delete translation entry.

    Args:
        key: Tuple of stroke strings to remove

    Raises:
        KeyError: If stroke sequence not found
    """

def __contains__(self, key: tuple) -> bool:
    """
    Check if stroke sequence exists.

    Args:
        key: Tuple of stroke strings to check

    Returns:
        True if stroke sequence has translation
    """
```
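
The methods above are the standard Python mapping protocol. As a sketch of what that protocol buys you, the hypothetical `ToyMapping` below implements only the five core methods and inherits `__contains__`, `get`, `update`, and `items` from `collections.abc.MutableMapping`. Whether Plover's class is built this way is not stated here; the example only shows the protocol itself.

```python
from collections.abc import MutableMapping


class ToyMapping(MutableMapping):
    """Hypothetical stroke-tuple mapping; not Plover's class."""

    def __init__(self):
        self._entries = {}

    def __getitem__(self, key):
        return self._entries[key]      # raises KeyError when missing

    def __setitem__(self, key, value):
        self._entries[key] = value

    def __delitem__(self, key):
        del self._entries[key]         # raises KeyError when missing

    def __iter__(self):
        return iter(self._entries)

    def __len__(self):
        return len(self._entries)


d = ToyMapping()
d[('O', 'N', 'E')] = 'one'
d.update({('T', 'W', 'O'): 'two'})     # provided by MutableMapping
```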

### Dictionary Collection Management

Collection class managing multiple dictionaries with precedence rules and filtering capabilities.

```python { .api }
class StenoDictionaryCollection:
    """Collection of dictionaries with precedence and filtering."""

    def __init__(self, dicts: list = []):
        """
        Initialize dictionary collection.

        Args:
            dicts: List of StenoDictionary instances in precedence order

        Higher precedence dictionaries appear earlier in list.
        """

    @property
    def longest_key(self) -> int:
        """
        Get longest key across all dictionaries.

        Returns:
            Maximum stroke sequence length across collection
        """

    def set_dicts(self, dicts: list) -> None:
        """
        Set dictionary list with precedence order.

        Args:
            dicts: List of StenoDictionary instances

        Replaces current dictionary collection.
        """

    def lookup(self, key: tuple) -> str:
        """
        Look up translation with precedence and filters.

        Args:
            key: Tuple of stroke strings

        Returns:
            Translation from highest precedence dictionary

        Searches dictionaries in order, applying filters.
        """

    def raw_lookup(self, key: tuple) -> str:
        """
        Look up translation without filters.

        Args:
            key: Tuple of stroke strings

        Returns:
            Raw translation from highest precedence dictionary

        Bypasses all dictionary filters.
        """

    def lookup_from_all(self, key: tuple) -> list:
        """
        Look up from all dictionaries.

        Args:
            key: Tuple of stroke strings

        Returns:
            List of (dictionary_path, translation) tuples

        Returns matches from all dictionaries regardless of precedence.
        """

    def raw_lookup_from_all(self, key: tuple) -> list:
        """
        Raw lookup from all dictionaries.

        Args:
            key: Tuple of stroke strings

        Returns:
            List of (dictionary_path, translation) tuples

        Bypasses filters and returns all matches.
        """

    def reverse_lookup(self, value: str) -> list:
        """
        Reverse lookup across all dictionaries.

        Args:
            value: Translation text to find strokes for

        Returns:
            List of stroke tuples from all dictionaries
        """

    def casereverse_lookup(self, value: str) -> list:
        """
        Case-insensitive reverse lookup across all dictionaries.

        Args:
            value: Translation text (case-insensitive)

        Returns:
            List of stroke tuples for case-insensitive matches
        """

    def first_writable(self) -> StenoDictionary:
        """
        Get first writable dictionary in collection.

        Returns:
            First dictionary that is not readonly

        Raises:
            ValueError: If no writable dictionaries available

        Used for adding new translations.
        """

    def set(self, key: tuple, value: str, path: str = None) -> None:
        """
        Set translation in specified or first writable dictionary.

        Args:
            key: Tuple of stroke strings
            value: Translation text
            path: Specific dictionary path, uses first writable if None

        Adds translation to specified dictionary or first writable.
        """

    def save(self, path_list: list = None) -> None:
        """
        Save dictionaries to files.

        Args:
            path_list: List of paths to save, saves all if None

        Saves specified dictionaries or all writable dictionaries.
        """

    def get(self, path: str) -> StenoDictionary:
        """
        Get dictionary by file path.

        Args:
            path: File path of dictionary to retrieve

        Returns:
            StenoDictionary instance for specified path

        Raises:
            KeyError: If dictionary not found
        """

    def __getitem__(self, path: str) -> StenoDictionary:
        """
        Get dictionary by path using subscript notation.

        Args:
            path: File path of dictionary

        Returns:
            StenoDictionary instance
        """

    def __iter__(self):
        """
        Iterate over all dictionaries.

        Yields:
            StenoDictionary instances in precedence order
        """
```
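
To make the interplay of `lookup_from_all`, `first_writable`, and `set` more concrete, here is a small sketch that follows the docstrings above using plain Python objects. `ToyDict`, `set_entry`, and the outlines are hypothetical stand-ins, and the error type simply mirrors the one documented above; treat it as an illustration rather than the library's code.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDict:
    """Hypothetical stand-in for a loaded dictionary."""
    path: str
    entries: dict = field(default_factory=dict)
    readonly: bool = False


def lookup_from_all(dicts, key):
    # Collect every match, keeping precedence (list) order.
    return [(d.path, d.entries[key]) for d in dicts if key in d.entries]


def first_writable(dicts):
    for d in dicts:
        if not d.readonly:
            return d
    raise ValueError('no writable dictionary')  # mirrors the documented error


def set_entry(dicts, key, value, path=None):
    # Write to the named dictionary, or to the first writable one.
    if path is None:
        target = first_writable(dicts)
    else:
        target = next(d for d in dicts if d.path == path)
    target.entries[key] = value


system = ToyDict('main.json', {('TEFT',): 'test'}, readonly=True)
user = ToyDict('user.json', {('TEFT',): 'examination'})
dicts = [system, user]
set_entry(dicts, ('KUFT',), 'custom')  # lands in user.json, the first writable
```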

### Dictionary Filtering

Filter system for excluding entries from dictionary lookups with custom logic.

```python { .api }
def add_filter(self, f) -> None:
    """
    Add dictionary filter function.

    Args:
        f: Filter function taking (strokes, translation) -> bool

    A filter that returns a true value rejects the entry during lookup;
    lookup then continues with the next dictionary in precedence order.
    """

def remove_filter(self, f) -> None:
    """
    Remove dictionary filter function.

    Args:
        f: Filter function to remove

    Removes previously added filter from the processing chain.
    """
```

### Dictionary Loading Functions

Utility functions for creating and loading dictionaries with format detection.

```python { .api }
def create_dictionary(resource: str, threaded_save: bool = True) -> StenoDictionary:
    """
    Create new dictionary with format detection.

    Args:
        resource: Path or resource identifier for dictionary
        threaded_save: Whether to use threaded saving for performance

    Returns:
        New StenoDictionary instance of appropriate format

    Detects format from file extension and creates appropriate dictionary type.
    """

def load_dictionary(resource: str, threaded_save: bool = True) -> StenoDictionary:
    """
    Load dictionary with automatic format detection.

    Args:
        resource: Path or resource identifier for dictionary
        threaded_save: Whether to use threaded saving for performance

    Returns:
        Loaded StenoDictionary instance of detected format

    Automatically determines format and creates appropriate dictionary instance.
    """
```
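
Format detection by file extension can be pictured as a lookup table keyed on the suffix. The sketch below is an assumption about the general idea only: `FORMATS` and `detect_format` are hypothetical names, and Plover's actual mechanism for mapping extensions to dictionary classes is not shown in this document.

```python
from pathlib import Path

# Hypothetical registry: file extension -> format description.
FORMATS = {
    'json': 'JSON dictionary',
    'rtf': 'RTF/CRE dictionary',
}


def detect_format(resource):
    """Pick a format from the resource's file extension (case-insensitive)."""
    extension = Path(resource).suffix[1:].lower()
    try:
        return FORMATS[extension]
    except KeyError:
        raise ValueError(f'unsupported dictionary extension: {extension!r}') from None
```

With this table, `detect_format('/path/to/new_dict.json')` selects the JSON format, and an unknown or missing extension raises `ValueError`.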

## Supported Dictionary Formats

### JSON Dictionary Format

A standard JSON object whose keys are stroke sequences, written as strokes joined by `/`, and whose values are translations.

- **File Extension**: `.json`
- **Format**: `{"STROKE/SEQUENCE": "translation"}`
- **Characteristics**: Human-readable, easily editable, full Unicode support
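
The relationship between the on-disk `"STROKE/SEQUENCE"` keys and the stroke tuples used by the API can be sketched with the standard `json` module. The helper names and sample outlines below are hypothetical, and this is only a minimal round trip, not Plover's loader.

```python
import json

# Illustrative file contents; the outlines are examples only.
raw = '{"HEL/HRO": "hello", "TEFT": "test"}'


def parse_json_dictionary(text):
    # Split each "STROKE/SEQUENCE" key into a tuple of strokes.
    return {tuple(key.split('/')): value
            for key, value in json.loads(text).items()}


def dump_json_dictionary(entries):
    # Join each stroke tuple back into a "/"-separated key.
    # ensure_ascii=False keeps non-ASCII translations readable in the file.
    return json.dumps({'/'.join(key): value for key, value in entries.items()},
                      ensure_ascii=False, sort_keys=True)


entries = parse_json_dictionary(raw)
```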

### RTF/CRE Dictionary Format

Rich Text Format adapted for stenographic dictionaries.

- **File Extension**: `.rtf`
- **Format**: RTF document with embedded stenographic data
- **Characteristics**: Compatible with commercial stenography software

## Usage Examples

```python
from plover.steno_dictionary import StenoDictionary, StenoDictionaryCollection
from plover.dictionary.base import create_dictionary, load_dictionary

# Create new dictionary
# Keys are tuples of strokes: each tuple element is one stroke
new_dict = create_dictionary('/path/to/new_dict.json')
new_dict[('H', 'E', 'L', 'O')] = 'hello'
new_dict.save()

# Load existing dictionary
existing_dict = load_dictionary('/path/to/existing_dict.json')
translation = existing_dict[('W', 'O', 'R', 'L', 'D')]  # KeyError if missing

# Work with dictionary collection
dict1 = load_dictionary('/path/to/main.json')
dict2 = load_dictionary('/path/to/user.json')
collection = StenoDictionaryCollection([dict1, dict2])

# Look up translations
translation = collection.lookup(('T', 'E', 'S', 'T'))
all_matches = collection.lookup_from_all(('T', 'E', 'S', 'T'))

# Reverse lookup
strokes = collection.reverse_lookup('hello')
# Example result (depends on the loaded dictionaries):
# [('H', 'E', 'L', 'O'), ('H', 'E', 'L', '*')]

# Add new translation
collection.set(('K', 'U', 'S', 'T', 'O', 'M'), 'custom')

# Add dictionary filter: returning True rejects the entry during lookup
def filter_short_translations(strokes, translation):
    return len(translation) <= 2

collection.add_filter(filter_short_translations)

# Dictionary operations
print(f"Dictionary has {len(dict1)} entries")
print(f"Longest stroke sequence: {dict1.longest_key}")

# Iterate over entries
for strokes, translation in dict1.items():
    print(f"{'/'.join(strokes)} -> {translation}")

# Check for entries
if ('T', 'E', 'S', 'T') in dict1:
    print("Test entry exists")

# Update multiple entries
dict1.update({
    ('O', 'N', 'E'): 'one',
    ('T', 'W', 'O'): 'two',
    ('T', 'H', 'R', 'E', 'E'): 'three'
})

# Save changes
dict1.save()
collection.save()  # Saves all writable dictionaries
```

## Dictionary Precedence

In dictionary collections, precedence determines which translation is returned when multiple dictionaries contain the same stroke sequence:

1. **First Match Wins**: The first dictionary in the collection list that contains a translation wins
2. **User Dictionaries First**: Typically user dictionaries are placed before system dictionaries
3. **Specific Before General**: More specific dictionaries should precede general ones

```python
from plover.dictionary.base import load_dictionary
from plover.steno_dictionary import StenoDictionaryCollection

# Precedence example
main_dict = load_dictionary('main.json')  # Contains: TEST -> "test"
user_dict = load_dictionary('user.json')  # Contains: TEST -> "examination"

# User dictionary first = user translation wins
collection = StenoDictionaryCollection([user_dict, main_dict])
result = collection.lookup(('T', 'E', 'S', 'T'))  # Returns "examination"

# Main dictionary first = main translation wins
collection = StenoDictionaryCollection([main_dict, user_dict])
result = collection.lookup(('T', 'E', 'S', 'T'))  # Returns "test"
```
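
The first-match-wins rule can be tried without any dictionary files by using Python's standard `collections.ChainMap`, which resolves a key from the first mapping that contains it. This is an analogy using plain dicts, not how Plover implements collections; the entry data is made up to mirror the example above.

```python
from collections import ChainMap

# Plain dicts standing in for loaded dictionaries (hypothetical data).
main_entries = {('T', 'E', 'S', 'T'): 'test', ('O', 'N', 'E'): 'one'}
user_entries = {('T', 'E', 'S', 'T'): 'examination'}

# Earlier mappings take precedence, like earlier dictionaries in a collection.
user_first = ChainMap(user_entries, main_entries)
main_first = ChainMap(main_entries, user_entries)

print(user_first[('T', 'E', 'S', 'T')])  # examination
print(main_first[('T', 'E', 'S', 'T')])  # test
print(user_first[('O', 'N', 'E')])       # one (falls through to main_entries)
```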

## Dictionary Filtering

Filters change lookup behavior without changing dictionary files. Each filter is called with the stroke sequence and the candidate translation; if any filter returns a true value, that entry is skipped and lookup continues with the next dictionary. Filters can only reject entries; they cannot rewrite the translation.

```python
def short_translation_filter(strokes, translation):
    """Reject translations of 3 characters or fewer."""
    return len(translation) <= 3

def multi_stroke_filter(strokes, translation):
    """Reject multi-stroke entries, keeping only single-stroke ones."""
    return len(strokes) != 1

# `collection` is a StenoDictionaryCollection, as in the usage examples
collection.add_filter(short_translation_filter)
collection.add_filter(multi_stroke_filter)
```
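
How a filter chain interacts with precedence is easiest to see in a small model. The sketch below assumes the predicate convention, in which a filter returning a true value rejects the entry and the search moves on to the next dictionary. `filtered_lookup`, `reject_short`, and the sample data are hypothetical, and plain dicts stand in for loaded dictionaries.

```python
def filtered_lookup(dicts, key, filters=()):
    """Return the first translation that no filter rejects, else None."""
    for entries in dicts:  # highest precedence first
        value = entries.get(key)
        if value and not any(f(key, value) for f in filters):
            return value
    return None


def reject_short(strokes, translation):
    # A true return value means "skip this entry".
    return len(translation) <= 3


user_entries = {('TEFT',): 'tst'}
main_entries = {('TEFT',): 'test'}
dicts = [user_entries, main_entries]

print(filtered_lookup(dicts, ('TEFT',)))                  # tst
print(filtered_lookup(dicts, ('TEFT',), [reject_short]))  # test
```

Without filters the higher-precedence entry wins; with `reject_short` in place that entry is skipped and the lower-precedence dictionary supplies the result.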

## Types

```python { .api }
from typing import Dict, List, Tuple, Optional, Union, Callable, Any
from pathlib import Path

from plover.steno_dictionary import StenoDictionary

StrokeSequence = Tuple[str, ...]
Translation = str
DictionaryEntry = Tuple[StrokeSequence, Translation]
DictionaryItems = List[DictionaryEntry]

DictionaryPath = Union[str, Path]
DictionaryResource = Union[str, Path]

# A filter returns True to reject the entry during lookup
FilterFunction = Callable[[StrokeSequence, Translation], bool]
FilterList = List[FilterFunction]

LookupResult = Optional[Translation]
LookupResults = List[Tuple[DictionaryPath, Translation]]
ReverseLookupResults = List[StrokeSequence]

DictionaryList = List[StenoDictionary]
DictionaryDict = Dict[DictionaryPath, StenoDictionary]
```