# Stenographic Data Models

Plover's stenographic data system provides the core data structures for representing stenographic strokes, translations, and formatting. It supports stroke normalization, validation, conversion between notation formats, and integration with different stenotype systems.

## Capabilities

### Stroke Representation

The primary data structure for representing an individual stenographic stroke, with support for several input formats and for normalization.

```python { .api }
class Stroke:
    """Primary stenographic stroke representation."""

    PREFIX_STROKE: 'Stroke' = None
    """Special prefix stroke for system initialization."""

    UNDO_STROKE: 'Stroke' = None
    """Special undo stroke for correction operations."""

    @classmethod
    def setup(cls, keys: tuple, implicit_hyphen_keys: frozenset,
              number_key: str, numbers: dict, feral_number_key: str,
              undo_stroke: str) -> None:
        """
        Setup stroke system with stenotype system parameters.

        Args:
            keys: Available stenographic keys in order
            implicit_hyphen_keys: Keys that imply hyphen placement
            number_key: Key used for number mode
            numbers: Mapping of keys to numbers
            feral_number_key: Alternative number key
            undo_stroke: Stroke pattern for undo operation

        Configures global stroke processing for specific stenotype system.
        """

    @classmethod
    def from_steno(cls, steno: str) -> 'Stroke':
        """
        Create stroke from steno notation string.

        Args:
            steno: Stenographic notation (e.g., 'STKPW')

        Returns:
            Stroke instance representing the notation

        Parses various steno notation formats into normalized stroke.
        """

    @classmethod
    def from_keys(cls, keys: set) -> 'Stroke':
        """
        Create stroke from set of pressed keys.

        Args:
            keys: Set of key strings that were pressed

        Returns:
            Stroke instance for the key combination

        Converts raw key presses into stenographic stroke.
        """

    @classmethod
    def from_integer(cls, integer: int) -> 'Stroke':
        """
        Create stroke from integer representation.

        Args:
            integer: Integer with bits representing pressed keys

        Returns:
            Stroke instance for the bit pattern

        Converts bit-packed stroke data into stroke object.
        """

    @classmethod
    def normalize_stroke(cls, steno: str, strict: bool = True) -> str:
        """
        Normalize stroke notation to standard format.

        Args:
            steno: Stroke notation to normalize
            strict: Whether to enforce strict validation

        Returns:
            Normalized stroke notation string

        Raises:
            ValueError: If stroke notation is invalid and strict=True
        """

    @classmethod
    def normalize_steno(cls, steno: str, strict: bool = True) -> str:
        """
        Normalize complete steno notation (multiple strokes).

        Args:
            steno: Multi-stroke notation to normalize
            strict: Whether to enforce strict validation

        Returns:
            Normalized steno notation string

        Processes stroke sequences separated by delimiters.
        """

    @classmethod
    def steno_to_sort_key(cls, steno: str, strict: bool = True) -> tuple:
        """
        Create sort key for steno notation.

        Args:
            steno: Steno notation to create sort key for
            strict: Whether to enforce strict validation

        Returns:
            Tuple suitable for sorting steno notations

        Enables consistent alphabetical sorting of stenographic notations.
        """

    @property
    def steno_keys(self) -> tuple:
        """
        Get stenographic keys in this stroke.

        Returns:
            Tuple of key strings in stenographic order

        Provides access to the constituent keys of the stroke.
        """

    @property
    def rtfcre(self) -> str:
        """
        Get RTF/CRE format representation.

        Returns:
            Stroke in RTF/CRE dictionary format

        Converts stroke to format used in RTF stenographic dictionaries.
        """

    @property
    def is_correction(self) -> bool:
        """
        Check if stroke is a correction stroke.

        Returns:
            True if stroke represents correction/undo operation

        Identifies strokes used for undoing previous translations.
        """
```
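
The `from_keys` and `from_integer` constructors describe a stroke as a set of pressed keys or as an integer whose bits mark those keys. The sketch below illustrates that idea in plain Python, independent of Plover. The names `KEYS`, `keys_to_int`, and `int_to_keys` are hypothetical, and the bit order (first layout key in the least-significant bit, number key `#` listed first) is an assumption made for the example, not a statement about Plover's internals.

```python
# Illustrative only: a bit-packed stroke encoding, not Plover's implementation.
# Assumed layout: English Stenotype key names with the number key '#' first.
KEYS = ('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
        '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')

# One bit per key, in layout order.
KEY_BIT = {key: 1 << index for index, key in enumerate(KEYS)}


def keys_to_int(keys):
    """Pack a set of key names into one integer, one bit per key."""
    value = 0
    for key in keys:
        value |= KEY_BIT[key]  # KeyError for keys outside the layout
    return value


def int_to_keys(value):
    """Unpack an integer into key names, in layout (steno) order."""
    return tuple(key for key in KEYS if value & KEY_BIT[key])


packed = keys_to_int({'-B', 'S-', '-P', 'T-'})
print(int_to_keys(packed))  # ('S-', 'T-', '-P', '-B')
```

Because the keys are recovered by walking the layout, the result comes back in steno order regardless of the order in which the keys were supplied.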

### Utility Functions

Standalone functions for stenographic data processing and manipulation.

```python { .api }
def normalize_stroke(steno: str, strict: bool = True) -> str:
    """
    Normalize individual stroke notation.

    Args:
        steno: Stroke notation to normalize
        strict: Whether to enforce strict validation

    Returns:
        Normalized stroke notation

    Standalone function for stroke normalization without class context.
    """

def normalize_steno(steno: str, strict: bool = True) -> str:
    """
    Normalize multi-stroke steno notation.

    Args:
        steno: Multi-stroke notation to normalize
        strict: Whether to enforce strict validation

    Returns:
        Normalized steno notation

    Processes complete stenographic phrases with multiple strokes.
    """

def steno_to_sort_key(steno: str, strict: bool = True) -> tuple:
    """
    Create sort key for steno notation.

    Args:
        steno: Steno notation to create sort key for
        strict: Whether to enforce strict validation

    Returns:
        Tuple for consistent sorting

    Enables alphabetical sorting of stenographic entries.
    """

def sort_steno_strokes(strokes_list: list) -> list:
    """
    Sort list of steno strokes alphabetically.

    Args:
        strokes_list: List of steno notation strings

    Returns:
        Sorted list of steno notations

    Uses stenographic sort order rather than ASCII order.
    """
```
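
To illustrate what "stenographic sort order rather than ASCII order" can mean, the sketch below orders strokes by the position of their keys in the layout. It is a plain-Python illustration under stated assumptions, not the implementation of `steno_to_sort_key` or `sort_steno_strokes`: `KEYS`, `KEY_INDEX`, and `steno_order_key` are hypothetical names, and strokes are represented here as tuples of key names.

```python
# Illustrative only: ordering strokes by layout position instead of ASCII.
KEYS = ('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
        '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')
KEY_INDEX = {key: index for index, key in enumerate(KEYS)}


def steno_order_key(stroke_keys):
    """Sort key: the layout indices of the stroke's keys, in steno order."""
    return tuple(sorted(KEY_INDEX[key] for key in stroke_keys))


strokes = [('K-',), ('S-', 'T-'), ('T-',), ('S-',)]

# Steno order: S- comes before T-, which comes before K-.
print(sorted(strokes, key=steno_order_key))
# [('S-',), ('S-', 'T-'), ('T-',), ('K-',)]

# Plain tuple/ASCII order puts K- first instead.
print(sorted(strokes))
# [('K-',), ('S-',), ('S-', 'T-'), ('T-',)]
```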

## Stenographic Notation Formats

### Standard Steno Notation
Basic stenographic notation using key letters, written in steno order.

**Format**: `STKPWHRAO*EUFRPBLGTSDZ`
**Examples**:
- `HEL` - Simple stroke (initial `H`, vowel `E`, final `L`)
- `STKPW` - Multiple consonants
- `AO` - Vowel combination
- `*` - Asterisk for corrections

### Hyphenated Notation
Explicit hyphen notation separating initial and final consonants.

**Format**: `S-T` (initial-final)
**Examples**:
- `ST-PB` - Initial ST, final PB
- `STKPW-R` - Initial STKPW, final R
- `-T` - Final consonant only
- `S-` - Initial consonant only
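
The way a hyphen resolves which side a letter belongs to can be sketched as a single left-to-right pass over the layout. This is a simplified illustration rather than Plover's parser: `KEYS`, `FIRST_RIGHT`, and `steno_to_keys` are hypothetical names, the number key is left out, and the English layout is assumed.

```python
# Illustrative only: split one stroke of notation into key names.
KEYS = ('S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-', '*',
        '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')

# Index of the first key written with a leading hyphen.
FIRST_RIGHT = next(i for i, key in enumerate(KEYS) if key.startswith('-'))


def steno_to_keys(steno):
    """Turn notation such as 'ST-PB' into key names such as 'S-' and '-P'."""
    keys = []
    position = 0  # index of the next layout key that may still be matched
    for char in steno:
        if char == '-':
            # An explicit hyphen skips ahead to the right-hand keys.
            position = max(position, FIRST_RIGHT)
            continue
        for index in range(position, len(KEYS)):
            if KEYS[index].strip('-') == char:
                keys.append(KEYS[index])
                position = index + 1
                break
        else:
            raise ValueError(f'invalid or out-of-order key {char!r} in {steno!r}')
    return tuple(keys)


print(steno_to_keys('ST-PB'))  # ('S-', 'T-', '-P', '-B')
print(steno_to_keys('-T'))     # ('-T',)
print(steno_to_keys('STKPW'))  # ('S-', 'T-', 'K-', 'P-', 'W-')
```

Under this sketch an ordinary word such as `HELLO` is rejected, because the second `L` and the `O` have no remaining key to match in steno order.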

### RTF/CRE Format
Format used in RTF stenographic dictionaries.

**Format**: Special escaping and formatting for RTF compatibility
**Examples**:
- Standard strokes maintain basic format
- Special characters are escaped
- Number mode indicated with `#`

### Number Mode
Special notation for numeric input.

**Format**: `#` prefix indicates number mode
**Examples**:
- `#S` - Number 1
- `#T` - Number 2
- `#STPHA` - Number 12345 (the English number mapping assigns digits to `S`, `T`, `P`, `H`, and `A`, not to `K` or `W`)
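
As a rough illustration of number mode, the sketch below applies the English key-to-digit mapping shown under "Stroke System Setup" to a stroke given as key names. `NUMBERS` copies that mapping; `number_mode_text` is a hypothetical helper, and rendering unmapped keys as plain letters is an assumption of this sketch rather than documented Plover behavior.

```python
# Illustrative only: render a stroke's keys with digits when '#' is held.
NUMBERS = {'S-': '1', 'T-': '2', 'P-': '3', 'H-': '4', 'A-': '5',
           'O-': '0', '-F': '6', '-P': '7', '-L': '8', '-T': '9'}


def number_mode_text(keys, number_key='#'):
    """Replace mapped keys with digits if the number key is part of the stroke."""
    if number_key not in keys:
        return ''.join(key.strip('-') for key in keys)
    return ''.join(NUMBERS.get(key, key.strip('-'))
                   for key in keys if key != number_key)


print(number_mode_text(('#', 'S-')))                          # 1
print(number_mode_text(('#', 'S-', 'T-', 'P-', 'H-', 'A-')))  # 12345
print(number_mode_text(('#', 'S-', 'K-')))                    # 1K
print(number_mode_text(('S-', 'T-')))                         # ST
```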

## Usage Examples

```python
from plover.steno import Stroke, normalize_stroke, sort_steno_strokes

# These examples assume the stroke system has already been configured
# for English Stenotype (see "Stroke System Setup").

# Create strokes from different formats
stroke1 = Stroke.from_steno('HEL')
stroke2 = Stroke.from_steno('ST-PB')
stroke3 = Stroke.from_keys({'S-', 'T-', '-P', '-B'})  # Key names as listed in the system's keys

# Access stroke properties
keys = stroke1.steno_keys  # Key names in steno order: 'H-', '-E', '-L'
rtf_format = stroke1.rtfcre  # RTF representation
is_undo = stroke1.is_correction  # False for regular strokes

# Normalize steno notation
normalized = normalize_stroke('S-')  # 'S'
normalized = normalize_stroke('TW-EPBL')  # 'TWEPBL' (hyphen implied by the vowel)
normalized = normalize_stroke('ST-PB')  # 'ST-PB' (hyphen needed)

# Handle multi-stroke notation (strokes separated by '/')
multi = Stroke.normalize_steno('HEL/HRO')

# Create sort keys for steno ordering
sort_key1 = Stroke.steno_to_sort_key('S')
sort_key2 = Stroke.steno_to_sort_key('T')
sort_key1 < sort_key2  # Expected True: S precedes T in steno order

# Sort stroke lists
strokes = ['TKOG', 'KAT', 'HEL', 'S']
sorted_strokes = sort_steno_strokes(strokes)
# Ordering follows sort_steno_strokes' own rule, not plain ASCII order

# Work with correction strokes
undo_stroke = Stroke.from_steno('*')
if undo_stroke.is_correction:
    print("This is an undo stroke")

# Convert between formats
stroke = Stroke.from_steno('STKPW')
keys_set = set(stroke.steno_keys)  # {'S-', 'T-', 'K-', 'P-', 'W-'}
rtf_representation = stroke.rtfcre  # RTF format string

# Handle number mode
number_stroke = Stroke.from_steno('#STPHA')  # Digits 12345 under the English number mapping
number_keys = number_stroke.steno_keys

# Error handling with strict mode
try:
    invalid = normalize_stroke('INVALID_KEYS', strict=True)
except ValueError as e:
    print(f"Invalid stroke: {e}")

# Lenient mode for parsing
maybe_valid = normalize_stroke('MAYBE_VALID', strict=False)
```

## Stroke System Setup

The stroke system must be configured for the specific stenotype system in use:

```python
from plover.steno import Stroke

# Example setup for English Stenotype system
Stroke.setup(
    # The number key '#' is listed with the other keys so that
    # number_key below refers to a key in the layout.
    keys=('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
          '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z'),
    implicit_hyphen_keys=frozenset(['A-', 'O-', '-E', '-U', '*']),
    number_key='#',
    numbers={'S-': '1', 'T-': '2', 'P-': '3', 'H-': '4', 'A-': '5',
             'O-': '0', '-F': '6', '-P': '7', '-L': '8', '-T': '9'},
    feral_number_key=None,
    undo_stroke='*'
)
```

## Stroke Validation and Normalization

### Validation Rules
- Keys must exist in the configured stenotype system
- Key order must follow stenographic conventions
- A hyphen may be omitted when the stroke contains a key from `implicit_hyphen_keys`; its position is then implied
- Invalid key combinations are rejected in strict mode

### Normalization Process
1. **Case Normalization**: Convert to uppercase
2. **Key Ordering**: Arrange keys in stenographic order
3. **Hyphen Handling**: Keep or add a hyphen only where it is needed to separate initial from final keys
4. **Validation**: Check against system constraints
5. **Format Standardization**: Apply consistent formatting
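
The key-ordering, hyphen, and validation steps can be sketched together in a few lines. This is a minimal illustration under stated assumptions, not Plover's normalizer: `KEYS` and `IMPLICIT_HYPHEN_KEYS` copy the English values from "Stroke System Setup" (without the number key), and `keys_to_steno` is a hypothetical helper that builds notation from key names.

```python
# Illustrative only: order keys, then add a hyphen only when it is needed.
KEYS = ('S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-', '*',
        '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')
IMPLICIT_HYPHEN_KEYS = frozenset(['A-', 'O-', '-E', '-U', '*'])


def keys_to_steno(keys):
    """Build notation from key names, adding '-' only where it disambiguates."""
    unknown = set(keys) - set(KEYS)
    if unknown:
        raise ValueError(f'unknown keys: {sorted(unknown)}')
    ordered = [key for key in KEYS if key in keys]  # steno order
    # A hyphen is needed if there is a right-hand key and no key that
    # already implies where the hyphen would go.
    needs_hyphen = (any(key.startswith('-') for key in ordered)
                    and not IMPLICIT_HYPHEN_KEYS.intersection(ordered))
    steno = ''
    for key in ordered:
        if needs_hyphen and key.startswith('-'):
            steno += '-'
            needs_hyphen = False
        steno += key.strip('-')
    return steno


print(keys_to_steno({'-B', 'S-', '-P', 'T-'}))                # ST-PB
print(keys_to_steno({'T-', 'W-', '-E', '-P', '-B', '-L'}))    # TWEPBL
print(keys_to_steno({'-T'}))                                  # -T
print(keys_to_steno({'S-'}))                                  # S
```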

### Error Handling
```python
from plover.steno import normalize_stroke

# `strict` is a parameter of the normalization functions;
# `Stroke.from_steno` as listed above takes only the steno string.

# Strict mode - raises ValueError for invalid input
try:
    steno = normalize_stroke('INVALID', strict=True)
except ValueError as e:
    print(f"Invalid stroke: {e}")

# Lenient mode - attempts best-effort parsing instead of raising
steno = normalize_stroke('maybe_valid', strict=False)
```

## Integration with Stenotype Systems

### System Configuration
Different stenotype systems have different key layouts and rules:

- **English Stenotype**: Standard 23-key layout
- **Grandjean**: Alternative key arrangement
- **Ireland**: Modified key layout
- **Michela**: Italian stenotype system
- **Custom Systems**: User-defined layouts

### Key Layout Variations

```python
# English Stenotype standard layout (23 keys, including the number key '#')
ENGLISH_KEYS = ('#', 'S-', 'T-', 'K-', 'P-', 'W-', 'H-', 'R-', 'A-', 'O-',
                '*', '-E', '-U', '-F', '-R', '-P', '-B', '-L', '-G', '-T', '-S', '-D', '-Z')

# Custom system example
CUSTOM_KEYS = ('Q-', 'W-', 'E-', 'R-', 'T-', 'A-', 'S-',
               '*', '-D', '-F', '-G', '-H', '-J', '-K', '-L')
```
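
Both layouts above name left-hand keys with a trailing hyphen and right-hand keys with a leading hyphen, and list the left-hand keys first. A custom layout could be sanity-checked against that convention before it is passed to `Stroke.setup`. The helper below is a hypothetical sketch: `check_layout` is not part of Plover, and the rules it enforces are inferred from the examples rather than taken from a specification.

```python
# Illustrative only: check a layout against the naming convention used above.
def check_layout(keys):
    """Raise ValueError unless names are unique and left keys precede right keys."""
    if len(set(keys)) != len(keys):
        raise ValueError('duplicate key names')
    seen_right = False
    for key in keys:
        if key.startswith('-'):
            seen_right = True
        elif key.endswith('-') and seen_right:
            raise ValueError(f'left-hand key {key!r} listed after a right-hand key')


CUSTOM_KEYS = ('Q-', 'W-', 'E-', 'R-', 'T-', 'A-', 'S-',
               '*', '-D', '-F', '-G', '-H', '-J', '-K', '-L')

check_layout(CUSTOM_KEYS)  # passes silently
```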

## Types

```python { .api }
from typing import Set, Tuple, List, Dict, Optional, Union, FrozenSet

StenoKey = str
StenoKeys = Tuple[StenoKey, ...]
StenoKeysSet = Set[StenoKey]
StenoNotation = str
StenoSequence = str

StrokeList = List[Stroke]
StenoList = List[StenoNotation]

KeyLayout = Tuple[StenoKey, ...]
ImplicitHyphenKeys = FrozenSet[StenoKey]
NumberMapping = Dict[StenoKey, str]

SortKey = Tuple[int, ...]
StrokeInteger = int

ValidationResult = Union[StenoNotation, None]
NormalizationResult = StenoNotation
```

### Translation Processing

Core classes for handling stenographic translation from strokes to text output.

```python { .api }
class Translation:
    """Data model for mapping between stroke sequences and text strings."""

    strokes: List[Stroke]
    rtfcre: Tuple[str, ...]
    english: str
    replaced: List['Translation']
    formatting: List
    is_retrospective_command: bool

    def __init__(self, outline: List[Stroke], translation: str) -> None:
        """
        Create translation from stroke outline and text.

        Args:
            outline: List of Stroke objects forming the translation
            translation: Text string result of the translation

        Creates translation mapping with formatting state and undo support.
        """

    def has_undo(self) -> bool:
        """
        Check if translation can be undone.

        Returns:
            True if translation supports undo operation

        Determines if translation has formatting state allowing reversal.
        """

class Translator:
    """State machine converting stenographic strokes to translation stream."""

    def __init__(self) -> None:
        """Initialize translator with empty state and default dictionary."""

    def translate(self, stroke: Stroke) -> List[Translation]:
        """
        Process stroke and return resulting translations.

        Args:
            stroke: Stenographic stroke to process

        Returns:
            List of translation objects (corrections and new translations)

        Maintains translation state and applies greedy matching algorithm.
        """

    def set_dictionary(self, dictionary) -> None:
        """
        Set stenographic dictionary for translation lookups.

        Args:
            dictionary: StenoDictionaryCollection for translations

        Updates translation source and resets internal state.
        """

    def add_listener(self, callback) -> None:
        """
        Add callback for translation events.

        Args:
            callback: Function receiving translation updates

        Registers listener for translation state changes.
        """

    def remove_listener(self, callback) -> None:
        """Remove previously added translation listener."""

    def set_min_undo_length(self, min_undo_length: int) -> None:
        """
        Set minimum number of strokes kept for undo operations.

        Args:
            min_undo_length: Minimum strokes to retain in history
        """

class Formatter:
    """Converts translations into formatted output with proper spacing and capitalization."""

    def __init__(self) -> None:
        """Initialize formatter with default output settings."""

    def format(self, undo: List[Translation], do: List[Translation], prev: List[Translation]) -> None:
        """
        Format translation sequence with undo and new translations.

        Args:
            undo: Translations to undo (backspace operations)
            do: New translations to format and output
            prev: Previous translation context for formatting state

        Processes translation formatting including spacing, capitalization,
        and special formatting commands.
        """

    def set_output(self, output) -> None:
        """
        Set output interface for formatted text delivery.

        Args:
            output: Output object with send_string, send_backspaces methods

        Configures destination for formatted stenographic output.
        """

    def add_listener(self, callback) -> None:
        """
        Add listener for formatting events.

        Args:
            callback: Function receiving formatting updates
        """

    def remove_listener(self, callback) -> None:
        """Remove formatting event listener."""

    def set_space_placement(self, placement: str) -> None:
        """
        Configure space placement relative to words.

        Args:
            placement: 'Before Output' or 'After Output'

        Controls whether spaces appear before or after stenographic output.
        """
```
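
The "greedy matching" mentioned for `Translator.translate` can be illustrated with a small longest-match lookup over a dictionary keyed by outlines. This is a simplified batch sketch, not Plover's translator, which processes one stroke at a time and can revise earlier translations. `translate_greedy` and the sample dictionary entries are hypothetical, and strokes are plain strings here rather than `Stroke` objects.

```python
# Illustrative only: greedy longest-match translation over a list of strokes.
def translate_greedy(strokes, dictionary):
    """Translate stroke strings, always taking the longest matching outline."""
    longest_key = max((len(outline) for outline in dictionary), default=1)
    output = []
    index = 0
    while index < len(strokes):
        # Try the longest candidate outline first, then shorter ones.
        for size in range(min(longest_key, len(strokes) - index), 0, -1):
            outline = tuple(strokes[index:index + size])
            if outline in dictionary:
                output.append(dictionary[outline])
                index += size
                break
        else:
            # No entry found: emit the raw stroke as untranslated steno.
            output.append(strokes[index])
            index += 1
    return output


# Hypothetical sample entries, keyed by tuples of stroke strings.
dictionary = {
    ('KAT',): 'cat',
    ('KAT', 'HROG'): 'catalog',
    ('TKOG',): 'dog',
}

print(translate_greedy(['KAT', 'HROG', 'TKOG', 'STPH'], dictionary))
# ['catalog', 'dog', 'STPH']
```

Taking the longest match first is what lets the two-stroke outline win over the single-stroke entry for its first stroke.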