# Configuration and Parameters

The configuration system gives fine-grained control over OCR behavior: page segmentation modes, OCR engine modes, variable settings, language management, and performance tuning.

## Capabilities

### Page Segmentation Modes

Control how Tesseract analyzes page layout and identifies text regions.

```java { .api }
// Page Segmentation Mode Constants
public static final int PSM_OSD_ONLY = 0; // Orientation and script detection only
public static final int PSM_AUTO_OSD = 1; // Automatic page segmentation with OSD
public static final int PSM_AUTO_ONLY = 2; // Automatic page segmentation, no OSD or OCR
public static final int PSM_AUTO = 3; // Fully automatic page segmentation, no OSD (command-line default)
public static final int PSM_SINGLE_COLUMN = 4; // Single column of text
public static final int PSM_SINGLE_BLOCK_VERT_TEXT = 5; // Single uniform block of vertical text
public static final int PSM_SINGLE_BLOCK = 6; // Single uniform block of text
public static final int PSM_SINGLE_LINE = 7; // Single text line
public static final int PSM_SINGLE_WORD = 8; // Single word
public static final int PSM_CIRCLE_WORD = 9; // Single word in a circle
public static final int PSM_SINGLE_CHAR = 10; // Single character
public static final int PSM_SPARSE_TEXT = 11; // Sparse text in no particular order
public static final int PSM_SPARSE_TEXT_OSD = 12; // Sparse text with OSD
public static final int PSM_RAW_LINE = 13; // Single text line, bypassing Tesseract-specific hacks

/**
 * Set page segmentation mode
 * @param mode PSM constant (PSM_AUTO, PSM_SINGLE_BLOCK, etc.)
 */
public void SetPageSegMode(int mode);

/**
 * Get current page segmentation mode
 * @return Current PSM mode
 */
public int GetPageSegMode();
```

**Page Segmentation Example:**

```java
import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.leptonica.PIX;
import org.bytedeco.tesseract.TessBaseAPI;
// Older presets name the Leptonica global class `lept` instead of `leptonica`
import static org.bytedeco.leptonica.global.leptonica.*;
import static org.bytedeco.tesseract.global.tesseract.*;

TessBaseAPI api = new TessBaseAPI();
if (api.Init(null, "eng") != 0) {
    System.err.println("Could not initialize Tesseract");
    System.exit(1);
}

// Pick ONE mode: each SetPageSegMode call overrides the previous one.
api.SetPageSegMode(PSM_SINGLE_LINE);    // Single line of text (faster, more accurate for simple cases)
// api.SetPageSegMode(PSM_AUTO);        // Automatic layout detection (good for complex documents)
// api.SetPageSegMode(PSM_SINGLE_WORD); // Single word (useful for form fields)

PIX image = pixRead("single-line.png");
api.SetImage(image);
BytePointer text = api.GetUTF8Text();
System.out.println("Text: " + text.getString());

// Release native resources
text.deallocate();
pixDestroy(image);
api.End();
```
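
`GetPageSegMode()` returns a bare integer, which is hard to read in logs. The following is a minimal sketch of a lookup from value to constant name, built only from the constants listed in this section. The `PsmNames` class and its `nameOf` method are hypothetical helpers for illustration, not part of the Tesseract bindings.

```java
// Hypothetical helper for illustration; not part of the Tesseract bindings.
final class PsmNames {
    // Index i holds the name of the PSM constant whose value is i.
    private static final String[] NAMES = {
        "PSM_OSD_ONLY", "PSM_AUTO_OSD", "PSM_AUTO_ONLY", "PSM_AUTO",
        "PSM_SINGLE_COLUMN", "PSM_SINGLE_BLOCK_VERT_TEXT", "PSM_SINGLE_BLOCK",
        "PSM_SINGLE_LINE", "PSM_SINGLE_WORD", "PSM_CIRCLE_WORD", "PSM_SINGLE_CHAR",
        "PSM_SPARSE_TEXT", "PSM_SPARSE_TEXT_OSD", "PSM_RAW_LINE"
    };

    private PsmNames() {}

    /** Returns the constant name for a PSM value, or "UNKNOWN(n)" if out of range. */
    static String nameOf(int mode) {
        if (mode < 0 || mode >= NAMES.length) {
            return "UNKNOWN(" + mode + ")";
        }
        return NAMES[mode];
    }

    public static void main(String[] args) {
        System.out.println(nameOf(7));  // prints PSM_SINGLE_LINE
        System.out.println(nameOf(42)); // prints UNKNOWN(42)
    }
}
```

With a live API object this could be used as `PsmNames.nameOf(api.GetPageSegMode())`.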

### OCR Engine Modes

Select the OCR engine and neural network configuration.

```java { .api }
// OCR Engine Mode Constants
public static final int OEM_TESSERACT_ONLY = 0; // Legacy Tesseract only (deprecated)
public static final int OEM_LSTM_ONLY = 1; // LSTM neural network only (recommended)
public static final int OEM_TESSERACT_LSTM_COMBINED = 2; // Combined legacy + LSTM (deprecated)
public static final int OEM_DEFAULT = 3; // Default (auto-detect best available)

/**
 * Initialize with specific OCR engine mode
 * @param datapath Path to tessdata directory
 * @param language Language code
 * @param oem OCR Engine Mode
 * @return 0 on success, -1 on failure
 */
public int Init(String datapath, String language, int oem);
```

**Engine Mode Selection Example:**

```java
// Option 1: LSTM-only engine for best accuracy (recommended)
if (api.Init(null, "eng", OEM_LSTM_ONLY) != 0) {
    System.err.println("Could not initialize with LSTM engine");
}

// Option 2: default engine (auto-detect)
if (api.Init(null, "eng", OEM_DEFAULT) != 0) {
    System.err.println("Could not initialize with default engine");
}

// Engine availability depends on the installed tessdata files, so check
// the return value of Init() rather than assuming a mode is supported.
```

### Page Iterator Level Constants

Control iterator navigation granularity for result analysis.

```java { .api }
// Page Iterator Level Constants
public static final int RIL_BLOCK = 0; // Block of text/image/separator line
public static final int RIL_PARA = 1; // Paragraph within a block
public static final int RIL_TEXTLINE = 2; // Line within a paragraph
public static final int RIL_WORD = 3; // Word within a textline
public static final int RIL_SYMBOL = 4; // Symbol/character within a word
```

### Block Type Constants

Identify different types of page layout elements during analysis.

```java { .api }
// Block Type Constants (PolyBlockType)
public static final int PT_UNKNOWN = 0; // Type is not yet known
public static final int PT_FLOWING_TEXT = 1; // Text that lives inside a column
public static final int PT_HEADING_TEXT = 2; // Text that spans more than one column
public static final int PT_PULLOUT_TEXT = 3; // Text in a cross-column pull-out region
public static final int PT_EQUATION = 4; // Partition belonging to an equation region
public static final int PT_INLINE_EQUATION = 5; // Partition has inline equation
public static final int PT_TABLE = 6; // Partition belonging to a table region
public static final int PT_VERTICAL_TEXT = 7; // Text-line runs vertically
public static final int PT_CAPTION_TEXT = 8; // Text that belongs to an image
public static final int PT_FLOWING_IMAGE = 9; // Image that lives inside a column
public static final int PT_HEADING_IMAGE = 10; // Image that spans more than one column
public static final int PT_PULLOUT_IMAGE = 11; // Image in a cross-column pull-out region
public static final int PT_HORZ_LINE = 12; // Horizontal Line
public static final int PT_VERT_LINE = 13; // Vertical Line
public static final int PT_NOISE = 14; // Lies outside of any column
public static final int PT_COUNT = 15; // Total number of block types
```

### Orientation and Direction Constants

Document orientation, writing direction, and text line ordering.

```java { .api }
// Page Orientation Constants
public static final int ORIENTATION_PAGE_UP = 0; // Normal upright page
public static final int ORIENTATION_PAGE_RIGHT = 1; // Page rotated 90° clockwise
public static final int ORIENTATION_PAGE_DOWN = 2; // Page rotated 180°
public static final int ORIENTATION_PAGE_LEFT = 3; // Page rotated 90° counter-clockwise

// Writing Direction Constants
public static final int WRITING_DIRECTION_LEFT_TO_RIGHT = 0; // Left-to-right text (Latin, etc.)
public static final int WRITING_DIRECTION_RIGHT_TO_LEFT = 1; // Right-to-left text (Arabic, Hebrew)
public static final int WRITING_DIRECTION_TOP_TO_BOTTOM = 2; // Top-to-bottom text (Chinese, etc.)

// Text Line Order Constants
public static final int TEXTLINE_ORDER_LEFT_TO_RIGHT = 0; // Lines ordered left-to-right
public static final int TEXTLINE_ORDER_RIGHT_TO_LEFT = 1; // Lines ordered right-to-left
public static final int TEXTLINE_ORDER_TOP_TO_BOTTOM = 2; // Lines ordered top-to-bottom

// Text Justification Constants
public static final int JUSTIFICATION_UNKNOWN = 0; // Justification not determined
public static final int JUSTIFICATION_LEFT = 1; // Left-justified text
public static final int JUSTIFICATION_CENTER = 2; // Center-justified text
public static final int JUSTIFICATION_RIGHT = 3; // Right-justified text

// Script Direction Constants
public static final int DIR_NEUTRAL = 0; // Text contains only neutral characters
public static final int DIR_LEFT_TO_RIGHT = 1; // No right-to-left characters
public static final int DIR_RIGHT_TO_LEFT = 2; // No left-to-right characters
public static final int DIR_MIX = 3; // Mixed left-to-right and right-to-left
```

### Variable Configuration

Fine-tune OCR behavior using Tesseract's extensive variable system.

```java { .api }
/**
 * Set configuration variable
 * @param name Variable name
 * @param value Variable value as string
 * @return true if variable was set successfully
 */
public boolean SetVariable(String name, String value);

/**
 * Set debug-specific variable
 * @param name Debug variable name
 * @param value Variable value as string
 * @return true if variable was set successfully
 */
public boolean SetDebugVariable(String name, String value);

/**
 * Get integer variable value
 * @param name Variable name
 * @param value Output: variable value
 * @return true if variable exists
 */
public boolean GetIntVariable(String name, IntPointer value);

/**
 * Get boolean variable value
 * @param name Variable name
 * @param value Output: variable value
 * @return true if variable exists
 */
public boolean GetBoolVariable(String name, BoolPointer value);

/**
 * Get double variable value
 * @param name Variable name
 * @param value Output: variable value
 * @return true if variable exists
 */
public boolean GetDoubleVariable(String name, DoublePointer value);

/**
 * Get string variable value
 * @param name Variable name
 * @return Variable value or null if not found
 */
public String GetStringVariable(String name);
```

**Variable Configuration Examples:**

```java
TessBaseAPI api = new TessBaseAPI();
if (api.Init(null, "eng") != 0) {
    System.err.println("Could not initialize Tesseract");
    System.exit(1);
}

// Character blacklist (never recognize these characters)
api.SetVariable("tessedit_char_blacklist", "xyz");

// Character whitelist (only recognize specific characters)
api.SetVariable("tessedit_char_whitelist", "0123456789");

// Numeric-only mode
api.SetVariable("classify_bln_numeric_mode", "1");

// Rejection algorithm selection
api.SetVariable("tessedit_reject_mode", "2");

// Preserve spaces in output
api.SetVariable("preserve_interword_spaces", "1");

// Enable/disable dictionary checking.
// Note: Tesseract treats these as init-time variables, so they may need to be
// supplied to Init() instead of being set afterwards.
api.SetVariable("load_system_dawg", "0");  // Disable system dictionary
api.SetVariable("load_freq_dawg", "0");    // Disable frequency dictionary
api.SetVariable("load_unambig_dawg", "0"); // Disable unambiguous dictionary

// Performance tuning
api.SetVariable("tessedit_pageseg_mode", "6");  // Alternative to SetPageSegMode()
api.SetVariable("textord_min_linesize", "2.5"); // Line-size multiplier

// Debug output
api.SetDebugVariable("tessedit_write_images", "1"); // Save debug images
api.SetDebugVariable("textord_debug_tabfind", "1"); // Debug tab-stop finding

// Process image with custom configuration
PIX image = pixRead("numbers-only.png");
api.SetImage(image);
BytePointer text = api.GetUTF8Text();
System.out.println("Numbers: " + text.getString());

// Read back an integer variable (the getter must match the variable's type)
IntPointer pageSegMode = new IntPointer(1);
if (api.GetIntVariable("tessedit_pageseg_mode", pageSegMode)) {
    System.out.println("Page segmentation mode: " + pageSegMode.get());
}

// Release native resources
text.deallocate();
pixDestroy(image);
api.End();
```
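
`SetVariable` returns `false` when a variable could not be set, and a misspelled name is otherwise easy to miss. The sketch below applies a batch of settings and collects the names that were rejected. `VariableApplier` and `applyAll` are hypothetical helpers, and the setter is passed in as a `BiPredicate` so the example runs without Tesseract installed; the `fakeSetter` is a stand-in, not real library behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical helper for illustration; not part of the Tesseract bindings.
final class VariableApplier {
    private VariableApplier() {}

    /**
     * Applies each name/value pair through the given setter and returns the
     * names for which the setter returned false, in insertion order.
     */
    static List<String> applyAll(Map<String, String> settings,
                                 BiPredicate<String, String> setter) {
        List<String> rejected = new ArrayList<>();
        for (Map.Entry<String, String> entry : settings.entrySet()) {
            if (!setter.test(entry.getKey(), entry.getValue())) {
                rejected.add(entry.getKey());
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("tessedit_char_whitelist", "0123456789");
        settings.put("preserve_interword_spaces", "1");
        settings.put("no_such_variable", "1"); // deliberately bogus name

        // Stand-in for the real setter: accepts everything except the bogus name.
        BiPredicate<String, String> fakeSetter =
            (name, value) -> !name.equals("no_such_variable");

        // prints Rejected: [no_such_variable]
        System.out.println("Rejected: " + applyAll(settings, fakeSetter));
    }
}
```

With an initialized `TessBaseAPI`, the method reference `api::SetVariable` should fit the same parameter, assuming the `(String, String)` overload shown in the API listing above.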

### Language Management

Multi-language support and language detection configuration.

```java { .api }
/**
 * Get initialized languages as string
 * @return Language string used in the last successful Init call (e.g. "eng+fra+deu")
 */
public String GetInitLanguagesAsString();

/**
 * Get loaded languages into vector
 * @param langs Output vector to populate with language codes
 */
public void GetLoadedLanguagesAsVector(StringVector langs);

/**
 * Get available languages into vector
 * @param langs Output vector to populate with available language codes
 */
public void GetAvailableLanguagesAsVector(StringVector langs);
```

**Multi-language Examples:**

```java
// Initialize with multiple languages
TessBaseAPI api = new TessBaseAPI();
if (api.Init(null, "eng+fra+deu") != 0) { // English + French + German
    System.err.println("Could not initialize multi-language");
}

// Check what languages are loaded
String loadedLangs = api.GetInitLanguagesAsString();
System.out.println("Loaded languages: " + loadedLangs);

// Get available languages
StringVector availableLangs = new StringVector();
api.GetAvailableLanguagesAsVector(availableLangs);
System.out.println("Available languages:");
for (int i = 0; i < availableLangs.size(); i++) {
    System.out.println(" " + availableLangs.get(i).getString());
}

// Process multilingual document
PIX image = pixRead("multilingual-doc.png");
api.SetImage(image);
BytePointer text = api.GetUTF8Text();
System.out.println("Multilingual text: " + text.getString());

text.deallocate();
pixDestroy(image);
api.End();
```
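
The `"eng+fra+deu"` argument above is a list of language codes joined with `+`. When the codes come from configuration rather than a literal, the string can be assembled and taken apart with plain string operations. This is a small sketch; `LanguageSpec`, `join`, and `split` are hypothetical names, not part of the bindings.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the Tesseract bindings.
final class LanguageSpec {
    private LanguageSpec() {}

    /** Joins language codes into the "a+b+c" form passed to Init. */
    static String join(List<String> codes) {
        if (codes.isEmpty()) {
            throw new IllegalArgumentException("at least one language code is required");
        }
        return String.join("+", codes);
    }

    /** Splits an "a+b+c" language string back into individual codes. */
    static List<String> split(String spec) {
        return Arrays.asList(spec.split("\\+"));
    }

    public static void main(String[] args) {
        String spec = join(Arrays.asList("eng", "fra", "deu"));
        System.out.println(spec);        // prints eng+fra+deu
        System.out.println(split(spec)); // prints [eng, fra, deu]
    }
}
```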

### Testing and Utility Functions

Helper functions for testing configuration modes and engine capabilities.

```java { .api }
// PSM Testing Functions
public static boolean PSM_OSD_ENABLED(int pageseg_mode); // Test if OSD enabled
public static boolean PSM_ORIENTATION_ENABLED(int mode); // Test if orientation detection enabled
public static boolean PSM_COL_FIND_ENABLED(int mode); // Test if column finding enabled
public static boolean PSM_SPARSE(int mode); // Test if sparse mode
public static boolean PSM_BLOCK_FIND_ENABLED(int mode); // Test if block finding enabled
public static boolean PSM_LINE_FIND_ENABLED(int mode); // Test if line finding enabled
public static boolean PSM_WORD_FIND_ENABLED(int mode); // Test if word finding enabled

// PolyBlock Type Testing Functions
public static boolean PTIsLineType(int type); // Test if PolyBlockType is line
public static boolean PTIsImageType(int type); // Test if PolyBlockType is image
public static boolean PTIsTextType(int type); // Test if PolyBlockType is text
public static boolean PTIsPulloutType(int type); // Test if PolyBlockType is pullout
```

**Configuration Testing Examples:**

```java
import static org.bytedeco.tesseract.global.tesseract.*;

// Test if a PSM mode supports specific features
int psm = PSM_AUTO;

if (PSM_OSD_ENABLED(psm)) {
    System.out.println("Orientation and script detection enabled");
}

if (PSM_WORD_FIND_ENABLED(psm)) {
    System.out.println("Word-level analysis enabled");
}

if (PSM_SPARSE(psm)) {
    System.out.println("Sparse text mode enabled");
}

// Test block types during iteration
// (assumes `api` is initialized and an image has been set with SetImage)
PageIterator pi = api.AnalyseLayout();
if (pi != null) { // AnalyseLayout can return null when no layout is found
    pi.Begin();
    do {
        int blockType = pi.BlockType();

        if (PTIsTextType(blockType)) {
            System.out.println("Found text block");
        } else if (PTIsImageType(blockType)) {
            System.out.println("Found image block");
        } else if (PTIsLineType(blockType)) {
            System.out.println("Found line block");
        }
    } while (pi.Next(RIL_BLOCK));
}
```

## Common Configuration Patterns

### Forms and Structured Documents

```java
// Optimize for form processing
api.SetPageSegMode(PSM_SINGLE_BLOCK);
api.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789");
api.SetVariable("preserve_interword_spaces", "1");
```
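
Long whitelist literals like the one above are easy to mistype. One option is to build the string from character ranges. The sketch below assumes a hypothetical `Whitelist.range` helper (not part of the bindings) that returns every character between two endpoints, inclusive.

```java
// Hypothetical helper for illustration; not part of the Tesseract bindings.
final class Whitelist {
    private Whitelist() {}

    /** Returns all characters from first to last, inclusive, as a string. */
    static String range(char first, char last) {
        if (first > last) {
            throw new IllegalArgumentException("first must not be greater than last");
        }
        StringBuilder sb = new StringBuilder();
        // Loop over int so the counter cannot overflow the char range.
        for (int c = first; c <= last; c++) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String upperAndDigits = range('A', 'Z') + range('0', '9');
        System.out.println(upperAndDigits); // prints ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
    }
}
```

The result can then be passed as the value of `tessedit_char_whitelist`.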

### Numbers and Codes

```java
// Optimize for numeric content
api.SetPageSegMode(PSM_SINGLE_LINE);
api.SetVariable("classify_bln_numeric_mode", "1");
api.SetVariable("tessedit_char_whitelist", "0123456789.-");
```
412
413
### Poor Quality Images
414
415
```java
416
// Settings for low-quality scans
417
api.SetVariable("tessedit_reject_mode", "0"); // Don't reject low-confidence words
418
api.SetVariable("textord_min_linesize", "1.0"); // Accept smaller text
419
api.SetVariable("edges_max_children_per_outline", "50"); // More edge detection
420
```

### High Accuracy Mode

```java
// Maximum accuracy (slower processing)
api.SetPageSegMode(PSM_AUTO_OSD); // Full layout analysis with orientation detection
api.SetVariable("tessedit_enable_dict_correction", "1"); // Dictionary correction
api.SetVariable("classify_enable_learning", "1"); // Enable learning
api.SetVariable("classify_enable_adaptive_matcher", "1"); // Adaptive matching
```

### Performance Optimization

```java
// Faster processing (lower accuracy)
api.SetPageSegMode(PSM_SINGLE_BLOCK);
// The load_*_dawg variables are init-time settings and may need to be supplied to Init()
api.SetVariable("load_system_dawg", "0"); // Skip dictionary loading
api.SetVariable("load_freq_dawg", "0");
api.SetVariable("tessedit_enable_dict_correction", "0");
api.SetVariable("classify_enable_learning", "0");
```

## Advanced Configuration Variables

### Text Detection and Layout
- `textord_min_linesize` - Blob-height multiplier for the initial line-size estimate
- `textord_max_noise_size` - Maximum noise blob size
- `edges_max_children_per_outline` - Maximum number of child outlines allowed inside a character outline
- `textord_debug_tabfind` - Tab-stop finding debug output

### Character Recognition
- `classify_bln_numeric_mode` - Numeric-only recognition mode
- `classify_enable_learning` - Enable adaptive learning
- `classify_enable_adaptive_matcher` - Use adaptive matching
- `tessedit_enable_dict_correction` - Dictionary-based correction

### Output Control
- `tessedit_char_blacklist` - Characters never to recognize
- `tessedit_char_whitelist` - Only recognize these characters
- `preserve_interword_spaces` - Maintain spacing in output
- `tessedit_reject_mode` - Rejection algorithm selection

### Debug and Development
- `tessedit_write_images` - Save intermediate processing images
- `tessedit_dump_pageseg_images` - Save page segmentation debug images
- `classify_debug_level` - Character classification debug level