# Configuration and Parameters

Tesseract exposes an extensive configuration system that gives fine-grained control over OCR behavior: page segmentation modes, OCR engine modes, variable settings, language management, and performance tuning options.

## Capabilities

### Page Segmentation Modes

Control how Tesseract analyzes page layout and identifies text regions.

```java { .api }
// Page Segmentation Mode Constants
public static final int PSM_OSD_ONLY = 0; // Orientation and script detection only
public static final int PSM_AUTO_OSD = 1; // Automatic page segmentation with OSD
public static final int PSM_AUTO_ONLY = 2; // Automatic page segmentation, no OSD or OCR
public static final int PSM_AUTO = 3; // Fully automatic page segmentation, no OSD (default of the tesseract command-line tool)
public static final int PSM_SINGLE_COLUMN = 4; // Single column of text
public static final int PSM_SINGLE_BLOCK_VERT_TEXT = 5; // Single uniform block of vertical text
public static final int PSM_SINGLE_BLOCK = 6; // Single uniform block of text
public static final int PSM_SINGLE_LINE = 7; // Single text line
public static final int PSM_SINGLE_WORD = 8; // Single word
public static final int PSM_CIRCLE_WORD = 9; // Single word in a circle
public static final int PSM_SINGLE_CHAR = 10; // Single character
public static final int PSM_SPARSE_TEXT = 11; // Sparse text in no particular order
public static final int PSM_SPARSE_TEXT_OSD = 12; // Sparse text with OSD
public static final int PSM_RAW_LINE = 13; // Single text line, bypassing Tesseract-specific hacks

/**
 * Set page segmentation mode
 * (when never called, TessBaseAPI itself defaults to PSM_SINGLE_BLOCK)
 * @param mode PSM constant (PSM_AUTO, PSM_SINGLE_BLOCK, etc.)
 */
public void SetPageSegMode(int mode);

/**
 * Get current page segmentation mode
 * @return Current PSM mode
 */
public int GetPageSegMode();
```

**Page Segmentation Example:**

```java
import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.leptonica.PIX;
import org.bytedeco.tesseract.TessBaseAPI;

// Leptonica helpers such as pixRead(); older JavaCPP presets name this class `lept`
import static org.bytedeco.leptonica.global.leptonica.*;
import static org.bytedeco.tesseract.global.tesseract.*;

TessBaseAPI api = new TessBaseAPI();
if (api.Init(null, "eng") != 0) {
    throw new IllegalStateException("Could not initialize Tesseract");
}

// Pick ONE mode per image; a later call overrides an earlier one.
// PSM_AUTO:        automatic layout detection (good for complex documents)
// PSM_SINGLE_WORD: single word (useful for form fields)
// PSM_SINGLE_LINE: single line of text (faster, more accurate for simple cases)
api.SetPageSegMode(PSM_SINGLE_LINE);

PIX image = pixRead("single-line.png");
api.SetImage(image);
BytePointer text = api.GetUTF8Text();
System.out.println("Text: " + text.getString());

// Release native resources
text.deallocate();
pixDestroy(image);
api.End();
```
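
As a rough sketch of how an application might pick a mode, the helper below maps a coarse description of the input to one of the PSM values listed above. `PsmChooser` and `InputKind` are hypothetical names introduced for illustration and are not part of the Tesseract bindings; the constants are copied from the list above so the snippet compiles on its own.

```java
/** Hypothetical helper: maps a coarse input description to a PSM value. */
final class PsmChooser {
    // Values copied from the PSM constants listed above.
    static final int PSM_AUTO = 3;
    static final int PSM_SINGLE_BLOCK = 6;
    static final int PSM_SINGLE_LINE = 7;
    static final int PSM_SINGLE_WORD = 8;
    static final int PSM_SPARSE_TEXT = 11;

    /** Illustrative input categories; an assumption, not a Tesseract concept. */
    enum InputKind { FULL_PAGE, PARAGRAPH, TEXT_LINE, FORM_FIELD, SCATTERED_LABELS }

    /** Returns the PSM value this sketch associates with the given input kind. */
    static int choose(InputKind kind) {
        switch (kind) {
            case PARAGRAPH:        return PSM_SINGLE_BLOCK;
            case TEXT_LINE:        return PSM_SINGLE_LINE;
            case FORM_FIELD:       return PSM_SINGLE_WORD;
            case SCATTERED_LABELS: return PSM_SPARSE_TEXT;
            case FULL_PAGE:
            default:               return PSM_AUTO;
        }
    }
}
```

The returned value would then be passed to `api.SetPageSegMode(...)` before calling `SetImage`.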

### OCR Engine Modes

Select the OCR engine and neural network configuration.

```java { .api }
// OCR Engine Mode Constants
public static final int OEM_TESSERACT_ONLY = 0; // Legacy engine only (needs traineddata that includes legacy models)
public static final int OEM_LSTM_ONLY = 1; // LSTM neural network only (recommended)
public static final int OEM_TESSERACT_LSTM_COMBINED = 2; // Legacy + LSTM engines combined
public static final int OEM_DEFAULT = 3; // Default, based on what is available

/**
 * Initialize with specific OCR engine mode
 * @param datapath Path to tessdata directory
 * @param language Language code
 * @param oem OCR Engine Mode
 * @return 0 on success, -1 on failure
 */
public int Init(String datapath, String language, int oem);
```

**Engine Mode Selection Example:**

```java
// Use LSTM-only engine for best accuracy (recommended)
if (api.Init(null, "eng", OEM_LSTM_ONLY) != 0) {
    System.err.println("Could not initialize with LSTM engine");
}

// Use default engine (auto-detect)
if (api.Init(null, "eng", OEM_DEFAULT) != 0) {
    System.err.println("Could not initialize with default engine");
}

// Note: which engines can be initialized depends on the installed tessdata files
```
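
Because engine availability depends on the installed data files, one possible approach is to try a preferred mode and fall back to another. The sketch below keeps the OCR call abstract so that it runs without the native library: `EngineFallback` and `initWithOem` are hypothetical names, and `initWithOem` stands in for a call such as `oem -> api.Init(null, "eng", oem)`, which returns 0 on success.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical helper: tries engine modes in order of preference. */
final class EngineFallback {
    // Values copied from the OEM constants listed above.
    static final int OEM_LSTM_ONLY = 1;
    static final int OEM_DEFAULT = 3;

    /**
     * Calls the initializer with each mode in turn and returns the first mode
     * for which it reports success (0), or -1 if every attempt fails.
     */
    static int initFirstAvailable(IntUnaryOperator initWithOem, int... modes) {
        for (int mode : modes) {
            if (initWithOem.applyAsInt(mode) == 0) {
                return mode;
            }
        }
        return -1;
    }
}
```

With the real bindings this might be used as `EngineFallback.initFirstAvailable(m -> api.Init(null, "eng", m), OEM_LSTM_ONLY, OEM_DEFAULT)`, treating a result of -1 as a fatal setup error.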

### Page Iterator Level Constants

Control iterator navigation granularity for result analysis.

```java { .api }
// Page Iterator Level Constants
public static final int RIL_BLOCK = 0; // Block of text/image/separator line
public static final int RIL_PARA = 1; // Paragraph within a block
public static final int RIL_TEXTLINE = 2; // Line within a paragraph
public static final int RIL_WORD = 3; // Word within a textline
public static final int RIL_SYMBOL = 4; // Symbol/character within a word
```

### Block Type Constants

Identify different types of page layout elements during analysis.

```java { .api }
// Block Type Constants (PolyBlockType)
public static final int PT_UNKNOWN = 0; // Type is not yet known
public static final int PT_FLOWING_TEXT = 1; // Text that lives inside a column
public static final int PT_HEADING_TEXT = 2; // Text that spans more than one column
public static final int PT_PULLOUT_TEXT = 3; // Text in a cross-column pull-out region
public static final int PT_EQUATION = 4; // Partition belonging to an equation region
public static final int PT_INLINE_EQUATION = 5; // Partition has inline equation
public static final int PT_TABLE = 6; // Partition belonging to a table region
public static final int PT_VERTICAL_TEXT = 7; // Text-line runs vertically
public static final int PT_CAPTION_TEXT = 8; // Text that belongs to an image
public static final int PT_FLOWING_IMAGE = 9; // Image that lives inside a column
public static final int PT_HEADING_IMAGE = 10; // Image that spans more than one column
public static final int PT_PULLOUT_IMAGE = 11; // Image in a cross-column pull-out region
public static final int PT_HORZ_LINE = 12; // Horizontal Line
public static final int PT_VERT_LINE = 13; // Vertical Line
public static final int PT_NOISE = 14; // Lies outside of any column
public static final int PT_COUNT = 15; // Total number of block types
```
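
For reporting purposes it can help to collapse these fifteen values into a few broad groups. The following is a minimal sketch under stated assumptions: `BlockKinds` and `Kind` are hypothetical names, and the grouping (for example, counting tables and inline equations as text while leaving `PT_EQUATION` ungrouped) is an assumption of this example that should be checked against the library's own `PTIsTextType`, `PTIsImageType`, and `PTIsLineType` helpers.

```java
/** Hypothetical helper: groups PolyBlockType values into coarse kinds. */
final class BlockKinds {
    // Values copied from the block type constants listed above.
    static final int PT_FLOWING_TEXT = 1, PT_HEADING_TEXT = 2, PT_PULLOUT_TEXT = 3;
    static final int PT_INLINE_EQUATION = 5, PT_TABLE = 6, PT_VERTICAL_TEXT = 7;
    static final int PT_CAPTION_TEXT = 8;
    static final int PT_FLOWING_IMAGE = 9, PT_HEADING_IMAGE = 10, PT_PULLOUT_IMAGE = 11;
    static final int PT_HORZ_LINE = 12, PT_VERT_LINE = 13;

    enum Kind { TEXT, IMAGE, LINE, OTHER }

    /** Returns the coarse kind this sketch assigns to a block type value. */
    static Kind kindOf(int type) {
        switch (type) {
            case PT_FLOWING_TEXT:
            case PT_HEADING_TEXT:
            case PT_PULLOUT_TEXT:
            case PT_INLINE_EQUATION:
            case PT_TABLE:
            case PT_VERTICAL_TEXT:
            case PT_CAPTION_TEXT:
                return Kind.TEXT;
            case PT_FLOWING_IMAGE:
            case PT_HEADING_IMAGE:
            case PT_PULLOUT_IMAGE:
                return Kind.IMAGE;
            case PT_HORZ_LINE:
            case PT_VERT_LINE:
                return Kind.LINE;
            default:
                // PT_UNKNOWN, PT_EQUATION, PT_NOISE and out-of-range values
                return Kind.OTHER;
        }
    }
}
```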

### Orientation and Direction Constants

Document orientation, writing direction, and text line ordering.

```java { .api }
// Page Orientation Constants
public static final int ORIENTATION_PAGE_UP = 0; // Normal upright page
public static final int ORIENTATION_PAGE_RIGHT = 1; // Page rotated 90° clockwise
public static final int ORIENTATION_PAGE_DOWN = 2; // Page rotated 180°
public static final int ORIENTATION_PAGE_LEFT = 3; // Page rotated 90° counter-clockwise

// Writing Direction Constants
public static final int WRITING_DIRECTION_LEFT_TO_RIGHT = 0; // Left-to-right text (Latin, etc.)
public static final int WRITING_DIRECTION_RIGHT_TO_LEFT = 1; // Right-to-left text (Arabic, Hebrew)
public static final int WRITING_DIRECTION_TOP_TO_BOTTOM = 2; // Top-to-bottom text (Chinese, etc.)

// Text Line Order Constants
public static final int TEXTLINE_ORDER_LEFT_TO_RIGHT = 0; // Lines ordered left-to-right
public static final int TEXTLINE_ORDER_RIGHT_TO_LEFT = 1; // Lines ordered right-to-left
public static final int TEXTLINE_ORDER_TOP_TO_BOTTOM = 2; // Lines ordered top-to-bottom

// Text Justification Constants
public static final int JUSTIFICATION_UNKNOWN = 0; // Justification not determined
public static final int JUSTIFICATION_LEFT = 1; // Left-justified text
public static final int JUSTIFICATION_CENTER = 2; // Center-justified text
public static final int JUSTIFICATION_RIGHT = 3; // Right-justified text

// Script Direction Constants
public static final int DIR_NEUTRAL = 0; // Text contains only neutral characters
public static final int DIR_LEFT_TO_RIGHT = 1; // No right-to-left characters
public static final int DIR_RIGHT_TO_LEFT = 2; // No left-to-right characters
public static final int DIR_MIX = 3; // Mixed left-to-right and right-to-left
```

### Variable Configuration

Fine-tune OCR behavior using Tesseract's extensive variable system.

```java { .api }
/**
 * Set configuration variable
 * Only non-init parameters can be changed this way; init-time parameters must be passed to Init()
 * @param name Variable name
 * @param value Variable value as string
 * @return true if variable was set successfully
 */
public boolean SetVariable(String name, String value);

/**
 * Set debug-specific variable
 * @param name Debug variable name
 * @param value Variable value as string
 * @return true if variable was set successfully
 */
public boolean SetDebugVariable(String name, String value);

/**
 * Get integer variable value
 * @param name Variable name
 * @param value Output: variable value
 * @return true if variable exists
 */
public boolean GetIntVariable(String name, IntPointer value);

/**
 * Get boolean variable value
 * @param name Variable name
 * @param value Output: variable value
 * @return true if variable exists
 */
public boolean GetBoolVariable(String name, BoolPointer value);

/**
 * Get double variable value
 * @param name Variable name
 * @param value Output: variable value
 * @return true if variable exists
 */
public boolean GetDoubleVariable(String name, DoublePointer value);

/**
 * Get string variable value
 * @param name Variable name
 * @return Variable value or null if not found
 */
public String GetStringVariable(String name);
```

**Variable Configuration Examples:**

```java
TessBaseAPI api = new TessBaseAPI();
api.Init(null, "eng");

// Character blacklist (never output these characters)
api.SetVariable("tessedit_char_blacklist", "xyz");

// Character whitelist (only recognize specific characters)
api.SetVariable("tessedit_char_whitelist", "0123456789");

// Numeric-only mode
api.SetVariable("classify_bln_numeric_mode", "1");

// Word rejection algorithm (selects a strategy, not a confidence threshold)
api.SetVariable("tessedit_reject_mode", "2");

// Preserve spaces in output
api.SetVariable("preserve_interword_spaces", "1");

// Enable/disable dictionary checking.
// Note: the load_*_dawg parameters are read while language data is loaded,
// so setting them after Init() may have no effect; pass them to Init() instead.
api.SetVariable("load_system_dawg", "0"); // Disable system dictionary
api.SetVariable("load_freq_dawg", "0"); // Disable frequency dictionary
api.SetVariable("load_unambig_dawg", "0"); // Disable unambiguous dictionary

// Layout tuning
api.SetVariable("tessedit_pageseg_mode", "6"); // Alternative to SetPageSegMode()
api.SetVariable("textord_min_linesize", "2.5"); // Blob-height multiplier for the initial line size

// Debug output
api.SetDebugVariable("tessedit_write_images", "1"); // Save debug images
api.SetDebugVariable("textord_debug_tabfind", "1"); // Debug tab-stop finding

// Process image with custom configuration
PIX image = pixRead("numbers-only.png");
api.SetImage(image);
BytePointer text = api.GetUTF8Text();
System.out.println("Numbers: " + text.getString());

// Check current variable values (tessedit_pageseg_mode is an integer parameter;
// boolean parameters such as classify_bln_numeric_mode need GetBoolVariable)
IntPointer pageSegMode = new IntPointer(1);
if (api.GetIntVariable("tessedit_pageseg_mode", pageSegMode)) {
    System.out.println("Page segmentation mode: " + pageSegMode.get());
}

text.deallocate();
pixDestroy(image);
api.End();
```
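
Since `SetVariable` returns `false` when a variable cannot be set, a misspelled name can fail silently if the return value is ignored. The sketch below shows one way to apply a batch of settings and collect the rejected names. `VariableApplier` is a hypothetical helper, and the `setter` parameter stands in for a call such as `(name, value) -> api.SetVariable(name, value)` so that the example runs without the native library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/** Hypothetical helper: applies variables and reports the ones that were rejected. */
final class VariableApplier {
    /**
     * Passes each name/value pair to the setter and returns the names for
     * which the setter returned false, in the map's iteration order.
     */
    static List<String> applyAll(Map<String, String> variables,
                                 BiPredicate<String, String> setter) {
        List<String> rejected = new ArrayList<>();
        for (Map.Entry<String, String> entry : variables.entrySet()) {
            if (!setter.test(entry.getKey(), entry.getValue())) {
                rejected.add(entry.getKey());
            }
        }
        return rejected;
    }
}
```

Passing a `LinkedHashMap` keeps the settings in a predictable order, and a non-empty result can be logged or turned into an exception during startup.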

### Language Management

Multi-language support and language detection configuration.

```java { .api }
/**
 * Get initialized languages as string
 * @return Language string used in the last successful Init() call (e.g. "eng+fra")
 */
public String GetInitLanguagesAsString();

/**
 * Get loaded languages into vector
 * @param langs Output vector to populate with language codes
 */
public void GetLoadedLanguagesAsVector(StringVector langs);

/**
 * Get available languages into vector
 * @param langs Output vector to populate with available language codes
 */
public void GetAvailableLanguagesAsVector(StringVector langs);
```

**Multi-language Examples:**

```java
// Initialize with multiple languages
TessBaseAPI api = new TessBaseAPI();
if (api.Init(null, "eng+fra+deu") != 0) { // English + French + German
    System.err.println("Could not initialize multi-language");
}

// Check which languages were requested at initialization
String loadedLangs = api.GetInitLanguagesAsString();
System.out.println("Loaded languages: " + loadedLangs);

// Get available languages
StringVector availableLangs = new StringVector();
api.GetAvailableLanguagesAsVector(availableLangs);
System.out.println("Available languages:");
for (int i = 0; i < availableLangs.size(); i++) {
    System.out.println(" " + availableLangs.get(i).getString());
}

// Process multilingual document
PIX image = pixRead("multilingual-doc.png");
api.SetImage(image);
BytePointer text = api.GetUTF8Text();
System.out.println("Multilingual text: " + text.getString());

text.deallocate();
pixDestroy(image);
api.End();
```
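
Languages are combined by joining their codes with `+`, as in `"eng+fra+deu"` above. A minimal sketch of building that string while skipping codes that are not installed might look as follows; `LanguageSpec` is a hypothetical helper, and the `available` set is assumed to have been collected beforehand, for example from `GetAvailableLanguagesAsVector`.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: builds a '+'-separated language string for Init(). */
final class LanguageSpec {
    /**
     * Keeps the requested codes that appear in `available`, drops duplicates
     * while preserving the requested order, and joins the rest with '+'.
     * Returns an empty string when nothing usable remains.
     */
    static String build(List<String> requested, Set<String> available) {
        Set<String> kept = new LinkedHashSet<>();
        for (String code : requested) {
            if (available.contains(code)) {
                kept.add(code);
            }
        }
        return String.join("+", kept);
    }
}
```

An empty result is worth treating as an error before calling `Init`, since there would be no language left to load.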

### Testing and Utility Functions

Helper functions for testing configuration modes and engine capabilities.

```java { .api }
// PSM Testing Functions
public static boolean PSM_OSD_ENABLED(int pageseg_mode); // Test if OSD enabled
public static boolean PSM_ORIENTATION_ENABLED(int mode); // Test if orientation detection enabled
public static boolean PSM_COL_FIND_ENABLED(int mode); // Test if column finding enabled
public static boolean PSM_SPARSE(int mode); // Test if sparse mode
public static boolean PSM_BLOCK_FIND_ENABLED(int mode); // Test if block finding enabled
public static boolean PSM_LINE_FIND_ENABLED(int mode); // Test if line finding enabled
public static boolean PSM_WORD_FIND_ENABLED(int mode); // Test if word finding enabled

// PolyBlock Type Testing Functions
public static boolean PTIsLineType(int type); // Test if PolyBlockType is line
public static boolean PTIsImageType(int type); // Test if PolyBlockType is image
public static boolean PTIsTextType(int type); // Test if PolyBlockType is text
public static boolean PTIsPulloutType(int type); // Test if PolyBlockType is pullout
```

**Configuration Testing Examples:**

```java
import org.bytedeco.tesseract.PageIterator;

import static org.bytedeco.tesseract.global.tesseract.*;

// Test if a PSM mode supports specific features
int psm = PSM_AUTO;

if (PSM_OSD_ENABLED(psm)) {
    System.out.println("Orientation and script detection enabled");
}

if (PSM_WORD_FIND_ENABLED(psm)) {
    System.out.println("Word-level analysis enabled");
}

if (PSM_SPARSE(psm)) {
    System.out.println("Sparse text mode enabled");
}

// Test block types during iteration
// (assumes `api` is initialized and an image has been set)
PageIterator pi = api.AnalyseLayout();
if (pi != null) {
    pi.Begin();
    do {
        int blockType = pi.BlockType();

        if (PTIsTextType(blockType)) {
            System.out.println("Found text block");
        } else if (PTIsImageType(blockType)) {
            System.out.println("Found image block");
        } else if (PTIsLineType(blockType)) {
            System.out.println("Found line block");
        }
    } while (pi.Next(RIL_BLOCK));
}
```
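
The PSM predicates are essentially range checks over the mode numbering. The sketch below re-implements three of them in plain Java to show the idea. `PsmTraits` is a hypothetical class, and the exact ranges follow what the upstream header appears to define, so treat them as an assumption and prefer the bound functions in real code.

```java
/** Hypothetical re-implementation of a few PSM predicates, for illustration only. */
final class PsmTraits {
    // Values copied from the PSM constants listed earlier.
    static final int PSM_AUTO_OSD = 1;
    static final int PSM_SINGLE_LINE = 7;
    static final int PSM_SPARSE_TEXT = 11;
    static final int PSM_SPARSE_TEXT_OSD = 12;

    /** Assumed rule: OSD runs for modes 0 and 1 and for sparse text with OSD. */
    static boolean osdEnabled(int mode) {
        return mode <= PSM_AUTO_OSD || mode == PSM_SPARSE_TEXT_OSD;
    }

    /** Assumed rule: the two sparse-text modes. */
    static boolean sparse(int mode) {
        return mode == PSM_SPARSE_TEXT || mode == PSM_SPARSE_TEXT_OSD;
    }

    /** Assumed rule: modes 1 through 7, plus the sparse-text modes. */
    static boolean wordFindEnabled(int mode) {
        return (mode >= PSM_AUTO_OSD && mode <= PSM_SINGLE_LINE) || sparse(mode);
    }
}
```

Under these assumed rules, `PSM_AUTO` (3) has word finding enabled but neither OSD nor sparse mode, so only the second message in the example above would be printed.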

## Common Configuration Patterns

### Forms and Structured Documents

```java
// Optimize for form processing
api.SetPageSegMode(PSM_SINGLE_BLOCK);
api.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789");
api.SetVariable("preserve_interword_spaces", "1");
```

### Numbers and Codes

```java
// Optimize for numeric content
api.SetPageSegMode(PSM_SINGLE_LINE);
api.SetVariable("classify_bln_numeric_mode", "1");
api.SetVariable("tessedit_char_whitelist", "0123456789.-");
```

### Poor Quality Images

```java
// Settings for low-quality scans
api.SetVariable("tessedit_reject_mode", "0"); // Don't reject low-confidence words
api.SetVariable("textord_min_linesize", "1.0"); // Smaller blob-height multiplier, to accept smaller text
api.SetVariable("edges_max_children_per_outline", "50"); // Allow more child outlines per character outline
```

### High Accuracy Mode

```java
// Maximum accuracy (slower processing)
api.SetPageSegMode(PSM_AUTO_OSD); // Full layout analysis with orientation detection
api.SetVariable("tessedit_enable_dict_correction", "1"); // Dictionary correction
api.SetVariable("classify_enable_learning", "1"); // Enable learning
api.SetVariable("classify_enable_adaptive_matcher", "1"); // Adaptive matching
```

### Performance Optimization

```java
// Faster processing (lower accuracy)
api.SetPageSegMode(PSM_SINGLE_BLOCK);
// Skip dictionary loading (init-time parameters; may need to be passed to Init() to take effect)
api.SetVariable("load_system_dawg", "0");
api.SetVariable("load_freq_dawg", "0");
api.SetVariable("tessedit_enable_dict_correction", "0");
api.SetVariable("classify_enable_learning", "0");
```
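
Patterns like the ones above can be kept in one place as named presets. The following is a minimal sketch, assuming an application-defined `OcrProfile` enum (a hypothetical name, not part of the bindings) that returns variable name/value pairs taken from the snippets in this section.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical presets bundling variables from the patterns in this section. */
enum OcrProfile {
    NUMERIC, FAST;

    /** Returns the variable settings for this profile as an unmodifiable map. */
    Map<String, String> variables() {
        Map<String, String> vars = new LinkedHashMap<>();
        switch (this) {
            case NUMERIC:
                vars.put("classify_bln_numeric_mode", "1");
                vars.put("tessedit_char_whitelist", "0123456789.-");
                break;
            case FAST:
                vars.put("load_system_dawg", "0");
                vars.put("load_freq_dawg", "0");
                vars.put("tessedit_enable_dict_correction", "0");
                vars.put("classify_enable_learning", "0");
                break;
        }
        return Collections.unmodifiableMap(vars);
    }
}
```

Each entry would then be applied with `api.SetVariable(name, value)`, checking the boolean result for names the engine does not accept.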

## Advanced Configuration Variables

### Text Detection and Layout

- `textord_min_linesize` - Blob-height multiplier used for the initial line-size estimate
- `textord_max_noise_size` - Maximum noise blob size
- `edges_max_children_per_outline` - Maximum number of child outlines allowed inside a character outline
- `textord_debug_tabfind` - Tab-stop finding debug level

### Character Recognition

- `classify_bln_numeric_mode` - Numeric-only recognition mode
- `classify_enable_learning` - Enable adaptive learning
- `classify_enable_adaptive_matcher` - Use adaptive matching
- `tessedit_enable_dict_correction` - Dictionary-based correction

### Output Control

- `tessedit_char_blacklist` - Characters to ignore
- `tessedit_char_whitelist` - Only recognize these characters
- `preserve_interword_spaces` - Maintain spacing in output
- `tessedit_reject_mode` - Word rejection strategy

### Debug and Development

- `tessedit_write_images` - Save intermediate processing images
- `tessedit_dump_pageseg_images` - Save page segmentation debug images
- `classify_debug_level` - Character classification debug level