# Core OCR Engine

The `TessBaseAPI` class provides the primary interface for optical character recognition operations. It handles engine initialization, image processing, text recognition, and result extraction with comprehensive configuration options.

## Capabilities

### Engine Initialization

Set up the Tesseract OCR engine with language models and configuration parameters.

```java { .api }
public class TessBaseAPI {
    // Constructor
    public TessBaseAPI();

    // Version Information
    public static String Version();

    // Initialization Methods
    public int Init(String datapath, String language, int oem);
    public int Init(String datapath, String language);
    public void InitForAnalysePage();

    // Cleanup
    public void End();
}
```

**Init Parameters:**
- `datapath`: Path to tessdata directory (null for default location)
- `language`: ISO 639-3 language code (e.g., "eng", "fra", "deu")
- `oem`: OCR Engine Mode (OEM_LSTM_ONLY recommended)

**Return Values:**
- `0`: Success
- `-1`: Initialization failed
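
Because `Init` only reports `-1` on failure, it can help to check for the language files before calling it. The following is a minimal sketch, not part of the bindings: `TessdataCheck` and `missingLanguages` are hypothetical names, and the sketch assumes that each language code maps to a `<code>.traineddata` file directly inside the tessdata directory and that several languages may be combined with `+` (for example `"eng+fra"`).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-flight helper; it does not call Tesseract itself. */
final class TessdataCheck {
    private TessdataCheck() {}

    /**
     * Returns the language codes from a '+'-separated specification
     * that have no matching "code.traineddata" file in tessdataDir.
     */
    static List<String> missingLanguages(Path tessdataDir, String languages) {
        List<String> missing = new ArrayList<>();
        for (String code : languages.split("\\+")) {
            String trimmed = code.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty segments such as a trailing '+'
            }
            Path model = tessdataDir.resolve(trimmed + ".traineddata");
            if (!Files.isRegularFile(model)) {
                missing.add(trimmed);
            }
        }
        return missing;
    }
}
```

An empty result suggests that `Init` should at least find its model files; a non-empty result gives a more specific error message than the bare `-1` return code.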

#### Usage Example

```java
import org.bytedeco.tesseract.TessBaseAPI;
import static org.bytedeco.tesseract.global.tesseract.OEM_LSTM_ONLY;

TessBaseAPI api = new TessBaseAPI();

// Initialize with English language and LSTM engine
int result = api.Init(null, "eng", OEM_LSTM_ONLY);
if (result != 0) {
    System.err.println("Tesseract initialization failed");
    return;
}

// Use API for OCR operations...

// Always cleanup when done
api.End();
```

### Image Input Methods

Provide images to the OCR engine from various sources and formats.

```java { .api }
public class TessBaseAPI {
    // Set image from Leptonica PIX object
    public void SetImage(PIX pix);

    // Set image from raw byte array
    public void SetImage(byte[] imagedata, int width, int height,
                         int bytes_per_pixel, int bytes_per_line);

    // Set rectangular region of interest
    public void SetRectangle(int left, int top, int width, int height);

    // Input image management
    public void SetInputImage(PIX pix);
    public PIX GetInputImage();
    public void SetInputName(String name);
    public String GetInputName();

    // Output configuration
    public void SetOutputName(String name);

    // Resolution metadata
    public void SetSourceResolution(int ppi);
    public int GetSourceYResolution();
}
```

**Image Format Support:**
- **bytes_per_pixel**: 1 (grayscale), 3 (RGB), 4 (RGBA)
- **bytes_per_line**: Row stride including padding
- **Supported formats**: PNG, JPEG, TIFF, BMP, GIF (via Leptonica)
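
The relationship between `width`, `bytes_per_pixel`, and `bytes_per_line` can be checked before a raw buffer is handed to `SetImage`. The sketch below is illustrative only: `RawImageLayout`, `minBytesPerLine`, and `fits` are hypothetical helpers, and the check assumes a row-major buffer in which only the last row may omit its padding.

```java
/** Hypothetical layout checks for raw image buffers; independent of Tesseract. */
final class RawImageLayout {
    private RawImageLayout() {}

    /** Smallest legal row stride: one packed row with no padding. */
    static int minBytesPerLine(int width, int bytesPerPixel) {
        if (width <= 0) {
            throw new IllegalArgumentException("width must be positive");
        }
        if (bytesPerPixel != 1 && bytesPerPixel != 3 && bytesPerPixel != 4) {
            throw new IllegalArgumentException("bytesPerPixel must be 1, 3, or 4");
        }
        return Math.multiplyExact(width, bytesPerPixel);
    }

    /** True if the buffer is large enough for the described image. */
    static boolean fits(byte[] data, int width, int height,
                        int bytesPerPixel, int bytesPerLine) {
        if (height <= 0) {
            return false;
        }
        int minStride = minBytesPerLine(width, bytesPerPixel);
        if (bytesPerLine < minStride) {
            return false; // stride cannot be shorter than one packed row
        }
        // Every row except the last occupies a full stride.
        long required = (long) bytesPerLine * (height - 1) + minStride;
        return data.length >= required;
    }
}
```

For a packed 640-pixel-wide RGB image, `minBytesPerLine(640, 3)` gives 1920, which is the value to pass as `bytes_per_line` when rows carry no padding.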

#### Usage Example

```java
// Method 1: Using Leptonica (recommended)
PIX image = pixRead("/path/to/image.png");
api.SetImage(image);

// Method 2: Using raw byte data
// (loadImageBytes(), width, and height are placeholders for your own image source)
byte[] imageData = loadImageBytes();
api.SetImage(imageData, width, height, 3, width * 3); // packed RGB, no row padding

// Method 3: Process only part of the image
api.SetImage(image);
api.SetRectangle(100, 50, 300, 200); // left, top, width, height
```
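
A rectangle passed to `SetRectangle` should lie inside the image. One way to guarantee that is to clamp the requested region first; the sketch below uses a hypothetical `RegionOfInterest.clamp` helper and is not part of the bindings.

```java
/** Hypothetical helper that clips a requested region to the image bounds. */
final class RegionOfInterest {
    private RegionOfInterest() {}

    /**
     * Returns {left, top, width, height} clipped to the image,
     * or null if the region does not overlap the image at all.
     */
    static int[] clamp(int left, int top, int width, int height,
                       int imageWidth, int imageHeight) {
        int x0 = Math.max(left, 0);
        int y0 = Math.max(top, 0);
        // Use long arithmetic so left + width cannot overflow.
        long x1 = Math.min((long) left + width, imageWidth);
        long y1 = Math.min((long) top + height, imageHeight);
        if (x1 <= x0 || y1 <= y0) {
            return null;
        }
        return new int[] {x0, y0, (int) (x1 - x0), (int) (y1 - y0)};
    }
}
```

The four returned values can then be passed to `SetRectangle` in the same order: left, top, width, height.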

### Text Recognition

Perform OCR recognition and extract text results in various formats.

```java { .api }
public class TessBaseAPI {
    // Full recognition process
    public int Recognize(ETEXT_DESC monitor);

    // Simple rectangle OCR
    public BytePointer TesseractRect(byte[] imagedata, int bytes_per_pixel,
                                     int bytes_per_line, int left, int top,
                                     int width, int height);

    // Text extraction methods
    // (each returns a native string; read it with getString(), then call deallocate())
    public BytePointer GetUTF8Text();
    public BytePointer GetHOCRText(int page_number);
    public BytePointer GetAltoText(int page_number);
    public BytePointer GetTSVText(int page_number);
    public BytePointer GetBoxText(int page_number);
    public BytePointer GetUNLVText();
}
```

**Output Formats:**
- **UTF8**: Plain text with line breaks
- **hOCR**: HTML with word coordinates and confidence
- **ALTO**: XML document structure standard
- **TSV**: Tab-separated values with coordinates
- **Box**: Character coordinates for training
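
Of these formats, TSV is the simplest to post-process in plain Java. The sketch below is a hedged illustration: `TsvWords` is a hypothetical class, and it assumes the usual twelve-column Tesseract layout (`level`, `page_num`, `block_num`, `par_num`, `line_num`, `word_num`, `left`, `top`, `width`, `height`, `conf`, `text`), where rows with `level` 5 describe single words.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser for word rows in Tesseract TSV output. */
final class TsvWords {
    private TsvWords() {}

    /** One recognized word with its bounding box and confidence. */
    static final class Word {
        final int left, top, width, height;
        final float confidence;
        final String text;

        Word(int left, int top, int width, int height, float confidence, String text) {
            this.left = left;
            this.top = top;
            this.width = width;
            this.height = height;
            this.confidence = confidence;
            this.text = text;
        }
    }

    /** Extracts word-level rows; header and non-word rows are skipped. */
    static List<Word> parse(String tsv) {
        List<Word> words = new ArrayList<>();
        for (String line : tsv.split("\\R")) {
            // Limit to 12 fields so the text column is kept intact.
            String[] f = line.split("\t", 12);
            if (f.length < 12 || !f[0].equals("5")) {
                continue;
            }
            words.add(new Word(Integer.parseInt(f[6]), Integer.parseInt(f[7]),
                               Integer.parseInt(f[8]), Integer.parseInt(f[9]),
                               Float.parseFloat(f[10]), f[11]));
        }
        return words;
    }
}
```

The input string would typically come from converting the result of `GetTSVText(0)` to a Java `String`.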

#### Usage Example

```java
// Basic text extraction
api.SetImage(image);
BytePointer textPtr = api.GetUTF8Text();
String text = textPtr.getString();
textPtr.deallocate();
System.out.println("Extracted text: " + text);

// Advanced recognition with monitoring
ETEXT_DESC monitor = new ETEXT_DESC();
monitor.set_deadline_msecs(10000); // 10 second timeout

int result = api.Recognize(monitor);
if (result == 0) {
    BytePointer monitoredPtr = api.GetUTF8Text();
    BytePointer hocrPtr = api.GetHOCRText(0);
    String monitoredText = monitoredPtr.getString();
    String hocr = hocrPtr.getString();
    monitoredPtr.deallocate();
    hocrPtr.deallocate();
}

// Simple one-call OCR for rectangular region
// (imageBytes and width are placeholders for your own packed RGB buffer)
BytePointer rectPtr = api.TesseractRect(imageBytes, 3, width * 3,
                                        100, 50, 300, 200);
String rectText = rectPtr.getString();
rectPtr.deallocate();
```

### Confidence and Quality Metrics

Access recognition confidence scores and quality metrics.

```java { .api }
public class TessBaseAPI {
    // Overall confidence
    public int MeanTextConf();

    // Word-level confidence scores (native array terminated by -1)
    public IntPointer AllWordConfidences();
}
```

**Confidence Values:**
- **Range**: 0-100 (higher values indicate better confidence)
- **Interpretation**:
  - 90-100: Excellent recognition
  - 70-89: Good recognition
  - 50-69: Fair recognition
  - 0-49: Poor recognition
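
The interpretation bands above translate directly into a small lookup. The sketch below uses hypothetical names (`ConfidenceBands`, `Band`, `classify`) and simply encodes the thresholds listed in this section; it is not part of the Tesseract API.

```java
/** Hypothetical mapping from a 0-100 confidence score to the bands listed above. */
final class ConfidenceBands {
    private ConfidenceBands() {}

    enum Band { EXCELLENT, GOOD, FAIR, POOR }

    static Band classify(int confidence) {
        if (confidence < 0 || confidence > 100) {
            throw new IllegalArgumentException("confidence must be in 0-100: " + confidence);
        }
        if (confidence >= 90) {
            return Band.EXCELLENT;
        }
        if (confidence >= 70) {
            return Band.GOOD;
        }
        if (confidence >= 50) {
            return Band.FAIR;
        }
        return Band.POOR;
    }
}
```

A caller might, for example, route pages whose `MeanTextConf()` result classifies as `POOR` to manual review.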

#### Usage Example

```java
api.SetImage(image);
BytePointer textPtr = api.GetUTF8Text();
String text = textPtr.getString();
textPtr.deallocate();

// Check overall confidence
int meanConf = api.MeanTextConf();
System.out.println("Average confidence: " + meanConf + "%");

// Get per-word confidence scores (the native array is terminated by -1)
IntPointer wordConfidences = api.AllWordConfidences();
for (int i = 0; wordConfidences != null && wordConfidences.get(i) != -1; i++) {
    System.out.println("Word " + i + " confidence: " + wordConfidences.get(i) + "%");
}
```

### Image Processing

Access processed images and thresholding results.

```java { .api }
public class TessBaseAPI {
    // Get processed binary image
    public PIX GetThresholdedImage();

    // Datapath information
    public String GetDatapath();
}
```

#### Usage Example

```java
api.SetImage(originalImage);

// Get the binary/thresholded image used for OCR
PIX thresholded = api.GetThresholdedImage();
pixWrite("/tmp/thresholded.png", thresholded, IFF_PNG);

// Cleanup
pixDestroy(thresholded);
```

### Batch Processing

Process multiple pages or documents efficiently.

```java { .api }
public class TessBaseAPI {
    // Process multiple pages with renderer pipeline
    public boolean ProcessPages(String filename, String retry_config,
                                int timeout_millisec, TessResultRenderer renderer);

    // Process single page with renderer
    public boolean ProcessPage(PIX pix, int page_index, String filename,
                               String retry_config, int timeout_millisec,
                               TessResultRenderer renderer);

    // Clear previous results
    public void Clear();
}
```

#### Usage Example

```java
// Setup renderer chain for multiple output formats
TessResultRenderer textRenderer = TessTextRendererCreate("output");
TessResultRenderer pdfRenderer = TessPDFRendererCreate("output", "/usr/share/tessdata", false);
textRenderer.insert(pdfRenderer);

// Process multi-page document
// (the input must be an image file such as a multi-page TIFF; PDF is an output format only)
boolean success = api.ProcessPages("document.tif", null, 60000, textRenderer);

if (success) {
    System.out.println("Document processed successfully");
    // Output files: output.txt, output.pdf
}

// Cleanup renderers
TessDeleteResultRenderer(textRenderer);
```

## Error Handling

### Common Error Conditions

- **Initialization Failure**: Invalid tessdata path or missing language files
- **Image Loading**: Unsupported format or corrupted image data
- **Memory Issues**: Large images or insufficient system memory
- **Timeout**: Recognition takes longer than specified deadline

### Best Practices

```java
import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.leptonica.PIX;
import org.bytedeco.tesseract.TessBaseAPI;

import static org.bytedeco.leptonica.global.leptonica.pixDestroy;
import static org.bytedeco.leptonica.global.leptonica.pixRead;

public class RobustOCR {
    public static String extractText(String imagePath) {
        TessBaseAPI api = new TessBaseAPI();
        PIX image = null;
        BytePointer textPtr = null;
        String result = null;

        try {
            // Initialize with error checking
            if (api.Init(null, "eng") != 0) {
                throw new RuntimeException("Tesseract initialization failed");
            }

            // Load image with validation
            image = pixRead(imagePath);
            if (image == null) {
                throw new RuntimeException("Failed to load image: " + imagePath);
            }

            // Set image and extract text
            api.SetImage(image);
            textPtr = api.GetUTF8Text();
            if (textPtr != null) {
                result = textPtr.getString();
            }

        } finally {
            // Always cleanup resources
            if (textPtr != null) {
                textPtr.deallocate();
            }
            if (image != null) {
                pixDestroy(image);
            }
            api.End();
        }

        return result;
    }
}
```

## Types

### Progress Monitoring

```java { .api }
public class ETEXT_DESC {
    public short progress();           // Progress 0-100
    public boolean more_to_come();     // More work pending
    public boolean ocr_alive();        // Engine is active
    public byte err_code();            // Error code if failed
    public void set_deadline_msecs(int deadline_msecs);
    public boolean deadline_exceeded();
}
```

### Version Information

```java { .api }
// Tesseract version constants
public static final int TESSERACT_MAJOR_VERSION = 5;
public static final int TESSERACT_MINOR_VERSION = 5;
public static final int TESSERACT_MICRO_VERSION = 1;
public static final String TESSERACT_VERSION_STR = "5.5.1";
```
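
When code depends on a minimum Tesseract release, a version string such as the one in `TESSERACT_VERSION_STR` can be compared numerically. The sketch below is an assumption-laden illustration: `TessVersion`, `parse`, and `atLeast` are hypothetical names, and the parser assumes the string starts with `major.minor.micro` and ignores any suffix that follows.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for comparing "major.minor.micro" version strings. */
final class TessVersion {
    private static final Pattern PREFIX = Pattern.compile("^(\\d+)\\.(\\d+)\\.(\\d+)");

    private TessVersion() {}

    /** Parses the leading "major.minor.micro" triple, ignoring any suffix. */
    static int[] parse(String version) {
        Matcher m = PREFIX.matcher(version.trim());
        if (!m.find()) {
            throw new IllegalArgumentException("Unrecognized version: " + version);
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    /** True if the parsed version is greater than or equal to the given triple. */
    static boolean atLeast(String version, int major, int minor, int micro) {
        int[] v = parse(version);
        if (v[0] != major) {
            return v[0] > major;
        }
        if (v[1] != minor) {
            return v[1] > minor;
        }
        return v[2] >= micro;
    }
}
```

For instance, `TessVersion.atLeast("5.5.1", 5, 0, 0)` evaluates to `true`, while the same check against `"4.1.3"` evaluates to `false`.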