Tessl Tile for maven/org.bytedeco/tesseract-platform@5.5.0

# Core OCR Engine

The `TessBaseAPI` class is the primary interface for optical character recognition. It handles engine initialization, image input, text recognition, and result extraction, and exposes the engine's configuration options.

## Capabilities

### Engine Initialization

Set up the Tesseract OCR engine with language models and configuration parameters.

```java { .api }
public class TessBaseAPI {
    // Constructor
    public TessBaseAPI();

    // Version Information
    public static String Version();

    // Initialization Methods
    public int Init(String datapath, String language, int oem);
    public int Init(String datapath, String language);
    public void InitForAnalysePage();

    // Cleanup
    public void End();
}
```

**Init Parameters:**
- `datapath`: Path to the tessdata directory (`null` for the default location)
- `language`: ISO 639-3 language code (e.g., "eng", "fra", "deu")
- `oem`: OCR Engine Mode (`OEM_LSTM_ONLY` recommended)

**Return Values:**
- `0`: Success
- `-1`: Initialization failed

#### Usage Example

```java
import org.bytedeco.tesseract.TessBaseAPI;
import static org.bytedeco.tesseract.global.tesseract.OEM_LSTM_ONLY;

TessBaseAPI api = new TessBaseAPI();

// Initialize with English language and LSTM engine
int result = api.Init(null, "eng", OEM_LSTM_ONLY);
if (result != 0) {
    System.err.println("Tesseract initialization failed");
    return;
}

// Use API for OCR operations...

// Always clean up when done
api.End();
```

### Image Input Methods

Provide images to the OCR engine from various sources and formats.

```java { .api }
public class TessBaseAPI {
    // Set image from Leptonica PIX object
    public void SetImage(PIX pix);

    // Set image from raw byte array
    public void SetImage(byte[] imagedata, int width, int height,
                        int bytes_per_pixel, int bytes_per_line);

    // Set rectangular region of interest
    public void SetRectangle(int left, int top, int width, int height);

    // Input image management
    public void SetInputImage(PIX pix);
    public PIX GetInputImage();
    public void SetInputName(String name);
    public String GetInputName();

    // Output configuration
    public void SetOutputName(String name);

    // Resolution metadata
    public void SetSourceResolution(int ppi);
    public int GetSourceYResolution();
}
```

**Image Format Support:**
- **bytes_per_pixel**: 1 (grayscale), 3 (RGB), 4 (RGBA)
- **bytes_per_line**: Row stride in bytes, including any padding
- **Supported formats**: PNG, JPEG, TIFF, BMP, GIF (via Leptonica)
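
The relationship between `width`, `bytes_per_pixel`, and `bytes_per_line` can be sanity-checked before a raw buffer is handed to `SetImage`. The sketch below is plain Java with no Tesseract dependency. `RawImageLayout` and its methods are hypothetical names introduced here for illustration, and the check conservatively assumes every row, including the last, occupies a full `bytes_per_line`.

```java
/** Hypothetical helper (not part of the Tesseract API) for checking raw image buffers. */
final class RawImageLayout {
    private RawImageLayout() {}

    /** Smallest possible row stride: pixels packed with no padding. */
    static int packedStride(int width, int bytesPerPixel) {
        return Math.multiplyExact(width, bytesPerPixel);
    }

    /** Row stride rounded up to a multiple of {@code alignment} bytes. */
    static int alignedStride(int width, int bytesPerPixel, int alignment) {
        int packed = packedStride(width, bytesPerPixel);
        return Math.multiplyExact((packed + alignment - 1) / alignment, alignment);
    }

    /** True when the buffer can hold {@code height} rows of {@code bytesPerLine} bytes each. */
    static boolean fits(byte[] data, int width, int height,
                        int bytesPerPixel, int bytesPerLine) {
        if (width <= 0 || height <= 0) {
            return false;
        }
        // Only the pixel sizes listed above are accepted.
        if (bytesPerPixel != 1 && bytesPerPixel != 3 && bytesPerPixel != 4) {
            return false;
        }
        // The stride can never be smaller than one packed row.
        if (bytesPerLine < packedStride(width, bytesPerPixel)) {
            return false;
        }
        return data.length >= (long) bytesPerLine * height;
    }
}
```

Calling `RawImageLayout.fits(...)` before `SetImage` turns a mismatched stride into an early `false` rather than a native read past the end of the buffer.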

#### Usage Example

```java
// Method 1: Using Leptonica (recommended)
PIX image = pixRead("/path/to/image.png");
api.SetImage(image);

// Method 2: Using raw byte data
// loadImageBytes(), width, and height are placeholders for application code
byte[] imageData = loadImageBytes();
api.SetImage(imageData, width, height, 3, width * 3); // packed RGB, no row padding

// Method 3: Process only part of the image (call SetRectangle after SetImage)
api.SetImage(image);
api.SetRectangle(100, 50, 300, 200); // left, top, width, height
```

### Text Recognition

Perform OCR recognition and extract text results in various formats.

```java { .api }
public class TessBaseAPI {
    // Full recognition process
    public int Recognize(ETEXT_DESC monitor);

    // Simple rectangle OCR
    public BytePointer TesseractRect(byte[] imagedata, int bytes_per_pixel,
                                     int bytes_per_line, int left, int top,
                                     int width, int height);

    // Text extraction methods (read with getString(), release with deallocate())
    public BytePointer GetUTF8Text();
    public BytePointer GetHOCRText(int page_number);
    public BytePointer GetAltoText(int page_number);
    public BytePointer GetTSVText(int page_number);
    public BytePointer GetBoxText(int page_number);
    public BytePointer GetUNLVText();
}
```

**Output Formats:**
- **UTF8**: Plain text with line breaks
- **hOCR**: HTML with word coordinates and confidence
- **ALTO**: XML document structure standard
- **TSV**: Tab-separated values with coordinates
- **Box**: Character coordinates for training
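
To make the TSV format concrete, the sketch below parses TSV text into word records using only the Java standard library. It assumes the 12-column layout `level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text`, with word rows marked by `level` 5. `TsvWords` and `Word` are hypothetical names introduced for illustration, and the input is taken as a plain `String` so the example does not depend on the native library.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser (not part of the Tesseract API) for word rows in TSV output. */
final class TsvWords {
    private TsvWords() {}

    /** One recognized word with its bounding box and confidence. */
    static final class Word {
        final int left, top, width, height;
        final double conf;
        final String text;

        Word(int left, int top, int width, int height, double conf, String text) {
            this.left = left;
            this.top = top;
            this.width = width;
            this.height = height;
            this.conf = conf;
            this.text = text;
        }
    }

    /** Returns the word-level rows; header and block/paragraph/line rows are skipped. */
    static List<Word> parse(String tsv) {
        List<Word> words = new ArrayList<>();
        for (String line : tsv.split("\\R")) {
            // Limit -1 keeps a trailing empty text column.
            String[] f = line.split("\t", -1);
            if (f.length < 12 || !f[0].equals("5")) {
                continue; // assumed: level 5 marks a word row
            }
            words.add(new Word(
                    Integer.parseInt(f[6]), Integer.parseInt(f[7]),
                    Integer.parseInt(f[8]), Integer.parseInt(f[9]),
                    Double.parseDouble(f[10]), f[11]));
        }
        return words;
    }
}
```

The confidence column is parsed as a `double` so that both integer and fractional values are accepted.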

#### Usage Example

```java
// Basic text extraction
api.SetImage(image);
BytePointer textPtr = api.GetUTF8Text();
System.out.println("Extracted text: " + textPtr.getString());
textPtr.deallocate();

// Advanced recognition with monitoring
ETEXT_DESC monitor = new ETEXT_DESC();
monitor.set_deadline_msecs(10000); // 10 second timeout

int result = api.Recognize(monitor);
if (result == 0) {
    BytePointer plainText = api.GetUTF8Text();
    BytePointer hocr = api.GetHOCRText(0);
    // ... use plainText.getString() and hocr.getString() ...
    plainText.deallocate();
    hocr.deallocate();
}

// Simple one-call OCR for rectangular region
BytePointer rectText = api.TesseractRect(imageBytes, 3, width * 3,
                                         100, 50, 300, 200);
```

### Confidence and Quality Metrics

Access recognition confidence scores and quality metrics.

```java { .api }
public class TessBaseAPI {
    // Overall confidence
    public int MeanTextConf();

    // Word-level confidence scores (native array terminated by -1)
    public IntPointer AllWordConfidences();
}
```

**Confidence Values:**
- **Range**: 0-100 (higher values indicate better confidence)
- **Interpretation**:
  - 90-100: Excellent recognition
  - 70-89: Good recognition
  - 50-69: Fair recognition
  - 0-49: Poor recognition
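
The interpretation bands above can be expressed as a small lookup, which is handy when filtering or flagging low-quality results. This is a sketch in plain Java. `ConfidenceBands` is a hypothetical name, and the thresholds are simply the ones listed above, not values defined by Tesseract itself.

```java
/** Hypothetical helper (not part of the Tesseract API) mapping a score to the bands above. */
final class ConfidenceBands {
    private ConfidenceBands() {}

    /** Returns "excellent", "good", "fair", or "poor" for a score in 0-100. */
    static String label(int conf) {
        if (conf < 0 || conf > 100) {
            throw new IllegalArgumentException("confidence out of range: " + conf);
        }
        if (conf >= 90) {
            return "excellent";
        }
        if (conf >= 70) {
            return "good";
        }
        if (conf >= 50) {
            return "fair";
        }
        return "poor";
    }
}
```

A value returned by `MeanTextConf()` can be passed straight to `ConfidenceBands.label(...)`.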

#### Usage Example

```java
api.SetImage(image);
BytePointer textPtr = api.GetUTF8Text();
String text = textPtr.getString();
textPtr.deallocate();

// Check overall confidence
int meanConf = api.MeanTextConf();
System.out.println("Average confidence: " + meanConf + "%");

// Get per-word confidence scores; the native array is terminated by -1
IntPointer wordConfidences = api.AllWordConfidences();
for (int i = 0; wordConfidences != null && wordConfidences.get(i) != -1; i++) {
    System.out.println("Word " + i + " confidence: " + wordConfidences.get(i) + "%");
}
```

### Image Processing

Access processed images and thresholding results.

```java { .api }
public class TessBaseAPI {
    // Get processed binary image
    public PIX GetThresholdedImage();

    // Datapath information
    public String GetDatapath();
}
```

#### Usage Example

```java
api.SetImage(originalImage);

// Get the binary/thresholded image used for OCR
PIX thresholded = api.GetThresholdedImage();
pixWrite("/tmp/thresholded.png", thresholded, IFF_PNG);

// Cleanup
pixDestroy(thresholded);
```

### Batch Processing

Process multiple pages or documents efficiently.

```java { .api }
public class TessBaseAPI {
    // Process multiple pages with renderer pipeline
    public boolean ProcessPages(String filename, String retry_config,
                               int timeout_millisec, TessResultRenderer renderer);

    // Process single page with renderer
    public boolean ProcessPage(PIX pix, int page_index, String filename,
                              String retry_config, int timeout_millisec,
                              TessResultRenderer renderer);

    // Clear previous results
    public void Clear();
}
```

#### Usage Example

```java
// Set up a renderer chain for multiple output formats
TessResultRenderer textRenderer = TessTextRendererCreate("output");
TessResultRenderer pdfRenderer = TessPDFRendererCreate("output", "/usr/share/tessdata", false);
textRenderer.insert(pdfRenderer);

// Process a multi-page document. The input must be an image file such as a
// multi-page TIFF; Tesseract writes PDF but does not read it.
boolean success = api.ProcessPages("document.tif", null, 60000, textRenderer);

if (success) {
    System.out.println("Document processed successfully");
    // Output files: output.txt, output.pdf
}

// Clean up renderers
TessDeleteResultRenderer(textRenderer);
```

## Error Handling

### Common Error Conditions

- **Initialization Failure**: Invalid tessdata path or missing language files
- **Image Loading**: Unsupported format or corrupted image data
- **Memory Issues**: Large images or insufficient system memory
- **Timeout**: Recognition takes longer than the specified deadline
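
The first condition, a missing language file, can often be detected before `Init` is called. The sketch below uses only `java.nio.file` and assumes models are stored as `<language>.traineddata` directly inside the tessdata directory, with multiple languages combined as `eng+fra`. `TessdataCheck` is a hypothetical helper introduced here, not part of the Tesseract API.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical pre-flight check (not part of the Tesseract API) for language model files. */
final class TessdataCheck {
    private TessdataCheck() {}

    /**
     * Returns the language codes from a spec such as "eng+fra" that have no
     * matching .traineddata file in the given directory.
     */
    static List<String> missingLanguages(Path tessdataDir, String languageSpec) {
        List<String> missing = new ArrayList<>();
        for (String lang : languageSpec.split("\\+")) {
            if (lang.isEmpty()) {
                continue;
            }
            // Assumed layout: <tessdataDir>/<lang>.traineddata
            Path model = tessdataDir.resolve(lang + ".traineddata");
            if (!Files.isRegularFile(model)) {
                missing.add(lang);
            }
        }
        return missing;
    }
}
```

An empty result means every requested model was found. A non-empty one allows a more specific error message than the bare `-1` returned by `Init`.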

### Best Practices

```java
public class RobustOCR {
    public static String extractText(String imagePath) {
        TessBaseAPI api = new TessBaseAPI();
        PIX image = null;
        BytePointer textPtr = null;

        try {
            // Initialize with error checking
            if (api.Init(null, "eng") != 0) {
                throw new RuntimeException("Tesseract initialization failed");
            }

            // Load image with validation
            image = pixRead(imagePath);
            if (image == null) {
                throw new RuntimeException("Failed to load image: " + imagePath);
            }

            // Set image and extract text
            api.SetImage(image);
            textPtr = api.GetUTF8Text();
            return textPtr == null ? null : textPtr.getString();
        } finally {
            // Always release native resources, even when an exception is thrown
            if (textPtr != null) {
                textPtr.deallocate();
            }
            if (image != null) {
                pixDestroy(image);
            }
            api.End();
        }
    }
}
```

## Types

### Progress Monitoring

```java { .api }
public class ETEXT_DESC {
    public short progress();           // Progress 0-100
    public boolean more_to_come();     // More work pending
    public boolean ocr_alive();        // Engine is active
    public byte err_code();            // Error code if failed
    public void set_deadline_msecs(int deadline_msecs);
    public boolean deadline_exceeded();
}
```

### Version Information

```java { .api }
// Tesseract version constants
public static final int TESSERACT_MAJOR_VERSION = 5;
public static final int TESSERACT_MINOR_VERSION = 5;
public static final int TESSERACT_MICRO_VERSION = 1;
public static final String TESSERACT_VERSION_STR = "5.5.1";
```
