# JavaCPP Tesseract

JavaCPP Tesseract provides Java bindings for version 5.5.1 of the Tesseract OCR (Optical Character Recognition) library. It lets Java applications extract text from images through JNI bindings to the native C++ Tesseract library. The package supports multiple output formats, detailed result analysis, and extensive configuration options.

## Package Information

- **Package Name**: org.bytedeco:tesseract
- **Package Type**: Maven
- **Language**: Java
- **Installation**: `<dependency><groupId>org.bytedeco</groupId><artifactId>tesseract-platform</artifactId><version>5.5.1-1.5.12</version></dependency>`

## Core Imports

```java
import org.bytedeco.tesseract.*;
import org.bytedeco.leptonica.*;
import static org.bytedeco.tesseract.global.tesseract.*;
import static org.bytedeco.leptonica.global.leptonica.*;
```

## Basic Usage

```java
import org.bytedeco.javacpp.*;
import org.bytedeco.leptonica.*;
import org.bytedeco.tesseract.*;
import static org.bytedeco.leptonica.global.leptonica.*;
import static org.bytedeco.tesseract.global.tesseract.*;

public class BasicExample {
    public static void main(String[] args) {
        // Initialize Tesseract with English; a null datapath lets Tesseract
        // locate tessdata itself (for example via TESSDATA_PREFIX)
        TessBaseAPI api = new TessBaseAPI();
        if (api.Init(null, "eng") != 0) {
            System.err.println("Could not initialize Tesseract.");
            System.exit(1);
        }

        // Load the image with Leptonica; pixRead returns null if the file cannot be read
        PIX image = pixRead("image.png");
        if (image == null) {
            System.err.println("Could not read image.png.");
            api.End();
            System.exit(1);
        }
        api.SetImage(image);

        // Get OCR result
        BytePointer text = api.GetUTF8Text();
        System.out.println("OCR output: " + text.getString());

        // Cleanup: release the result string, the engine, and the image
        text.deallocate();
        api.End();
        pixDestroy(image);
    }
}
```

## Architecture

JavaCPP Tesseract is built around several key components:

- **TessBaseAPI**: Main OCR engine interface providing initialization, configuration, and text extraction
- **Iterator System**: Hierarchical result navigation (PageIterator, ResultIterator, ChoiceIterator) for detailed analysis
- **Renderer Framework**: Output format generators for various formats (Text, hOCR, PDF, TSV, etc.)
- **JavaCPP Integration**: Native memory management and JNI bindings with automatic cleanup
- **Leptonica Integration**: Image processing capabilities through the Leptonica library

## Capabilities

### Basic OCR Operations

Core text recognition functionality for extracting text from images. Supports multiple languages, page segmentation modes, and confidence scoring.

```java { .api }
public class TessBaseAPI extends Pointer {
    public int Init(String datapath, String language);
    public void SetImage(PIX pix);
    public void SetImage(byte[] imagedata, int width, int height, int bytes_per_pixel, int bytes_per_line);
    public BytePointer GetUTF8Text();
    public int MeanTextConf();
    public void End();
}
```

[Basic OCR Operations](./basic-ocr.md)
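
As a rough illustration of how these calls combine, the sketch below recognizes one image and reports the mean text confidence next to the extracted text. The class name `ConfidenceCheck`, the `isReliable` helper, the threshold of 60, and the default file name `image.png` are illustrative assumptions, not part of the library.

```java
import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.leptonica.PIX;
import org.bytedeco.tesseract.TessBaseAPI;
import static org.bytedeco.leptonica.global.leptonica.pixDestroy;
import static org.bytedeco.leptonica.global.leptonica.pixRead;

class ConfidenceCheck {
    // Hypothetical helper: a page counts as reliable when its mean
    // confidence (0-100) reaches the caller's threshold.
    static boolean isReliable(int meanConfidence, int threshold) {
        return meanConfidence >= threshold;
    }

    public static void main(String[] args) {
        TessBaseAPI api = new TessBaseAPI();
        if (api.Init(null, "eng") != 0) {
            System.err.println("Could not initialize Tesseract.");
            return;
        }

        String path = args.length > 0 ? args[0] : "image.png"; // assumed input file
        PIX image = pixRead(path);
        if (image == null) {
            System.err.println("Could not read " + path);
            api.End();
            return;
        }

        api.SetImage(image);
        BytePointer text = api.GetUTF8Text(); // triggers recognition
        int confidence = api.MeanTextConf();  // valid once recognition has run

        if (text != null) {
            System.out.println("Text: " + text.getString());
            text.deallocate();
        }
        System.out.println("Mean confidence: " + confidence
                + (isReliable(confidence, 60) ? " (accepted)" : " (needs review)"));

        api.End();
        pixDestroy(image);
    }
}
```

A fixed threshold is only a starting point; a suitable value depends on the document type and would normally be tuned against sample scans.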

### Result Analysis with Iterators

Detailed analysis of OCR results including word-level confidence, bounding boxes, font information, and hierarchical page structure navigation.

```java { .api }
public class ResultIterator extends LTRResultIterator {
    public void Begin();
    public boolean Next(int level);
    public BytePointer GetUTF8Text(int level);
    public float Confidence(int level);
    public boolean BoundingBox(int level, IntPointer left, IntPointer top, IntPointer right, IntPointer bottom);
}

public class ChoiceIterator extends Pointer {
    public ChoiceIterator(LTRResultIterator result_it);
    public boolean Next();
    public BytePointer GetUTF8Text();
    public float Confidence();
}
```

[Result Analysis with Iterators](./iterators.md)
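
To make the iterator flow more concrete, here is a minimal sketch that walks the recognized page word by word and collects text, confidence, and bounding box. It assumes the caller has already initialized a `TessBaseAPI` and set an image on it, and it relies on `Recognize` and `GetIterator` from `TessBaseAPI` and the `RIL_WORD` level constant, none of which appear in the listing above. `WordReport`, `Word`, and `lowConfidence` are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.javacpp.IntPointer;
import org.bytedeco.tesseract.ResultIterator;
import org.bytedeco.tesseract.TessBaseAPI;
import static org.bytedeco.tesseract.global.tesseract.RIL_WORD;

class WordReport {
    // Hypothetical value type holding one recognized word.
    static final class Word {
        final String text;
        final float confidence;
        final int left, top, right, bottom;

        Word(String text, float confidence, int left, int top, int right, int bottom) {
            this.text = text;
            this.confidence = confidence;
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }
    }

    // Collects all words from an API instance that already has an image set.
    static List<Word> collect(TessBaseAPI api) {
        List<Word> words = new ArrayList<>();
        if (api.Recognize(null) != 0) {
            return words; // recognition failed, nothing to iterate
        }
        ResultIterator it = api.GetIterator();
        if (it == null) {
            return words; // no results available
        }
        // One native int per coordinate, filled in by BoundingBox.
        IntPointer left = new IntPointer(1), top = new IntPointer(1);
        IntPointer right = new IntPointer(1), bottom = new IntPointer(1);
        do {
            BytePointer text = it.GetUTF8Text(RIL_WORD);
            if (text == null) {
                continue; // empty position, move on to the next word
            }
            if (it.BoundingBox(RIL_WORD, left, top, right, bottom)) {
                words.add(new Word(text.getString(), it.Confidence(RIL_WORD),
                        left.get(), top.get(), right.get(), bottom.get()));
            }
            text.deallocate();
        } while (it.Next(RIL_WORD));
        return words;
    }

    // Keeps only the words whose confidence falls below the threshold.
    static List<Word> lowConfidence(List<Word> words, float threshold) {
        List<Word> result = new ArrayList<>();
        for (Word w : words) {
            if (w.confidence < threshold) {
                result.add(w);
            }
        }
        return result;
    }
}
```

Copying each word into a plain Java object keeps the native strings short-lived, so the rest of the application can filter or sort the results without touching native memory.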

### Output Format Renderers

Multi-format output generation including plain text, hOCR HTML, searchable PDF, ALTO XML, and TSV formats for various integration needs.

```java { .api }
public abstract class TessResultRenderer extends Pointer {
    public boolean BeginDocument(String title);
    public boolean AddImage(TessBaseAPI api);
    public boolean EndDocument();
}

public class TessTextRenderer extends TessResultRenderer {
    public TessTextRenderer(String outputbase);
}

public class TessPDFRenderer extends TessResultRenderer {
    public TessPDFRenderer(String outputbase, String datadir);
    public TessPDFRenderer(String outputbase, String datadir, boolean textonly);
}
```

[Output Format Renderers](./renderers.md)
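
The renderer lifecycle above (begin, add one image per page, end) can be sketched for a small batch as follows. This is one possible arrangement, assuming image paths arrive as command-line arguments and that `Recognize` is called before each `AddImage`; the class name `BatchTextExport`, the output base `batch-output`, and the document title are made up for the example.

```java
import org.bytedeco.leptonica.PIX;
import org.bytedeco.tesseract.TessBaseAPI;
import org.bytedeco.tesseract.TessTextRenderer;
import static org.bytedeco.leptonica.global.leptonica.pixDestroy;
import static org.bytedeco.leptonica.global.leptonica.pixRead;

class BatchTextExport {
    // The text renderer writes to "<outputbase>.txt".
    static String outputFileFor(String outputBase) {
        return outputBase + ".txt";
    }

    public static void main(String[] args) {
        String outputBase = "batch-output"; // hypothetical output name

        TessBaseAPI api = new TessBaseAPI();
        if (api.Init(null, "eng") != 0) {
            System.err.println("Could not initialize Tesseract.");
            return;
        }

        TessTextRenderer renderer = new TessTextRenderer(outputBase);
        if (!renderer.BeginDocument("Batch OCR")) {
            System.err.println("Could not open " + outputFileFor(outputBase));
            renderer.deallocate();
            api.End();
            return;
        }

        for (String path : args) {
            PIX image = pixRead(path);
            if (image == null) {
                System.err.println("Skipping unreadable image: " + path);
                continue;
            }
            api.SetImage(image);
            // Recognize the page, then hand the results to the renderer.
            boolean ok = api.Recognize(null) == 0 && renderer.AddImage(api);
            if (!ok) {
                System.err.println("OCR failed for: " + path);
            }
            pixDestroy(image);
        }

        renderer.EndDocument();
        renderer.deallocate(); // destroys the native renderer, closing its output file
        api.End();
        System.out.println("Wrote " + outputFileFor(outputBase));
    }
}
```

Reusing one `TessBaseAPI` and one renderer for the whole batch avoids reloading language data for every image and keeps all pages in a single output file.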

### Configuration and Parameters

Extensive configuration options including page segmentation modes, OCR engine modes, variable settings, and language management.

```java { .api }
// Page Segmentation Modes
public static final int PSM_AUTO = 3;           // Fully automatic page segmentation
public static final int PSM_SINGLE_BLOCK = 6;   // Single uniform block of text
public static final int PSM_SINGLE_LINE = 7;    // Single text line
public static final int PSM_SINGLE_WORD = 8;    // Single word

// OCR Engine Modes
public static final int OEM_LSTM_ONLY = 1;      // LSTM only
public static final int OEM_DEFAULT = 3;        // Default (auto-detect)

// Configuration Methods
public void SetPageSegMode(int mode);
public boolean SetVariable(String name, String value);
public boolean GetIntVariable(String name, IntPointer value);
```

[Configuration and Parameters](./configuration.md)
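
For illustration, the following sketch configures an engine for reading a single line of digits, such as a serial number field. It assumes the three-argument `Init(datapath, language, oem)` overload, which is not listed above but mirrors the upstream C++ API, and it uses the Tesseract variables `tessedit_char_whitelist` and `tessedit_pageseg_mode`. The class name `SingleLineConfig` and the digit-only use case are hypothetical.

```java
import org.bytedeco.javacpp.IntPointer;
import org.bytedeco.tesseract.TessBaseAPI;
import static org.bytedeco.tesseract.global.tesseract.OEM_LSTM_ONLY;
import static org.bytedeco.tesseract.global.tesseract.PSM_SINGLE_LINE;

class SingleLineConfig {
    // Segmentation mode used by this sketch: treat the image as one text line.
    static final int SEGMENTATION_MODE = PSM_SINGLE_LINE;

    // Creates an engine tuned for a single line of digits.
    static TessBaseAPI create() {
        TessBaseAPI api = new TessBaseAPI();
        // Assumed overload: the third argument selects the OCR engine mode.
        if (api.Init(null, "eng", OEM_LSTM_ONLY) != 0) {
            throw new IllegalStateException("Could not initialize Tesseract.");
        }
        api.SetPageSegMode(SEGMENTATION_MODE);
        // SetVariable returns false when the variable name is unknown.
        if (!api.SetVariable("tessedit_char_whitelist", "0123456789")) {
            System.err.println("Whitelist variable was not accepted.");
        }
        return api;
    }

    public static void main(String[] args) {
        TessBaseAPI api = create();
        // Read the segmentation mode back to confirm the setting took effect.
        IntPointer mode = new IntPointer(1);
        if (api.GetIntVariable("tessedit_pageseg_mode", mode)) {
            System.out.println("Page segmentation mode: " + mode.get());
        }
        api.End();
    }
}
```

Setting variables after `Init` and before `SetImage` keeps the configuration in one place, so the same engine can then be reused for many similar images.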

### Data Structures and Types

Supporting data types for progress monitoring, character information, Unicode handling, and container classes.

```java { .api }
public class ETEXT_DESC extends Pointer {
    public short progress();                    // Progress percentage (0-100)
    public byte ocr_alive();                    // OCR alive flag
    public void set_deadline_msecs(int deadline_msecs);
    public boolean deadline_exceeded();
}

public class UNICHAR extends Pointer {
    public UNICHAR(String utf8_str, int len);
    public int first_uni();                     // Get first character as UCS-4
    public BytePointer utf8_str();              // Get terminated UTF-8 string
}

public class StringVector extends Pointer {
    public StringVector(String... array);
    public long size();
    public BytePointer get(long i);
    public StringVector push_back(String value);
}
```

[Data Structures and Types](./data-structures.md)
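
One way the progress monitor might be used is to put a time limit on recognition. The sketch below is written under two assumptions that the listing above does not confirm: that `ETEXT_DESC` has a no-argument constructor, and that `TessBaseAPI.Recognize` accepts the monitor. The class name `TimedRecognition` and the five-second default are illustrative only.

```java
import org.bytedeco.tesseract.ETEXT_DESC;
import org.bytedeco.tesseract.TessBaseAPI;

class TimedRecognition {
    // Hypothetical default time limit for one page, in milliseconds.
    static final int DEFAULT_DEADLINE_MSECS = 5000;

    // Runs recognition on an API instance that already has an image set.
    // Returns true only if recognition succeeded within the deadline.
    static boolean recognizeWithDeadline(TessBaseAPI api, int deadlineMsecs) {
        ETEXT_DESC monitor = new ETEXT_DESC(); // assumed no-arg constructor
        try {
            monitor.set_deadline_msecs(deadlineMsecs);
            int result = api.Recognize(monitor);
            return result == 0 && !monitor.deadline_exceeded();
        } finally {
            monitor.deallocate(); // release the native monitor object
        }
    }
}
```

A caller would typically invoke `recognizeWithDeadline(api, TimedRecognition.DEFAULT_DEADLINE_MSECS)` after `SetImage` and fall back to a smaller image or a simpler segmentation mode when it returns false.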

## Common Usage Patterns

### Simple Text Extraction
For basic OCR operations where you just need the text content.

### Detailed Analysis
When you need word-level confidence scores, bounding boxes, or font information.

### Batch Processing
Processing multiple images with consistent output formatting using renderers.

### Custom Configuration
Fine-tuning OCR behavior for specific document types or languages.

## Error Handling

The Tesseract library reports errors through return codes and status flags:

- `Init()` returns 0 on success and a non-zero value on failure
- Always call `End()` to release engine resources
- Call `deallocate()` on `BytePointer` results to prevent native memory leaks
- Check that an iterator is non-null before navigating it
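
The rules above can be combined into a single `try`/`finally` pattern, sketched here under the assumption that turning a failed `Init()` into an exception suits the calling application. `SafeOcr`, `requireInitialized`, and `recognize` are hypothetical names, not library API.

```java
import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.leptonica.PIX;
import org.bytedeco.tesseract.TessBaseAPI;
import static org.bytedeco.leptonica.global.leptonica.pixDestroy;
import static org.bytedeco.leptonica.global.leptonica.pixRead;

class SafeOcr {
    // Converts a non-zero Init() return code into an exception.
    static void requireInitialized(int initResult) {
        if (initResult != 0) {
            throw new IllegalStateException("Tesseract Init() failed with code " + initResult);
        }
    }

    // Recognizes one image and releases every native resource, even on failure.
    static String recognize(String imagePath) {
        TessBaseAPI api = new TessBaseAPI();
        PIX image = null;
        BytePointer text = null;
        try {
            requireInitialized(api.Init(null, "eng"));
            image = pixRead(imagePath);
            if (image == null) {
                throw new IllegalArgumentException("Could not read " + imagePath);
            }
            api.SetImage(image);
            text = api.GetUTF8Text();
            return text == null ? "" : text.getString();
        } finally {
            if (text != null) {
                text.deallocate(); // free the native result string
            }
            api.End();             // release engine resources
            if (image != null) {
                pixDestroy(image); // free the Leptonica image
            }
        }
    }
}
```

Keeping all cleanup in one `finally` block means an early return or an exception cannot skip `End()` or leave the image allocated.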