Tessl Tile for maven/org.bytedeco/tesseract@5.5.0

# JavaCPP Tesseract

JavaCPP Tesseract provides Java bindings for version 5.5.1 of the Tesseract OCR (Optical Character Recognition) library. It lets Java applications extract text from images through native JNI bindings to the C++ Tesseract library. The package supports multiple output formats, detailed result analysis, and extensive configuration options.

## Package Information

- **Package Name**: org.bytedeco:tesseract
- **Package Type**: Maven
- **Language**: Java
- **Installation**: `<dependency><groupId>org.bytedeco</groupId><artifactId>tesseract-platform</artifactId><version>5.5.1-1.5.12</version></dependency>`

## Core Imports

```java
import org.bytedeco.tesseract.*;
import org.bytedeco.leptonica.*;
import static org.bytedeco.tesseract.global.tesseract.*;
import static org.bytedeco.leptonica.global.leptonica.*;
```

## Basic Usage

```java
import org.bytedeco.javacpp.*;
import org.bytedeco.leptonica.*;
import org.bytedeco.tesseract.*;
import static org.bytedeco.leptonica.global.leptonica.*;
import static org.bytedeco.tesseract.global.tesseract.*;

public class BasicExample {
    public static void main(String[] args) {
        // Initialize Tesseract with the English language data
        TessBaseAPI api = new TessBaseAPI();
        if (api.Init(null, "eng") != 0) {
            System.err.println("Could not initialize Tesseract.");
            System.exit(1);
        }

        // Load the image with Leptonica; pixRead returns null if the file cannot be read
        PIX image = pixRead("image.png");
        if (image == null) {
            System.err.println("Could not read image.png.");
            api.End();
            System.exit(1);
        }
        api.SetImage(image);

        // Get OCR result
        BytePointer text = api.GetUTF8Text();
        System.out.println("OCR output: " + text.getString());

        // Cleanup: shut down the engine, then free the text buffer and the image
        api.End();
        text.deallocate();
        pixDestroy(image);
    }
}
```

50
## Architecture
51

52
JavaCPP Tesseract is built around several key components:
53

54
- **TessBaseAPI**: Main OCR engine interface providing initialization, configuration, and text extraction
55
- **Iterator System**: Hierarchical result navigation (PageIterator, ResultIterator, ChoiceIterator) for detailed analysis
56
- **Renderer Framework**: Output format generators for various formats (Text, hOCR, PDF, TSV, etc.)
57
- **JavaCPP Integration**: Native memory management and JNI bindings with automatic cleanup
58
- **Leptonica Integration**: Image processing capabilities through the Leptonica library
59

60
## Capabilities
61

62
### Basic OCR Operations
63

64
Core text recognition functionality for extracting text from images. Supports multiple languages, page segmentation modes, and confidence scoring.
65

66
```java { .api }
67
public class TessBaseAPI extends Pointer {
68
    public int Init(String datapath, String language);
69
    public void SetImage(PIX pix);
70
    public void SetImage(byte[] imagedata, int width, int height, int bytes_per_pixel, int bytes_per_line);
71
    public BytePointer GetUTF8Text();
72
    public int MeanTextConf();
73
    public void End();
74
}
75
```
76

77
[Basic OCR Operations](./basic-ocr.md)
78

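As a minimal sketch of multi-language recognition with confidence scoring, the class below initializes the engine with two languages and prints the mean confidence (0 to 100) reported by `MeanTextConf()`. The class name, the `languageSpec` helper, and the default file name are illustrative assumptions, and the example assumes that the `eng` and `deu` traineddata files are installed where Tesseract looks for them.

```java
import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.leptonica.PIX;
import org.bytedeco.tesseract.TessBaseAPI;

import static org.bytedeco.leptonica.global.leptonica.pixDestroy;
import static org.bytedeco.leptonica.global.leptonica.pixRead;

public class ConfidenceSketch {
    // Hypothetical helper: joins language codes into the "eng+deu" form accepted by Init().
    static String languageSpec(String... codes) {
        return String.join("+", codes);
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "image.png"; // assumed input file

        TessBaseAPI api = new TessBaseAPI();
        if (api.Init(null, languageSpec("eng", "deu")) != 0) {
            System.err.println("Could not initialize Tesseract.");
            return;
        }

        PIX image = pixRead(path);
        if (image == null) {
            System.err.println("Could not read " + path);
            api.End();
            return;
        }
        api.SetImage(image);

        // GetUTF8Text() runs recognition; MeanTextConf() then reports the average confidence.
        BytePointer text = api.GetUTF8Text();
        if (text != null) {
            System.out.println(text.getString());
            System.out.println("Mean confidence: " + api.MeanTextConf());
            text.deallocate();
        }

        api.End();
        pixDestroy(image);
    }
}
```

A low mean confidence is usually a hint to try a different page segmentation mode or to preprocess the image before recognition.
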
79
### Result Analysis with Iterators
80

81
Detailed analysis of OCR results including word-level confidence, bounding boxes, font information, and hierarchical page structure navigation.
82

83
```java { .api }
84
public class ResultIterator extends LTRResultIterator {
85
    public void Begin();
86
    public boolean Next(int level);
87
    public BytePointer GetUTF8Text(int level);
88
    public float Confidence(int level);
89
    public boolean BoundingBox(int level, IntPointer left, IntPointer top, IntPointer right, IntPointer bottom);
90
}
91

92
public class ChoiceIterator extends Pointer {
93
    public ChoiceIterator(LTRResultIterator result_it);
94
    public boolean Next();
95
    public BytePointer GetUTF8Text();
96
    public float Confidence();
97
}
98
```
99

100
[Result Analysis with Iterators](./iterators.md)
101

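The iterator methods listed above can be combined into a word-by-word walk. The following is a sketch under stated assumptions rather than a reference implementation: it relies on `TessBaseAPI.Recognize(...)`, `TessBaseAPI.GetIterator()`, and the `RIL_WORD` level constant, which are part of the bindings but not listed in this overview, and the class name, `describe` helper, and file name are hypothetical.

```java
import java.util.Locale;

import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.javacpp.IntPointer;
import org.bytedeco.leptonica.PIX;
import org.bytedeco.tesseract.ResultIterator;
import org.bytedeco.tesseract.TessBaseAPI;

import static org.bytedeco.leptonica.global.leptonica.pixDestroy;
import static org.bytedeco.leptonica.global.leptonica.pixRead;
import static org.bytedeco.tesseract.global.tesseract.RIL_WORD;

public class WordWalkSketch {
    // Hypothetical helper: formats one word with its confidence and bounding box.
    static String describe(String word, float conf, int left, int top, int right, int bottom) {
        return String.format(Locale.ROOT, "%s conf=%.1f box=(%d,%d)-(%d,%d)",
                word, conf, left, top, right, bottom);
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "image.png"; // assumed input file
        TessBaseAPI api = new TessBaseAPI();
        if (api.Init(null, "eng") != 0) {
            System.err.println("Could not initialize Tesseract.");
            return;
        }
        PIX image = pixRead(path);
        if (image == null) {
            System.err.println("Could not read " + path);
            api.End();
            return;
        }
        api.SetImage(image);

        // Run recognition explicitly, then ask for an iterator over the results.
        ResultIterator it = api.Recognize(null) == 0 ? api.GetIterator() : null;
        if (it != null) {
            IntPointer left = new IntPointer(1), top = new IntPointer(1);
            IntPointer right = new IntPointer(1), bottom = new IntPointer(1);
            it.Begin();
            do {
                BytePointer word = it.GetUTF8Text(RIL_WORD);
                if (word != null) {
                    if (it.BoundingBox(RIL_WORD, left, top, right, bottom)) {
                        System.out.println(describe(word.getString(), it.Confidence(RIL_WORD),
                                left.get(), top.get(), right.get(), bottom.get()));
                    }
                    word.deallocate(); // each returned text buffer is freed by the caller
                }
            } while (it.Next(RIL_WORD));
        }

        // The iterator refers to engine state, so it is only used before End().
        api.End();
        pixDestroy(image);
    }
}
```
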
102
### Output Format Renderers
103

104
Multi-format output generation including plain text, hOCR HTML, searchable PDF, ALTO XML, and TSV formats for various integration needs.
105

106
```java { .api }
107
public abstract class TessResultRenderer extends Pointer {
108
    public boolean BeginDocument(String title);
109
    public boolean AddImage(TessBaseAPI api);
110
    public boolean EndDocument();
111
}
112

113
public class TessTextRenderer extends TessResultRenderer {
114
    public TessTextRenderer(String outputbase);
115
}
116

117
public class TessPDFRenderer extends TessResultRenderer {
118
    public TessPDFRenderer(String outputbase, String datadir);
119
    public TessPDFRenderer(String outputbase, String datadir, boolean textonly);
120
}
121
```
122

123
[Output Format Renderers](./renderers.md)
124

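One way to drive a renderer by hand is to open a document, recognize each image, add it as a page, and close the document. The sketch below does this with `TessTextRenderer`, which writes to the output base plus a `.txt` extension. The class name, helper names, and file paths are assumptions made for illustration, and `Recognize(...)` is a `TessBaseAPI` method not listed in this overview.

```java
import org.bytedeco.leptonica.PIX;
import org.bytedeco.tesseract.TessBaseAPI;
import org.bytedeco.tesseract.TessTextRenderer;

import static org.bytedeco.leptonica.global.leptonica.pixDestroy;
import static org.bytedeco.leptonica.global.leptonica.pixRead;

public class BatchRenderSketch {
    // Hypothetical helper: the file a text renderer writes for a given output base.
    static String outputFile(String outputBase) {
        return outputBase + ".txt";
    }

    // Renders every readable image as one page and returns the number of pages added.
    static int render(TessBaseAPI api, TessTextRenderer renderer, String title, String... paths) {
        if (!renderer.BeginDocument(title)) {
            return 0;
        }
        int pages = 0;
        for (String path : paths) {
            PIX image = pixRead(path);
            if (image == null) {
                System.err.println("Skipping unreadable image: " + path);
                continue;
            }
            api.SetImage(image);
            // A page is added only if recognition succeeded for this image.
            if (api.Recognize(null) == 0 && renderer.AddImage(api)) {
                pages++;
            }
            pixDestroy(image);
        }
        renderer.EndDocument();
        return pages;
    }

    public static void main(String[] args) {
        TessBaseAPI api = new TessBaseAPI();
        if (api.Init(null, "eng") != 0) {
            System.err.println("Could not initialize Tesseract.");
            return;
        }
        String outputBase = "batch-output"; // assumed output base name
        TessTextRenderer renderer = new TessTextRenderer(outputBase);
        int pages = render(api, renderer, "Batch", args);
        System.out.println(pages + " page(s) written to " + outputFile(outputBase));

        renderer.deallocate(); // release the native renderer so its output file is closed
        api.End();
    }
}
```
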
125
### Configuration and Parameters
126

127
Extensive configuration options including page segmentation modes, OCR engine modes, variable settings, and language management.
128

129
```java { .api }
130
// Page Segmentation Modes
131
public static final int PSM_AUTO = 3;              // Fully automatic page segmentation
132
public static final int PSM_SINGLE_BLOCK = 6;      // Single uniform block of text
133
public static final int PSM_SINGLE_LINE = 7;       // Single text line
134
public static final int PSM_SINGLE_WORD = 8;       // Single word
135

136
// OCR Engine Modes  
137
public static final int OEM_LSTM_ONLY = 1;         // LSTM only
138
public static final int OEM_DEFAULT = 3;           // Default (auto-detect)
139

140
// Configuration Methods
141
public void SetPageSegMode(int mode);
142
public boolean SetVariable(String name, String value);
143
public boolean GetIntVariable(String name, IntPointer value);
144
```
145

146
[Configuration and Parameters](./configuration.md)
147

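To illustrate how these settings fit together, the sketch below picks a page segmentation mode from a hypothetical `Layout` enum and restricts recognition to digits through the `tessedit_char_whitelist` variable. The enum, class, and helper names are assumptions for this example. It also uses the three-argument `Init(datapath, language, oem)` overload, which is not listed above.

```java
import org.bytedeco.tesseract.TessBaseAPI;

import static org.bytedeco.tesseract.global.tesseract.OEM_LSTM_ONLY;
import static org.bytedeco.tesseract.global.tesseract.PSM_AUTO;
import static org.bytedeco.tesseract.global.tesseract.PSM_SINGLE_BLOCK;
import static org.bytedeco.tesseract.global.tesseract.PSM_SINGLE_LINE;
import static org.bytedeco.tesseract.global.tesseract.PSM_SINGLE_WORD;

public class ConfigSketch {
    // Hypothetical description of what the input image contains.
    enum Layout { PAGE, BLOCK, LINE, WORD }

    // Maps the assumed layout to one of the page segmentation modes listed above.
    static int psmFor(Layout layout) {
        switch (layout) {
            case BLOCK: return PSM_SINGLE_BLOCK;
            case LINE:  return PSM_SINGLE_LINE;
            case WORD:  return PSM_SINGLE_WORD;
            default:    return PSM_AUTO;
        }
    }

    // Applies the mode and an optional character whitelist; returns false if a variable is rejected.
    static boolean configure(TessBaseAPI api, Layout layout, String whitelist) {
        api.SetPageSegMode(psmFor(layout));
        return whitelist == null || api.SetVariable("tessedit_char_whitelist", whitelist);
    }

    public static void main(String[] args) {
        TessBaseAPI api = new TessBaseAPI();
        if (api.Init(null, "eng", OEM_LSTM_ONLY) != 0) {
            System.err.println("Could not initialize Tesseract.");
            return;
        }
        // Example: a single line that should contain digits only.
        boolean ok = configure(api, Layout.LINE, "0123456789");
        System.out.println(ok ? "Configuration applied." : "A variable was rejected.");
        api.End();
    }
}
```

Variables are set after `Init()` and before the image is recognized, so `configure` would normally be called between `Init()` and `SetImage()`.
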
148
### Data Structures and Types
149

150
Supporting data types for progress monitoring, character information, Unicode handling, and container classes.
151

152
```java { .api }
153
public class ETEXT_DESC extends Pointer {
154
    public short progress();                        // Progress percentage (0-100)
155
    public byte ocr_alive();                       // OCR alive flag
156
    public void set_deadline_msecs(int deadline_msecs);
157
    public boolean deadline_exceeded();
158
}
159

160
public class UNICHAR extends Pointer {
161
    public UNICHAR(String utf8_str, int len);
162
    public int first_uni();                        // Get first character as UCS-4
163
    public BytePointer utf8_str();                 // Get terminated UTF-8 string
164
}
165

166
public class StringVector extends Pointer {
167
    public StringVector(String... array);
168
    public long size();
169
    public BytePointer get(long i);
170
    public StringVector push_back(String value);
171
}
172
```
173

174
[Data Structures and Types](./data-structures.md)
175

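As a rough sketch of progress monitoring, an `ETEXT_DESC` can be given a deadline and passed to recognition so that a slow page is cut short. This example assumes that `ETEXT_DESC` has a no-argument constructor and that `TessBaseAPI.Recognize(ETEXT_DESC)` accepts the monitor; neither is listed in this overview. The class name, `status` helper, 5000 ms deadline, and file name are illustrative.

```java
import org.bytedeco.leptonica.PIX;
import org.bytedeco.tesseract.ETEXT_DESC;
import org.bytedeco.tesseract.TessBaseAPI;

import static org.bytedeco.leptonica.global.leptonica.pixDestroy;
import static org.bytedeco.leptonica.global.leptonica.pixRead;

public class DeadlineSketch {
    // Hypothetical helper: summarizes the monitor state after recognition.
    static String status(int progress, boolean timedOut) {
        return timedOut ? "stopped at " + progress + "%" : "finished";
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "image.png"; // assumed input file
        TessBaseAPI api = new TessBaseAPI();
        if (api.Init(null, "eng") != 0) {
            System.err.println("Could not initialize Tesseract.");
            return;
        }
        PIX image = pixRead(path);
        if (image == null) {
            System.err.println("Could not read " + path);
            api.End();
            return;
        }
        api.SetImage(image);

        // Assumption: a default-constructed monitor with a 5 second deadline.
        ETEXT_DESC monitor = new ETEXT_DESC();
        monitor.set_deadline_msecs(5000);

        int rc = api.Recognize(monitor);
        System.out.println("Recognize returned " + rc + ", "
                + status(monitor.progress(), monitor.deadline_exceeded()));

        api.End();
        pixDestroy(image);
    }
}
```
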
176
## Common Usage Patterns
177

178
### Simple Text Extraction
179
For basic OCR operations where you just need the text content.
180

181
### Detailed Analysis
182
When you need word-level confidence scores, bounding boxes, or font information.
183

184
### Batch Processing
185
Processing multiple images with consistent output formatting using renderers.
186

187
### Custom Configuration
188
Fine-tuning OCR behavior for specific document types or languages.
189

190
## Error Handling
191

192
The Tesseract library uses return codes and status flags for error handling:
193
- `Init()` returns 0 on success, non-zero on failure
194
- Always call `End()` to cleanup resources
195
- Use `deallocate()` on BytePointer objects to prevent memory leaks
196
- Check iterator validity before navigation

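The rules above can be gathered into a single `try`/`finally` block so that cleanup runs on every path. This is a sketch under stated assumptions, not the library's prescribed pattern: the class name, the `requireZero` helper, and the choice to turn failures into `IllegalStateException` are all illustrative.

```java
import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.leptonica.PIX;
import org.bytedeco.tesseract.TessBaseAPI;

import static org.bytedeco.leptonica.global.leptonica.pixDestroy;
import static org.bytedeco.leptonica.global.leptonica.pixRead;

public class SafeOcrSketch {
    // Hypothetical helper: converts a non-zero return code into an exception.
    static void requireZero(int code, String operation) {
        if (code != 0) {
            throw new IllegalStateException(operation + " failed with code " + code);
        }
    }

    // Recognizes one image and always releases the native resources it acquired.
    static String ocr(String path, String language) {
        TessBaseAPI api = new TessBaseAPI();
        PIX image = null;
        BytePointer text = null;
        try {
            requireZero(api.Init(null, language), "Init");
            image = pixRead(path);
            if (image == null) {
                throw new IllegalStateException("Could not read " + path);
            }
            api.SetImage(image);
            text = api.GetUTF8Text();
            if (text == null) {
                throw new IllegalStateException("Recognition produced no text");
            }
            return text.getString(); // copied into a Java String before the buffer is freed
        } finally {
            if (text != null) {
                text.deallocate();
            }
            api.End();
            if (image != null) {
                pixDestroy(image);
            }
        }
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "image.png"; // assumed input file
        try {
            System.out.println(ocr(path, "eng"));
        } catch (IllegalStateException e) {
            System.err.println("OCR failed: " + e.getMessage());
        }
    }
}
```
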