JavaCPP Presets for Tesseract - Java wrapper library providing JNI bindings to the native Tesseract OCR library version 5.5.1, enabling optical character recognition capabilities in Java applications
npx @tessl/cli install tessl/maven-org-bytedeco--tesseract@5.5.0
JavaCPP Tesseract provides Java bindings for the Tesseract OCR (Optical Character Recognition) library version 5.5.1. It enables Java applications to perform text extraction from images with high accuracy through native JNI bindings to the C++ Tesseract library. The package supports multiple output formats, detailed result analysis, and extensive configuration options.
<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>tesseract-platform</artifactId>
    <version>5.5.1-1.5.12</version>
</dependency>
import org.bytedeco.tesseract.*;
import org.bytedeco.leptonica.*;
import static org.bytedeco.tesseract.global.tesseract.*;
import static org.bytedeco.leptonica.global.leptonica.*;
import org.bytedeco.javacpp.*;
import org.bytedeco.leptonica.*;
import org.bytedeco.tesseract.*;
import static org.bytedeco.leptonica.global.leptonica.*;
import static org.bytedeco.tesseract.global.tesseract.*;
// Initialize Tesseract with the English language data
TessBaseAPI api = new TessBaseAPI();
if (api.Init(null, "eng") != 0) {
    System.err.println("Could not initialize Tesseract.");
    System.exit(1);
}
// Load and process image; pixRead returns null if the file cannot be read
PIX image = pixRead("image.png");
if (image == null) {
    System.err.println("Could not read image.png.");
    api.End();
    System.exit(1);
}
api.SetImage(image);
// Get OCR result
BytePointer text = api.GetUTF8Text();
System.out.println("OCR output: " + text.getString());
// Cleanup
api.End();
text.deallocate();
pixDestroy(image);
JavaCPP Tesseract is built around several key components:
Core text recognition functionality for extracting text from images. Supports multiple languages, page segmentation modes, and confidence scoring.
public class TessBaseAPI extends Pointer {
    public int Init(String datapath, String language);
    public void SetImage(PIX pix);
    public void SetImage(byte[] imagedata, int width, int height, int bytes_per_pixel, int bytes_per_line);
    public BytePointer GetUTF8Text();
    public int MeanTextConf();
    public void End();
}
Detailed analysis of OCR results including word-level confidence, bounding boxes, font information, and hierarchical page structure navigation.
public class ResultIterator extends LTRResultIterator {
    public void Begin();
    public boolean Next(int level);
    public BytePointer GetUTF8Text(int level);
    public float Confidence(int level);
    public boolean BoundingBox(int level, IntPointer left, IntPointer top, IntPointer right, IntPointer bottom);
}
public class ChoiceIterator extends Pointer {
    public ChoiceIterator(LTRResultIterator result_it);
    public boolean Next();
    public BytePointer GetUTF8Text();
    public float Confidence();
}
Multi-format output generation including plain text, hOCR HTML, searchable PDF, ALTO XML, and TSV formats for various integration needs.
public abstract class TessResultRenderer extends Pointer {
    public boolean BeginDocument(String title);
    public boolean AddImage(TessBaseAPI api);
    public boolean EndDocument();
}
public class TessTextRenderer extends TessResultRenderer {
    public TessTextRenderer(String outputbase);
}
public class TessPDFRenderer extends TessResultRenderer {
    public TessPDFRenderer(String outputbase, String datadir);
    public TessPDFRenderer(String outputbase, String datadir, boolean textonly);
}
Extensive configuration options including page segmentation modes, OCR engine modes, variable settings, and language management.
// Page Segmentation Modes
public static final int PSM_AUTO = 3;          // Fully automatic page segmentation
public static final int PSM_SINGLE_BLOCK = 6;  // Single uniform block of text
public static final int PSM_SINGLE_LINE = 7;   // Single text line
public static final int PSM_SINGLE_WORD = 8;   // Single word
// OCR Engine Modes
public static final int OEM_LSTM_ONLY = 1;     // LSTM only
public static final int OEM_DEFAULT = 3;       // Default (auto-detect)
// Configuration Methods
public void SetPageSegMode(int mode);
public boolean SetVariable(String name, String value);
public boolean GetIntVariable(String name, IntPointer value);
Supporting data types for progress monitoring, character information, Unicode handling, and container classes.
public class ETEXT_DESC extends Pointer {
    public short progress();       // Progress percentage (0-100)
    public byte ocr_alive();       // OCR alive flag
    public void set_deadline_msecs(int deadline_msecs);
    public boolean deadline_exceeded();
}
public class UNICHAR extends Pointer {
    public UNICHAR(String utf8_str, int len);
    public int first_uni();        // Get first character as UCS-4
    public BytePointer utf8_str(); // Get terminated UTF-8 string
}
public class StringVector extends Pointer {
    public StringVector(String... array);
    public long size();
    public BytePointer get(long i);
    public StringVector push_back(String value);
}
For basic OCR operations where you just need the text content.
When you need word-level confidence scores, bounding boxes, or font information.
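As a rough sketch of what can be done with word-level results once they are collected, the plain-Java example below filters words by confidence. `OcrWord` and `WordFilter` are hypothetical names, not part of the bindings. The sketch assumes each word's text, confidence, and bounding box have already been copied out of a `ResultIterator` using `GetUTF8Text`, `Confidence`, and `BoundingBox`, and that confidence is reported on a 0 to 100 scale.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for values copied out of a ResultIterator at word level.
final class OcrWord {
    final String text;
    final float confidence;          // assumed to be on a 0-100 scale
    final int left, top, right, bottom;

    OcrWord(String text, float confidence, int left, int top, int right, int bottom) {
        this.text = text;
        this.confidence = confidence;
        this.left = left;
        this.top = top;
        this.right = right;
        this.bottom = bottom;
    }

    // Box dimensions in pixels, derived from the four edges.
    int width()  { return right - left; }
    int height() { return bottom - top; }
}

final class WordFilter {
    // Return the words whose confidence is at least minConfidence, in input order.
    static List<OcrWord> atLeast(List<OcrWord> words, float minConfidence) {
        List<OcrWord> kept = new ArrayList<>();
        for (OcrWord word : words) {
            if (word.confidence >= minConfidence) {
                kept.add(word);
            }
        }
        return kept;
    }
}
```

Copying values into plain Java objects like this also means the native `BytePointer` for each word can be deallocated as soon as its string has been read.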
Processing multiple images with consistent output formatting using renderers.
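When each image in a batch should get its own output file, one possible approach is to derive the renderer's `outputbase` argument from the input file name. The helper below is a sketch only: `OutputNames` is a hypothetical class, and it assumes that each renderer appends its own file extension to the output base it is given.

```java
import java.nio.file.Path;

final class OutputNames {
    // Build an output base (path without extension) for one input image,
    // e.g. "scans/page-001.png" in directory "out" becomes "out/page-001".
    static String outputBase(Path outputDir, Path image) {
        String name = image.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // Keep names such as ".hidden" intact by only stripping when dot > 0.
        String stem = dot > 0 ? name.substring(0, dot) : name;
        return outputDir.resolve(stem).toString();
    }
}
```

The returned string could then be passed to a constructor such as `TessTextRenderer(String outputbase)` or `TessPDFRenderer(String outputbase, String datadir)`.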
Fine-tuning OCR behavior for specific document types or languages.
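One way to keep this kind of tuning readable is to map a description of the input to a page segmentation mode in a single place. In the sketch below, the `SegModes` class and its `Layout` categories are hypothetical; only the numeric values come from the `PSM_*` constants listed above.

```java
final class SegModes {
    // Values mirror the PSM_* constants listed in this document.
    static final int PSM_AUTO = 3;
    static final int PSM_SINGLE_BLOCK = 6;
    static final int PSM_SINGLE_LINE = 7;
    static final int PSM_SINGLE_WORD = 8;

    // Hypothetical categories describing what an input image contains.
    enum Layout { FULL_PAGE, PARAGRAPH, LINE, WORD }

    // Pick a page segmentation mode for the given layout,
    // falling back to fully automatic segmentation.
    static int forLayout(Layout layout) {
        switch (layout) {
            case PARAGRAPH: return PSM_SINGLE_BLOCK;
            case LINE:      return PSM_SINGLE_LINE;
            case WORD:      return PSM_SINGLE_WORD;
            default:        return PSM_AUTO;
        }
    }
}
```

The result is intended to be passed to `SetPageSegMode(int mode)` before recognition, for example `api.SetPageSegMode(SegModes.forLayout(SegModes.Layout.LINE))`.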
The Tesseract library uses return codes and status flags for error handling:
Init() returns 0 on success and a non-zero value on failure.
Call End() to clean up resources.
Call deallocate() on BytePointer objects to prevent memory leaks.
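These conventions can be wrapped in small helpers so that a failed initialization is not silently ignored and cleanup always runs. The sketch below uses only standard Java; `OcrChecks` and both of its methods are hypothetical names and are not provided by the bindings.

```java
final class OcrChecks {
    // Turn a non-zero Init() return code into an exception.
    static void requireInitSuccess(int returnCode, String language) {
        if (returnCode != 0) {
            throw new IllegalStateException(
                "Tesseract Init failed for language '" + language
                + "' (return code " + returnCode + ")");
        }
    }

    // Run body, then always run cleanup, even if body throws.
    static void withCleanup(Runnable body, Runnable cleanup) {
        try {
            body.run();
        } finally {
            cleanup.run();
        }
    }
}
```

With the bindings on the classpath, this could be used as `OcrChecks.requireInitSuccess(api.Init(null, "eng"), "eng")`, with `api::End` passed as the cleanup action.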