
tessl/maven-org-bytedeco--tesseract

JavaCPP Presets for Tesseract - Java wrapper library providing JNI bindings to the native Tesseract OCR library version 5.5.1, enabling optical character recognition capabilities in Java applications

Workspace: tessl
Visibility: Public
Describes: pkg:maven/org.bytedeco/tesseract@5.5.x

To install, run

npx @tessl/cli install tessl/maven-org-bytedeco--tesseract@5.5.0


JavaCPP Tesseract

JavaCPP Tesseract provides Java bindings for version 5.5.1 of the Tesseract OCR (Optical Character Recognition) library. It lets Java applications extract text from images through native JNI bindings to the C++ Tesseract library. The package supports multiple output formats, detailed result analysis, and extensive configuration options.

Package Information

  • Package Name: org.bytedeco:tesseract
  • Package Type: Maven
  • Language: Java
  • Installation (Maven):
      <dependency>
        <groupId>org.bytedeco</groupId>
        <artifactId>tesseract-platform</artifactId>
        <version>5.5.1-1.5.12</version>
      </dependency>

Core Imports

import org.bytedeco.tesseract.*;
import org.bytedeco.leptonica.*;
import static org.bytedeco.tesseract.global.tesseract.*;
import static org.bytedeco.leptonica.global.leptonica.*;

Basic Usage

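import org.bytedeco.javacpp.*;
import org.bytedeco.leptonica.*;
import org.bytedeco.tesseract.*;
import static org.bytedeco.leptonica.global.leptonica.*;
import static org.bytedeco.tesseract.global.tesseract.*;

public class BasicExample {
    public static void main(String[] args) {
        // Initialize Tesseract for English (null datapath: Tesseract falls back to its default tessdata location)
        TessBaseAPI api = new TessBaseAPI();
        if (api.Init(null, "eng") != 0) {
            System.err.println("Could not initialize Tesseract.");
            System.exit(1);
        }

        // Load the image with Leptonica; pixRead returns null if the file cannot be read
        PIX image = pixRead("image.png");
        if (image == null) {
            System.err.println("Could not read image.png.");
            api.End();
            System.exit(1);
        }
        api.SetImage(image);

        // Run recognition and fetch the result as UTF-8 text
        BytePointer text = api.GetUTF8Text();
        System.out.println("OCR output: " + text.getString());

        // Release native resources
        api.End();
        text.deallocate();
        pixDestroy(image);
    }
}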

Architecture

JavaCPP Tesseract is built around several key components:

  • TessBaseAPI: Main OCR engine interface providing initialization, configuration, and text extraction
  • Iterator System: Hierarchical result navigation (PageIterator, ResultIterator, ChoiceIterator) for detailed analysis
  • Renderer Framework: Output format generators for various formats (Text, hOCR, PDF, TSV, etc.)
  • JavaCPP Integration: Native memory management and JNI bindings with automatic cleanup
  • Leptonica Integration: Image processing capabilities through the Leptonica library

Capabilities

Basic OCR Operations

Core text recognition functionality for extracting text from images. Supports multiple languages, page segmentation modes, and confidence scoring.

public class TessBaseAPI extends Pointer {
    public int Init(String datapath, String language);
    public void SetImage(PIX pix);
    public void SetImage(byte[] imagedata, int width, int height, int bytes_per_pixel, int bytes_per_line);
    public BytePointer GetUTF8Text();
    public int MeanTextConf();
    public void End();
}
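The raw-buffer overload of SetImage needs the pixel layout spelled out explicitly. The sketch below is a hypothetical helper (RawRgbBuffer is not part of the bindings) that packs ARGB integers, such as those returned by BufferedImage.getRGB, into tightly packed 24-bit RGB rows and derives the bytes_per_pixel and bytes_per_line arguments. It assumes that 3 bytes per pixel is interpreted as R, G, B in that order, which is how upstream Tesseract describes 24-bit input; this is worth verifying against your version.

```java
final class RawRgbBuffer {
    // 24-bit RGB: one byte each for red, green, and blue (assumed layout).
    static final int BYTES_PER_PIXEL = 3;

    /** Packs ARGB ints into tightly packed RGB rows, dropping the alpha channel. */
    static byte[] pack(int[] argb, int width, int height) {
        if (argb.length != width * height) {
            throw new IllegalArgumentException("pixel count does not match width * height");
        }
        byte[] out = new byte[width * height * BYTES_PER_PIXEL];
        int o = 0;
        for (int p : argb) {
            out[o++] = (byte) (p >> 16); // red
            out[o++] = (byte) (p >> 8);  // green
            out[o++] = (byte) p;         // blue
        }
        return out;
    }

    /** Row stride for rows without padding. */
    static int bytesPerLine(int width) {
        return width * BYTES_PER_PIXEL;
    }

    public static void main(String[] args) {
        int[] pixels = {0xFF112233, 0xFF445566}; // a 2x1 image
        byte[] data = pack(pixels, 2, 1);
        // With the bindings on the classpath, the call would look like:
        // api.SetImage(data, 2, 1, BYTES_PER_PIXEL, bytesPerLine(2));
        System.out.println(data.length + " bytes, " + bytesPerLine(2) + " per line");
    }
}
```

Keeping the stride calculation next to the packing code makes it harder for the two to drift apart if the pixel format changes later.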


Result Analysis with Iterators

Detailed analysis of OCR results including word-level confidence, bounding boxes, font information, and hierarchical page structure navigation.

public class ResultIterator extends LTRResultIterator {
    public void Begin();
    public boolean Next(int level);
    public BytePointer GetUTF8Text(int level);
    public float Confidence(int level);
    public boolean BoundingBox(int level, IntPointer left, IntPointer top, IntPointer right, IntPointer bottom);
}

public class ChoiceIterator extends Pointer {
    public ChoiceIterator(LTRResultIterator result_it);
    public boolean Next();
    public BytePointer GetUTF8Text();
    public float Confidence();
}
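A common use of the iterator results is flagging words that need manual review. The sketch below is a minimal, library-free illustration: WordResult is a hypothetical holder for the values that GetUTF8Text(level), Confidence(level), and BoundingBox(...) return for one word, and the sample values are made up. It assumes confidence is reported as a percentage, and that the real values would be collected in a do { ... } while (it.Next(level)) loop at word level (for example with the RIL_WORD constant).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WordResult {
    final String text;
    final float confidence;             // percent, as returned by Confidence(level)
    final int left, top, right, bottom; // pixel coordinates from BoundingBox(...)

    WordResult(String text, float confidence, int left, int top, int right, int bottom) {
        this.text = text;
        this.confidence = confidence;
        this.left = left;
        this.top = top;
        this.right = right;
        this.bottom = bottom;
    }

    int width()  { return right - left; }
    int height() { return bottom - top; }

    /** Returns the words whose confidence is below the threshold, keeping their order. */
    static List<WordResult> below(List<WordResult> words, float threshold) {
        List<WordResult> out = new ArrayList<>();
        for (WordResult w : words) {
            if (w.confidence < threshold) {
                out.add(w);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical values standing in for real iterator output.
        List<WordResult> words = Arrays.asList(
            new WordResult("Invoice", 96.1f, 10, 12, 110, 40),
            new WordResult("N0.", 42.5f, 120, 12, 160, 40),
            new WordResult("1234", 88.0f, 170, 12, 230, 40));
        for (WordResult w : below(words, 60f)) {
            // prints: N0. (42.5%) 40x28
            System.out.println(w.text + " (" + w.confidence + "%) " + w.width() + "x" + w.height());
        }
    }
}
```

Copying the values into plain Java objects inside the loop also means the native BytePointer for each word can be deallocated immediately.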


Output Format Renderers

Multi-format output generation including plain text, hOCR HTML, searchable PDF, ALTO XML, and TSV formats for various integration needs.

public abstract class TessResultRenderer extends Pointer {
    public boolean BeginDocument(String title);
    public boolean AddImage(TessBaseAPI api);
    public boolean EndDocument();
}

public class TessTextRenderer extends TessResultRenderer {
    public TessTextRenderer(String outputbase);
}

public class TessPDFRenderer extends TessResultRenderer {
    public TessPDFRenderer(String outputbase, String datadir);
    public TessPDFRenderer(String outputbase, String datadir, boolean textonly);
}
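Each renderer is constructed with an output base rather than a full file name. The sketch below shows one way to derive that base from an input image name and to predict the file a renderer should produce. RendererOutput is a hypothetical helper, and the sketch assumes the upstream behaviour that a renderer appends its own extension to the output base (for example "txt" for the text renderer and "pdf" for the PDF renderer); check the files your version actually writes.

```java
final class RendererOutput {
    /** Strips the last extension from a file name: "page1.png" becomes "page1". */
    static String outputBase(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // dot > 0 keeps names such as ".hidden" intact
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    /** File a renderer is expected to write for the given base (assumed convention). */
    static String expectedFile(String outputBase, String extension) {
        return outputBase + "." + extension;
    }

    public static void main(String[] args) {
        String base = outputBase("page1.png");
        // With the bindings available: new TessTextRenderer(base),
        // new TessPDFRenderer(base, datadir), then BeginDocument,
        // AddImage for each page, and EndDocument.
        System.out.println(expectedFile(base, "txt"));
        System.out.println(expectedFile(base, "pdf"));
    }
}
```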


Configuration and Parameters

Extensive configuration options including page segmentation modes, OCR engine modes, variable settings, and language management.

// Page Segmentation Modes
public static final int PSM_AUTO = 3;              // Fully automatic page segmentation
public static final int PSM_SINGLE_BLOCK = 6;      // Single uniform block of text
public static final int PSM_SINGLE_LINE = 7;       // Single text line
public static final int PSM_SINGLE_WORD = 8;       // Single word

// OCR Engine Modes  
public static final int OEM_LSTM_ONLY = 1;         // LSTM only
public static final int OEM_DEFAULT = 3;           // Default (auto-detect)

// Configuration Methods
public void SetPageSegMode(int mode);
public boolean SetVariable(String name, String value);
public boolean GetIntVariable(String name, IntPointer value);
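Because SetVariable reports failure through its boolean result, it is easy to miss a misspelled variable name. The sketch below applies a batch of settings and collects the names that were rejected. VariableSettings is a hypothetical helper, and the setter is passed in as a BiPredicate so the example runs without native libraries; with the bindings available, api::SetVariable should fit that shape. The variable names tessedit_char_whitelist and preserve_interword_spaces come from upstream Tesseract, while no_such_variable is deliberately invalid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiPredicate;

final class VariableSettings {
    /** Applies each name/value pair through the setter and returns the names it rejected. */
    static List<String> applyAll(Map<String, String> vars, BiPredicate<String, String> setter) {
        List<String> rejected = new ArrayList<>();
        for (Map.Entry<String, String> e : vars.entrySet()) {
            if (!setter.test(e.getKey(), e.getValue())) {
                rejected.add(e.getKey());
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        Map<String, String> vars = new LinkedHashMap<>();
        vars.put("tessedit_char_whitelist", "0123456789");
        vars.put("preserve_interword_spaces", "1");
        vars.put("no_such_variable", "1");

        // Stand-in for api::SetVariable that accepts only the two known names.
        Set<String> known = new HashSet<>(
            Arrays.asList("tessedit_char_whitelist", "preserve_interword_spaces"));
        BiPredicate<String, String> fakeSetter = (name, value) -> known.contains(name);

        // prints: Rejected: [no_such_variable]
        System.out.println("Rejected: " + applyAll(vars, fakeSetter));
    }
}
```

Using a LinkedHashMap keeps the settings in insertion order, which matters if a later variable is meant to override an earlier one.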


Data Structures and Types

Supporting data types for progress monitoring, character information, Unicode handling, and container classes.

public class ETEXT_DESC extends Pointer {
    public short progress();                        // Progress percentage (0-100)
    public byte ocr_alive();                       // OCR alive flag
    public void set_deadline_msecs(int deadline_msecs);
    public boolean deadline_exceeded();
}

public class UNICHAR extends Pointer {
    public UNICHAR(String utf8_str, int len);
    public int first_uni();                        // Get first character as UCS-4
    public BytePointer utf8_str();                 // Get terminated UTF-8 string
}

public class StringVector extends Pointer {
    public StringVector(String... array);
    public long size();
    public BytePointer get(long i);
    public StringVector push_back(String value);
}
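The UNICHAR constructor takes a length alongside the string, and first_uni() returns a code point, so it helps to keep Java's three notions of string size apart. The sketch below uses only the standard library; Utf8Info is a hypothetical helper. It assumes that the len argument counts UTF-8 bytes rather than Java chars, which is how the upstream C++ header describes it, and that first_uni() corresponds to String.codePointAt(0).

```java
import java.nio.charset.StandardCharsets;

final class Utf8Info {
    /** Number of bytes the string occupies when encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** First Unicode code point of a non-empty string. */
    static int firstCodePoint(String s) {
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        String s = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        // prints: 1 char, 2 UTF-8 bytes, U+E9
        System.out.println(s.length() + " char, " + utf8Length(s) + " UTF-8 bytes, U+"
            + Integer.toHexString(firstCodePoint(s)).toUpperCase());
    }
}
```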


Common Usage Patterns

Simple Text Extraction

For basic OCR operations where you just need the text content.

Detailed Analysis

When you need word-level confidence scores, bounding boxes, or font information.

Batch Processing

Processing multiple images with consistent output formatting using renderers.

Custom Configuration

Fine-tuning OCR behavior for specific document types or languages.

Error Handling

The Tesseract library uses return codes and status flags for error handling:

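  • Init() returns 0 on success and a non-zero value on failure
  • Always call End() to release the engine's native resources, including on error paths
  • Call deallocate() on BytePointer objects returned by GetUTF8Text() to prevent native memory leaks
  • Check that an iterator is non-null before navigating it, and use the boolean result of Next() to detect the end of the results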
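The cleanup rules above are easy to break when an exception is thrown between acquiring a resource and releasing it. One possible approach, sketched here with the standard library only, is a small AutoCloseable that runs deferred cleanup actions when a try-with-resources block exits. CleanupStack is a hypothetical helper, not part of the bindings, and releasing the most recently acquired resource first is a design choice of this sketch rather than a documented requirement.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class CleanupStack implements AutoCloseable {
    private final Deque<Runnable> actions = new ArrayDeque<>();

    /** Registers an action to run on close; later registrations run first. */
    void defer(Runnable action) {
        actions.push(action);
    }

    /** Runs every action, even if some throw; rethrows the first failure. */
    @Override
    public void close() {
        RuntimeException first = null;
        while (!actions.isEmpty()) {
            try {
                actions.pop().run();
            } catch (RuntimeException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }

    public static void main(String[] args) {
        try (CleanupStack cleanup = new CleanupStack()) {
            // Stand-ins for api::End, () -> pixDestroy(image), and text::deallocate.
            cleanup.defer(() -> System.out.println("api.End()"));
            cleanup.defer(() -> System.out.println("pixDestroy(image)"));
            cleanup.defer(() -> System.out.println("text.deallocate()"));
            System.out.println("recognizing...");
        }
    }
}
```

Registering each action immediately after the matching acquisition succeeds means a failure halfway through still releases everything acquired so far.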