JavaCPP Presets for Tesseract - Java wrapper library providing JNI bindings to the native Tesseract OCR library version 5.5.1, enabling optical character recognition capabilities in Java applications
Extensive configuration system providing fine-grained control over OCR behavior including page segmentation modes, OCR engine modes, variable settings, language management, and performance tuning options.
Control how Tesseract analyzes page layout and identifies text regions.
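When logging the integer returned by GetPageSegMode(), a small lookup from value to constant name can make output easier to read. The sketch below is plain Java with no dependency on the bindings; the class name PsmNames and the method nameOf are hypothetical, and the names simply mirror the PSM constants listed in this section.

```java
final class PsmNames {
    // Index i holds the name of the PSM constant whose value is i.
    private static final String[] NAMES = {
        "PSM_OSD_ONLY", "PSM_AUTO_OSD", "PSM_AUTO_ONLY", "PSM_AUTO",
        "PSM_SINGLE_COLUMN", "PSM_SINGLE_BLOCK_VERT_TEXT", "PSM_SINGLE_BLOCK",
        "PSM_SINGLE_LINE", "PSM_SINGLE_WORD", "PSM_CIRCLE_WORD", "PSM_SINGLE_CHAR",
        "PSM_SPARSE_TEXT", "PSM_SPARSE_TEXT_OSD", "PSM_RAW_LINE"
    };

    private PsmNames() {}

    /** Returns the constant name for a PSM value, or a marker for unknown values. */
    static String nameOf(int mode) {
        return (mode >= 0 && mode < NAMES.length) ? NAMES[mode] : "UNKNOWN(" + mode + ")";
    }

    public static void main(String[] args) {
        System.out.println(nameOf(7));
    }
}
```

With the real binding, a call such as PsmNames.nameOf(api.GetPageSegMode()) would be the intended use.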
// Page Segmentation Mode Constants
public static final int PSM_OSD_ONLY = 0; // Orientation and script detection only
public static final int PSM_AUTO_OSD = 1; // Automatic page segmentation with OSD
public static final int PSM_AUTO_ONLY = 2; // Automatic page segmentation, but no OSD or OCR
public static final int PSM_AUTO = 3; // Fully automatic page segmentation, but no OSD (default of the tesseract command-line tool)
public static final int PSM_SINGLE_COLUMN = 4; // Single column of text
public static final int PSM_SINGLE_BLOCK_VERT_TEXT = 5; // Single uniform block of vertical text
public static final int PSM_SINGLE_BLOCK = 6; // Single uniform block of text
public static final int PSM_SINGLE_LINE = 7; // Single text line
public static final int PSM_SINGLE_WORD = 8; // Single word
public static final int PSM_CIRCLE_WORD = 9; // Single word in a circle
public static final int PSM_SINGLE_CHAR = 10; // Single character
public static final int PSM_SPARSE_TEXT = 11; // Sparse text in no particular order
public static final int PSM_SPARSE_TEXT_OSD = 12; // Sparse text with OSD
public static final int PSM_RAW_LINE = 13; // Raw line, bypass word detection
/**
* Set page segmentation mode
* @param mode PSM constant (PSM_AUTO, PSM_SINGLE_BLOCK, etc.)
*/
public void SetPageSegMode(int mode);
/**
* Get current page segmentation mode
* @return Current PSM mode
*/
public int GetPageSegMode();

Page Segmentation Example:
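import org.bytedeco.javacpp.BytePointer;
import org.bytedeco.leptonica.PIX;
import org.bytedeco.tesseract.TessBaseAPI;
// pixRead/pixDestroy come from the Leptonica presets (older releases expose them under global.lept)
import static org.bytedeco.leptonica.global.leptonica.*;
import static org.bytedeco.tesseract.global.tesseract.*;
TessBaseAPI api = new TessBaseAPI();
if (api.Init(null, "eng") != 0) {
    System.err.println("Could not initialize Tesseract");
}
// Pick ONE mode that matches the input; each call overrides the previous one.
// api.SetPageSegMode(PSM_AUTO);        // automatic layout detection (good for complex documents)
// api.SetPageSegMode(PSM_SINGLE_WORD); // single word (useful for form fields)
api.SetPageSegMode(PSM_SINGLE_LINE);    // single line of text (faster, more accurate for simple cases)
PIX image = pixRead("single-line.png");
api.SetImage(image);
BytePointer text = api.GetUTF8Text();
System.out.println("Text: " + text.getString());
// Release native resources
text.deallocate();
pixDestroy(image);
api.End();

Select the OCR engine and neural network configuration.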
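Because Init reports failure through its return value (0 on success, -1 on failure), one way to handle missing engine data is to try several engine modes in order. The sketch below shows that pattern in plain Java; the Initializer interface and the EngineFallback class are hypothetical stand-ins so the example runs without native libraries, and the OEM values mirror the constants in this section.

```java
import java.util.OptionalInt;

final class EngineFallback {
    static final int OEM_LSTM_ONLY = 1;
    static final int OEM_DEFAULT = 3;

    /** Hypothetical stand-in for TessBaseAPI.Init(datapath, language, oem). */
    interface Initializer {
        int init(String datapath, String language, int oem);
    }

    private EngineFallback() {}

    /** Tries each mode in order and returns the first one whose init call returns 0. */
    static OptionalInt firstWorkingMode(Initializer api, String datapath,
                                        String language, int... modes) {
        for (int oem : modes) {
            if (api.init(datapath, language, oem) == 0) {
                return OptionalInt.of(oem);
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Fake initializer that only succeeds for the default engine.
        Initializer fake = (path, lang, oem) -> oem == OEM_DEFAULT ? 0 : -1;
        System.out.println(firstWorkingMode(fake, null, "eng", OEM_LSTM_ONLY, OEM_DEFAULT));
    }
}
```

With the real binding, the initializer would presumably be a lambda such as (p, l, m) -> api.Init(p, l, m).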
// OCR Engine Mode Constants
public static final int OEM_TESSERACT_ONLY = 0; // Legacy Tesseract only (deprecated)
public static final int OEM_LSTM_ONLY = 1; // LSTM neural network only (recommended)
public static final int OEM_TESSERACT_LSTM_COMBINED = 2; // Combined legacy + LSTM (deprecated)
public static final int OEM_DEFAULT = 3; // Default (auto-detect best available)
/**
* Initialize with specific OCR engine mode
* @param datapath Path to tessdata directory
* @param language Language code
* @param oem OCR Engine Mode
* @return 0 on success, -1 on failure
*/
public int Init(String datapath, String language, int oem);

Engine Mode Selection Example:
TessBaseAPI api = new TessBaseAPI();
// Option 1: LSTM-only engine for best accuracy (recommended)
if (api.Init(null, "eng", OEM_LSTM_ONLY) != 0) {
    System.err.println("Could not initialize with LSTM engine");
}
// Option 2: default engine (auto-detect); calling Init again re-initializes the instance
if (api.Init(null, "eng", OEM_DEFAULT) != 0) {
    System.err.println("Could not initialize with default engine");
}
// Which engines can be initialized depends on the installed tessdata files.

Control iterator navigation granularity for result analysis.
// Page Iterator Level Constants
public static final int RIL_BLOCK = 0; // Block of text/image/separator line
public static final int RIL_PARA = 1; // Paragraph within a block
public static final int RIL_TEXTLINE = 2; // Line within a paragraph
public static final int RIL_WORD = 3; // Word within a textline
public static final int RIL_SYMBOL = 4; // Symbol/character within a word

Identify different types of page layout elements during analysis.
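To illustrate how the block types in this section fall into broader groups, the sketch below classifies a PolyBlockType value as text, image, line, or other. It is plain Java with hypothetical names (BlockKinds, kindOf); the grouping is an assumption based on my reading of the upstream PTIs* helpers and should be checked against the binding before relying on it.

```java
final class BlockKinds {
    // Values mirror the PT_* constants listed in this section.
    static final int PT_FLOWING_TEXT = 1, PT_HEADING_TEXT = 2, PT_PULLOUT_TEXT = 3;
    static final int PT_INLINE_EQUATION = 5, PT_TABLE = 6, PT_VERTICAL_TEXT = 7;
    static final int PT_CAPTION_TEXT = 8;
    static final int PT_FLOWING_IMAGE = 9, PT_HEADING_IMAGE = 10, PT_PULLOUT_IMAGE = 11;
    static final int PT_HORZ_LINE = 12, PT_VERT_LINE = 13;

    private BlockKinds() {}

    /** Returns "text", "image", "line", or "other" for a block type value. */
    static String kindOf(int type) {
        switch (type) {
            case PT_FLOWING_TEXT: case PT_HEADING_TEXT: case PT_PULLOUT_TEXT:
            case PT_INLINE_EQUATION: case PT_TABLE: case PT_VERTICAL_TEXT:
            case PT_CAPTION_TEXT:
                return "text";
            case PT_FLOWING_IMAGE: case PT_HEADING_IMAGE: case PT_PULLOUT_IMAGE:
                return "image";
            case PT_HORZ_LINE: case PT_VERT_LINE:
                return "line";
            default:
                // Assumption: PT_UNKNOWN, PT_EQUATION, PT_NOISE fall outside all three groups.
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(kindOf(PT_TABLE));
    }
}
```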
// Block Type Constants (PolyBlockType)
public static final int PT_UNKNOWN = 0; // Type is not yet known
public static final int PT_FLOWING_TEXT = 1; // Text that lives inside a column
public static final int PT_HEADING_TEXT = 2; // Text that spans more than one column
public static final int PT_PULLOUT_TEXT = 3; // Text in a cross-column pull-out region
public static final int PT_EQUATION = 4; // Partition belonging to an equation region
public static final int PT_INLINE_EQUATION = 5; // Partition has inline equation
public static final int PT_TABLE = 6; // Partition belonging to a table region
public static final int PT_VERTICAL_TEXT = 7; // Text-line runs vertically
public static final int PT_CAPTION_TEXT = 8; // Text that belongs to an image
public static final int PT_FLOWING_IMAGE = 9; // Image that lives inside a column
public static final int PT_HEADING_IMAGE = 10; // Image that spans more than one column
public static final int PT_PULLOUT_IMAGE = 11; // Image in a cross-column pull-out region
public static final int PT_HORZ_LINE = 12; // Horizontal Line
public static final int PT_VERT_LINE = 13; // Vertical Line
public static final int PT_NOISE = 14; // Lies outside of any column
public static final int PT_COUNT = 15; // Total number of block types

Document orientation, writing direction, and text line ordering.
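As a small worked example of the orientation values in this section, the sketch below converts an ORIENTATION_PAGE_* constant into a clockwise rotation in degrees and into the rotation that would bring the page upright again. The class and method names are hypothetical, and the mapping is an assumption that follows the comments on the constants here (RIGHT is 90° clockwise, DOWN is 180°, LEFT is 90° counter-clockwise).

```java
final class OrientationDegrees {
    private OrientationDegrees() {}

    /** Clockwise rotation of the page, in degrees, for orientation values 0..3. */
    static int clockwiseDegrees(int orientation) {
        if (orientation < 0 || orientation > 3) {
            throw new IllegalArgumentException("Unknown orientation: " + orientation);
        }
        return orientation * 90;
    }

    /** Clockwise rotation to apply to the image so that the page ends up upright. */
    static int correctionDegrees(int orientation) {
        return (360 - clockwiseDegrees(orientation)) % 360;
    }

    public static void main(String[] args) {
        for (int o = 0; o <= 3; o++) {
            System.out.println(o + " -> rotated " + clockwiseDegrees(o)
                    + ", correct by " + correctionDegrees(o));
        }
    }
}
```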
// Page Orientation Constants
public static final int ORIENTATION_PAGE_UP = 0; // Normal upright page
public static final int ORIENTATION_PAGE_RIGHT = 1; // Page rotated 90° clockwise
public static final int ORIENTATION_PAGE_DOWN = 2; // Page rotated 180°
public static final int ORIENTATION_PAGE_LEFT = 3; // Page rotated 90° counter-clockwise
// Writing Direction Constants
public static final int WRITING_DIRECTION_LEFT_TO_RIGHT = 0; // Left-to-right text (Latin, etc.)
public static final int WRITING_DIRECTION_RIGHT_TO_LEFT = 1; // Right-to-left text (Arabic, Hebrew)
public static final int WRITING_DIRECTION_TOP_TO_BOTTOM = 2; // Top-to-bottom text (Chinese, etc.)
// Text Line Order Constants
public static final int TEXTLINE_ORDER_LEFT_TO_RIGHT = 0; // Lines ordered left-to-right
public static final int TEXTLINE_ORDER_RIGHT_TO_LEFT = 1; // Lines ordered right-to-left
public static final int TEXTLINE_ORDER_TOP_TO_BOTTOM = 2; // Lines ordered top-to-bottom
// Text Justification Constants
public static final int JUSTIFICATION_UNKNOWN = 0; // Justification not determined
public static final int JUSTIFICATION_LEFT = 1; // Left-justified text
public static final int JUSTIFICATION_CENTER = 2; // Center-justified text
public static final int JUSTIFICATION_RIGHT = 3; // Right-justified text
// Script Direction Constants
public static final int DIR_NEUTRAL = 0; // Text contains only neutral characters
public static final int DIR_LEFT_TO_RIGHT = 1; // No right-to-left characters
public static final int DIR_RIGHT_TO_LEFT = 2; // No left-to-right characters
public static final int DIR_MIX = 3; // Mixed left-to-right and right-to-left

Fine-tune OCR behavior using Tesseract's extensive variable system.
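Since SetVariable returns false when a variable cannot be set, it can help to apply a batch of settings and collect the names that were rejected instead of ignoring the return values. The sketch below shows that idea in plain Java; VariableBatch and applyAll are hypothetical names, and the setter is passed in as a BiPredicate so the example runs without the native library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiPredicate;

final class VariableBatch {
    private VariableBatch() {}

    /** Applies every name/value pair and returns the names the setter rejected. */
    static List<String> applyAll(Map<String, String> settings,
                                 BiPredicate<String, String> setter) {
        List<String> rejected = new ArrayList<>();
        for (Map.Entry<String, String> e : settings.entrySet()) {
            if (!setter.test(e.getKey(), e.getValue())) {
                rejected.add(e.getKey());
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("tessedit_char_whitelist", "0123456789");
        settings.put("preserve_interword_spaces", "1");
        settings.put("no_such_variable", "1"); // deliberately invalid name

        // Fake setter that only accepts two known names (stand-in for the real API).
        Set<String> known = Set.of("tessedit_char_whitelist", "preserve_interword_spaces");
        List<String> rejected = applyAll(settings, (name, value) -> known.contains(name));
        System.out.println("Rejected: " + rejected);
    }
}
```

With the real binding, the setter would presumably be a lambda such as (name, value) -> api.SetVariable(name, value).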
/**
* Set configuration variable
* @param name Variable name
* @param value Variable value as string
* @return true if variable was set successfully
*/
public boolean SetVariable(String name, String value);
/**
* Set debug-specific variable
* @param name Debug variable name
* @param value Variable value as string
* @return true if variable was set successfully
*/
public boolean SetDebugVariable(String name, String value);
/**
* Get integer variable value
* @param name Variable name
* @param value Output: variable value
* @return true if variable exists
*/
public boolean GetIntVariable(String name, IntPointer value);
/**
* Get boolean variable value
* @param name Variable name
* @param value Output: variable value
* @return true if variable exists
*/
public boolean GetBoolVariable(String name, BoolPointer value);
/**
* Get double variable value
* @param name Variable name
* @param value Output: variable value
* @return true if variable exists
*/
public boolean GetDoubleVariable(String name, DoublePointer value);
/**
* Get string variable value
* @param name Variable name
* @return Variable value or null if not found
*/
public String GetStringVariable(String name);

Variable Configuration Examples:
TessBaseAPI api = new TessBaseAPI();
if (api.Init(null, "eng") != 0) {
    System.err.println("Could not initialize Tesseract");
}
// Character blacklist (ignore specific characters)
api.SetVariable("tessedit_char_blacklist", "xyz");
// Character whitelist (only recognize specific characters)
api.SetVariable("tessedit_char_whitelist", "0123456789");
// Numeric-only mode
api.SetVariable("classify_bln_numeric_mode", "1");
// Select the word rejection algorithm
api.SetVariable("tessedit_reject_mode", "2");
// Preserve spaces in output
api.SetVariable("preserve_interword_spaces", "1");
// Enable/disable dictionary checking
// (these are init-time parameters, so they may need to be supplied at Init to take effect)
api.SetVariable("load_system_dawg", "0"); // Disable system dictionary
api.SetVariable("load_freq_dawg", "0"); // Disable frequency dictionary
api.SetVariable("load_unambig_dawg", "0"); // Disable unambiguous dictionary
// Performance tuning
api.SetVariable("tessedit_pageseg_mode", "6"); // Alternative to SetPageSegMode()
api.SetVariable("textord_min_linesize", "2.5"); // Minimum line size
// Debug output
api.SetDebugVariable("tessedit_write_images", "1"); // Save debug images
api.SetDebugVariable("textord_debug_tabfind", "1"); // Debug tab-stop finding
// Process image with custom configuration
PIX image = pixRead("numbers-only.png");
api.SetImage(image);
BytePointer text = api.GetUTF8Text();
System.out.println("Numbers: " + text.getString());
// Check current variable values (classify_bln_numeric_mode is a boolean variable)
BoolPointer numericMode = new BoolPointer(1);
if (api.GetBoolVariable("classify_bln_numeric_mode", numericMode)) {
    System.out.println("Numeric mode: " + numericMode.get());
}
text.deallocate();
pixDestroy(image);
api.End();

Multi-language support and language detection configuration.
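Init takes several languages as a single string joined with "+", for example "eng+fra+deu". The sketch below builds that string from a list and fails early when a requested language is not in a set of available codes. It is plain Java; LanguageSpec and join are hypothetical names, and in real use the available set would presumably be filled from GetAvailableLanguagesAsVector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class LanguageSpec {
    private LanguageSpec() {}

    /** Joins language codes with '+', rejecting any code missing from 'available'. */
    static String join(List<String> wanted, Set<String> available) {
        if (wanted.isEmpty()) {
            throw new IllegalArgumentException("At least one language is required");
        }
        List<String> missing = new ArrayList<>();
        for (String lang : wanted) {
            if (!available.contains(lang)) {
                missing.add(lang);
            }
        }
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException("No traineddata found for: " + missing);
        }
        return String.join("+", wanted);
    }

    public static void main(String[] args) {
        Set<String> available = Set.of("eng", "fra", "deu", "osd");
        System.out.println(join(List.of("eng", "fra", "deu"), available));
    }
}
```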
/**
* Get initialized languages as string
* @return Comma-separated list of initialized languages
*/
public String GetInitLanguagesAsString();
/**
* Get loaded languages into vector
* @param langs Output vector to populate with language codes
*/
public void GetLoadedLanguagesAsVector(StringVector langs);
/**
* Get available languages into vector
* @param langs Output vector to populate with available language codes
*/
public void GetAvailableLanguagesAsVector(StringVector langs);

Multi-language Examples:
// Initialize with multiple languages
TessBaseAPI api = new TessBaseAPI();
if (api.Init(null, "eng+fra+deu") != 0) { // English + French + German
    System.err.println("Could not initialize multi-language");
}
// Check what languages are loaded
String loadedLangs = api.GetInitLanguagesAsString();
System.out.println("Loaded languages: " + loadedLangs);
// Get available languages
StringVector availableLangs = new StringVector();
api.GetAvailableLanguagesAsVector(availableLangs);
System.out.println("Available languages:");
for (int i = 0; i < availableLangs.size(); i++) {
    System.out.println(" " + availableLangs.get(i).getString());
}
// Process multilingual document
PIX image = pixRead("multilingual-doc.png");
api.SetImage(image);
BytePointer text = api.GetUTF8Text();
System.out.println("Multilingual text: " + text.getString());
text.deallocate();
pixDestroy(image);
api.End();

Helper functions for testing configuration modes and engine capabilities.
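To make concrete what such mode tests check, the sketch below re-implements three of them in plain Java. The class PsmPredicates and its method names are hypothetical, and the conditions are an assumption based on my reading of the upstream Tesseract header rather than something this page states; prefer the binding's own PSM_* functions in real code.

```java
final class PsmPredicates {
    // Values mirror the PSM constants in the page segmentation section.
    static final int PSM_AUTO_OSD = 1;
    static final int PSM_AUTO = 3;
    static final int PSM_SPARSE_TEXT = 11;
    static final int PSM_SPARSE_TEXT_OSD = 12;

    private PsmPredicates() {}

    /** True for the two sparse-text modes. */
    static boolean sparse(int mode) {
        return mode == PSM_SPARSE_TEXT || mode == PSM_SPARSE_TEXT_OSD;
    }

    /** True when orientation and script detection runs (assumed rule). */
    static boolean osdEnabled(int mode) {
        return mode <= PSM_AUTO_OSD || mode == PSM_SPARSE_TEXT_OSD;
    }

    /** True when orientation detection runs (assumed rule). */
    static boolean orientationEnabled(int mode) {
        return mode <= PSM_AUTO || sparse(mode);
    }

    public static void main(String[] args) {
        for (int mode : new int[] {PSM_AUTO_OSD, PSM_AUTO, PSM_SPARSE_TEXT}) {
            System.out.println(mode + ": osd=" + osdEnabled(mode)
                    + " sparse=" + sparse(mode)
                    + " orientation=" + orientationEnabled(mode));
        }
    }
}
```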
// PSM Testing Functions
public static boolean PSM_OSD_ENABLED(int pageseg_mode); // Test if OSD enabled
public static boolean PSM_ORIENTATION_ENABLED(int mode); // Test if orientation detection enabled
public static boolean PSM_COL_FIND_ENABLED(int mode); // Test if column finding enabled
public static boolean PSM_SPARSE(int mode); // Test if sparse mode
public static boolean PSM_BLOCK_FIND_ENABLED(int mode); // Test if block finding enabled
public static boolean PSM_LINE_FIND_ENABLED(int mode); // Test if line finding enabled
public static boolean PSM_WORD_FIND_ENABLED(int mode); // Test if word finding enabled
// PolyBlock Type Testing Functions
public static boolean PTIsLineType(int type); // Test if PolyBlockType is line
public static boolean PTIsImageType(int type); // Test if PolyBlockType is image
public static boolean PTIsTextType(int type); // Test if PolyBlockType is text
public static boolean PTIsPulloutType(int type); // Test if PolyBlockType is pullout

Configuration Testing Examples:
import org.bytedeco.tesseract.PageIterator;
import static org.bytedeco.tesseract.global.tesseract.*;
// Test if a PSM mode supports specific features
int psm = PSM_AUTO;
if (PSM_OSD_ENABLED(psm)) {
    System.out.println("Orientation and script detection enabled");
}
if (PSM_WORD_FIND_ENABLED(psm)) {
    System.out.println("Word-level analysis enabled");
}
if (PSM_SPARSE(psm)) {
    System.out.println("Sparse text mode enabled");
}
// Test block types during iteration
// (assumes api is an initialized TessBaseAPI with an image already set)
PageIterator pi = api.AnalyseLayout();
if (pi != null) { // guard against a failed or empty layout analysis
    pi.Begin();
    do {
        int blockType = pi.BlockType();
        if (PTIsTextType(blockType)) {
            System.out.println("Found text block");
        } else if (PTIsImageType(blockType)) {
            System.out.println("Found image block");
        } else if (PTIsLineType(blockType)) {
            System.out.println("Found line block");
        }
    } while (pi.Next(RIL_BLOCK));
}

// Optimize for form processing
api.SetPageSegMode(PSM_SINGLE_BLOCK);
api.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789");
api.SetVariable("preserve_interword_spaces", "1");

// Optimize for numeric content
api.SetPageSegMode(PSM_SINGLE_LINE);
api.SetVariable("classify_bln_numeric_mode", "1");
api.SetVariable("tessedit_char_whitelist", "0123456789.-");

// Settings for low-quality scans
api.SetVariable("tessedit_reject_mode", "0"); // Don't reject low-confidence words
api.SetVariable("textord_min_linesize", "1.0"); // Accept smaller text
api.SetVariable("edges_max_children_per_outline", "50"); // Allow more child outlines inside a character outline

// Maximum accuracy (slower processing)
api.SetPageSegMode(PSM_AUTO_OSD); // Full layout analysis with orientation detection
api.SetVariable("tessedit_enable_dict_correction", "1"); // Dictionary correction
api.SetVariable("classify_enable_learning", "1"); // Enable learning
api.SetVariable("classify_enable_adaptive_matcher", "1"); // Adaptive matching

// Faster processing (lower accuracy)
api.SetPageSegMode(PSM_SINGLE_BLOCK);
api.SetVariable("load_system_dawg", "0"); // Skip dictionary loading
api.SetVariable("load_freq_dawg", "0");
api.SetVariable("tessedit_enable_dict_correction", "0");
api.SetVariable("classify_enable_learning", "0");

textord_min_linesize - Minimum line size threshold
textord_max_noise_size - Maximum noise blob size
edges_max_children_per_outline - Maximum number of child outlines allowed inside a character outline
textord_debug_tabfind - Tab-stop finding debugging
classify_bln_numeric_mode - Numeric-only recognition mode
classify_enable_learning - Enable adaptive learning
classify_enable_adaptive_matcher - Use adaptive matching
tessedit_enable_dict_correction - Dictionary-based correction
tessedit_char_blacklist - Characters to ignore
tessedit_char_whitelist - Only recognize these characters
preserve_interword_spaces - Maintain spacing in output
tessedit_reject_mode - Word rejection strategy
tessedit_write_images - Save intermediate processing images
tessedit_dump_pageseg_images - Save page segmentation debug images
classify_debug_level - Character classification debug level

Install with Tessl CLI:
npx tessl i tessl/maven-org-bytedeco--tesseract