tessl/npm-tesseract-js

Pure JavaScript multilingual OCR library that brings the powerful Tesseract OCR engine to both browser and Node.js environments through WebAssembly


High-Level Functions

Name: tessl/npm-tesseract-js
Author: tessl

High-level functions provide convenient one-shot OCR operations without requiring manual worker management. They automatically create workers, perform the requested operation, and clean up resources, making them ideal for simple use cases or prototyping.
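The create, use, and clean-up lifecycle described above can be sketched as follows. This is an illustrative approximation, not the library's actual source: `oneShotRecognize` is a hypothetical name, the worker factory is passed in as `createWorkerFn` so the pattern can be tried with a stub, and the `(langs, oem, options)` argument order and the value `1` for the LSTM-only engine mode are assumptions about recent tesseract.js versions.

```javascript
// Sketch of the create -> use -> terminate lifecycle that a one-shot helper wraps.
// `createWorkerFn` is injected (hypothetical parameter); in real code you would
// pass tesseract.js's `createWorker`.
async function oneShotRecognize(createWorkerFn, image, langs = 'eng', options = {}) {
  // Assumption: the factory accepts (langs, oem, options) and 1 selects LSTM-only mode.
  const worker = await createWorkerFn(langs, 1, options);
  try {
    return await worker.recognize(image);
  } finally {
    // Always release the worker, even when recognition throws.
    await worker.terminate();
  }
}
```

Because a fresh worker is created and destroyed on every call, this pattern trades throughput for convenience, which is why the one-shot functions suit single images better than batches.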

Capabilities

Text Recognition

Performs text recognition on an image with automatic worker lifecycle management.

/**
 * Recognizes text from an image using automatic worker management
 * @param image - Image input in various supported formats
 * @param langs - Language code(s) for recognition (default: 'eng')
 * @param options - Worker configuration options
 * @returns Promise resolving to recognition results
 */
function recognize(
  image: ImageLike, 
  langs?: string, 
  options?: Partial<WorkerOptions>
): Promise<RecognizeResult>;

Usage Examples:

import { recognize } from 'tesseract.js';

// Basic recognition with English
const result = await recognize('https://example.com/image.png');
console.log(result.data.text);

// Recognition with specific language
const frenchResult = await recognize('french-document.jpg', 'fra');
console.log(frenchResult.data.text);

// Recognition with multiple languages
const multiResult = await recognize('mixed-text.png', 'eng+fra+deu');
console.log(multiResult.data.text);

// Recognition with custom options
const customResult = await recognize('image.png', 'eng', {
  // progress is reported as a fraction between 0 and 1
  logger: m => console.log(`Progress: ${Math.round(m.progress * 100)}%`)
});
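Logger messages appear to carry a `status` string and a `progress` fraction between 0 and 1, so a small formatter keeps the percentage conversion in one place. The helper below is a minimal sketch under that assumption; `formatProgress` is a hypothetical name and is not part of tesseract.js.

```javascript
// Turn a logger message into a readable line.
// Assumes `m` looks like { status: string, progress: number } with progress in [0, 1].
function formatProgress(m) {
  // Clamp to [0, 1] and fall back to 0 when progress is missing or not a number.
  const fraction = Number.isFinite(m.progress)
    ? Math.min(Math.max(m.progress, 0), 1)
    : 0;
  return `${m.status}: ${Math.round(fraction * 100)}%`;
}
```

It could then be passed as `logger: m => console.log(formatProgress(m))` in the options object.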

Orientation and Script Detection

Detects text orientation and script information from an image with automatic worker management.

/**
 * Detects orientation and script from an image using automatic worker management
 * @param image - Image input in various supported formats
 * @param options - Worker configuration options
 * @returns Promise resolving to detection results
 */
function detect(
  image: ImageLike, 
  options?: Partial<WorkerOptions>
): Promise<DetectResult>;

Usage Examples:

import { detect } from 'tesseract.js';

// Basic detection
const detection = await detect('rotated-image.png');
console.log(`Orientation: ${detection.data.orientation_degrees}°`);
console.log(`Script: ${detection.data.script}`);
console.log(`Script confidence: ${detection.data.script_confidence}`);

// Detection with custom options
const customDetection = await detect('image.png', {
  // progress is reported as a fraction between 0 and 1
  logger: m => console.log(`Detection progress: ${Math.round(m.progress * 100)}%`)
});
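Every field of the detection data is typed as nullable (see Result Types), so it can help to normalize the result before acting on it. The following is a minimal sketch; `summarizeDetection` is a hypothetical helper, and treating `null` as "unknown" is an assumption about how you want to handle failed detection.

```javascript
// Summarize a DetectData-like object, treating null fields as "unknown".
function summarizeDetection(data) {
  const degrees = data.orientation_degrees;
  return {
    // true when an orientation value is available at all
    known: degrees !== null && degrees !== undefined,
    // true only when a numeric, non-zero rotation was reported
    rotated: typeof degrees === 'number' && degrees % 360 !== 0,
    script: data.script ?? 'unknown',
  };
}
```

For example, `summarizeDetection((await detect('rotated-image.png')).data)` yields an object that is safe to branch on without repeated null checks.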

Comparison with Worker API

When to Use High-Level Functions

Best for:

Quick prototyping and testing
Single image processing
Simple scripts and demos
Cases where performance is not critical (every call creates and tears down its own worker)
One-off OCR operations

Example scenarios:

// Quick text extraction from a single image
const text = (await recognize('receipt.jpg')).data.text;

// Check if an image contains rotated text (orientation_degrees may be null)
const { orientation_degrees } = (await detect('document.png')).data;
const isRotated = orientation_degrees !== null && orientation_degrees !== 0;

When to Use Worker API

Best for:

Processing multiple images
Applications requiring configuration changes
Performance-critical applications
Advanced parameter tuning
Resource-intensive operations

Example scenarios:

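import { createWorker } from 'tesseract.js';

// Process many images efficiently by reusing one worker
const worker = await createWorker('eng');
try {
  for (const image of images) {
    const result = await worker.recognize(image);
    // Process result
  }
} finally {
  // Release the worker even if an image fails
  await worker.terminate();
}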

Complete Usage Examples

Document Processing Pipeline

import { recognize, detect } from 'tesseract.js';

async function processDocument(imagePath) {
  try {
    // First detect orientation
    const detection = await detect(imagePath);
    
    // orientation_degrees may be null; null and 0 both mean there is no rotation to report
    if (detection.data.orientation_degrees) {
      console.log(`Image is rotated ${detection.data.orientation_degrees}°`);
    }
    
    // Then recognize text
    const result = await recognize(imagePath, 'eng');
    
    return {
      text: result.data.text,
      confidence: result.data.confidence,
      orientation: detection.data.orientation_degrees,
      script: detection.data.script
    };
  } catch (error) {
    console.error('Processing failed:', error);
    throw error;
  }
}

// Usage
const doc = await processDocument('scanned-page.jpg');
console.log(doc.text);

Multi-Language Document Analysis

import { recognize, detect } from 'tesseract.js';

async function analyzeMultilingualDocument(imagePath) {
  // Detect script first
  const detection = await detect(imagePath);
  console.log(`Detected script: ${detection.data.script}`);
  
  // Choose appropriate languages based on script
  let langs = 'eng'; // default
  if (detection.data.script === 'Latin') {
    langs = 'eng+fra+deu+spa'; // Common Latin script languages
  } else if (detection.data.script === 'Han') {
    langs = 'chi_sim+chi_tra'; // Chinese variants
  }
  
  // Recognize with appropriate languages
  const result = await recognize(imagePath, langs);
  
  return {
    detectedScript: detection.data.script,
    usedLanguages: langs,
    text: result.data.text,
    confidence: result.data.confidence
  };
}

// Usage
const analysis = await analyzeMultilingualDocument('multilingual-doc.png');
console.log(`Script: ${analysis.detectedScript}`);
console.log(`Languages used: ${analysis.usedLanguages}`);
console.log(`Text: ${analysis.text}`);
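As more scripts are added, the `if`/`else` chain above can be replaced by a lookup table. The sketch below assumes the script names `'Latin'` and `'Han'` used in the example; `LANGS_BY_SCRIPT` and `pickLanguages` are hypothetical names, and the language lists are illustrative rather than recommended defaults.

```javascript
// Hypothetical lookup table; extend it with the scripts and languages you need.
const LANGS_BY_SCRIPT = new Map([
  ['Latin', ['eng', 'fra', 'deu', 'spa']],
  ['Han', ['chi_sim', 'chi_tra']],
]);

// Return a '+'-joined language string for a detected script,
// falling back to English when the script is null or not in the table.
function pickLanguages(script, fallback = ['eng']) {
  const langs = LANGS_BY_SCRIPT.get(script) ?? fallback;
  return langs.join('+');
}
```

The returned string can be passed directly as the `langs` argument of `recognize`.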

Batch Processing with High-Level Functions

import { recognize } from 'tesseract.js';

async function processBatch(imagePaths, concurrency = 3) {
  const results = [];
  
  // Process in chunks to avoid overwhelming the system
  for (let i = 0; i < imagePaths.length; i += concurrency) {
    const chunk = imagePaths.slice(i, i + concurrency);
    
    console.log(`Processing chunk ${Math.floor(i/concurrency) + 1}/${Math.ceil(imagePaths.length/concurrency)}`);
    
    const chunkResults = await Promise.all(
      chunk.map(async (imagePath, index) => {
        try {
          const result = await recognize(imagePath, 'eng', {
            // progress is a fraction between 0 and 1
            logger: m => console.log(`Image ${i + index + 1}: ${m.status} - ${Math.round(m.progress * 100)}%`)
          });
          return { imagePath, text: result.data.text, success: true };
        } catch (error) {
          console.error(`Failed to process ${imagePath}:`, error);
          return { imagePath, error: error.message, success: false };
        }
      })
    );
    
    results.push(...chunkResults);
  }
  
  return results;
}

// Usage
// Inputs must be images; tesseract.js does not read PDF files
const imagePaths = ['doc1.jpg', 'doc2.png', 'doc3.png', 'doc4.jpg'];
const results = await processBatch(imagePaths);

const successful = results.filter(r => r.success);
const failed = results.filter(r => !r.success);

console.log(`Successfully processed: ${successful.length}`);
console.log(`Failed: ${failed.length}`);
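Chunked processing waits for the slowest image in each chunk before starting the next one. An alternative is a small pool that starts the next image as soon as any slot frees up. The helper below is a generic sketch, not part of tesseract.js: `mapWithConcurrency` and its `task` parameter are hypothetical, and in practice `task` might be `path => recognize(path, 'eng')`.

```javascript
// Run `task` over `items` with at most `limit` tasks in flight at once.
// Results keep the input order; failures are captured instead of rejecting.
async function mapWithConcurrency(items, limit, task) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to claim

  async function runner() {
    while (next < items.length) {
      const index = next++;
      try {
        const value = await task(items[index], index);
        results[index] = { item: items[index], value, success: true };
      } catch (error) {
        results[index] = { item: items[index], error: error.message, success: false };
      }
    }
  }

  // Start up to `limit` runners (at least one) that pull from the shared index.
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

Since each high-level call manages its own worker, `limit` also bounds how many workers exist at the same time.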

Error Handling Best Practices

import { recognize } from 'tesseract.js';

async function robustOCR(imagePath, retries = 2) {
  for (let attempt = 1; attempt <= retries + 1; attempt++) {
    try {
      console.log(`Attempt ${attempt} for ${imagePath}`);
      
      const result = await recognize(imagePath, 'eng', {
        logger: m => {
          if (m.status === 'recognizing text') {
            // progress is a fraction between 0 and 1
            console.log(`Progress: ${Math.round(m.progress * 100)}%`);
          }
        },
        errorHandler: (error) => {
          console.warn(`Worker warning:`, error);
        }
      });
      
      if (result.data.confidence < 50) {
        console.warn(`Low confidence (${result.data.confidence}%) for ${imagePath}`);
      }
      
      return result;
      
    } catch (error) {
      console.error(`Attempt ${attempt} failed:`, error.message);
      
      if (attempt === retries + 1) {
        throw new Error(`OCR failed after ${retries + 1} attempts: ${error.message}`);
      }
      
      // Wait before retry
      await new Promise(resolve => setTimeout(resolve, 1000 * attempt));
    }
  }
}

// Usage with error handling
try {
  const result = await robustOCR('difficult-image.jpg');
  console.log('OCR successful:', result.data.text);
} catch (error) {
  console.error('OCR ultimately failed:', error.message);
}

Result Types

The high-level functions return the same result types as the Worker API:

interface RecognizeResult {
  jobId: string;
  data: Page;
}

interface DetectResult {
  jobId: string;
  data: DetectData;
}

interface DetectData {
  tesseract_script_id: number | null;
  script: string | null;
  script_confidence: number | null;
  orientation_degrees: number | null;
  orientation_confidence: number | null;
}
