tessl/github-zxing-cpp--zxing-cpp

tessl install tessl/github-zxing-cpp--zxing-cpp@2.3.0

Open-source, multi-format linear/matrix barcode image processing library implemented in C++

Character Encodings

ZXing-C++ supports multiple character encodings for barcode content. The CharacterSet enum provides encoding identification, and ECI (Extended Channel Interpretation) enables encoding switches within barcode data.

CharacterSet Enum

enum class CharacterSet : unsigned char {
    Unknown = 0,

    // ASCII and Latin
    ASCII,
    ISO8859_1,   // Latin-1 (Western European)
    ISO8859_2,   // Latin-2 (Central European)
    ISO8859_3,   // Latin-3 (South European)
    ISO8859_4,   // Latin-4 (North European)
    ISO8859_5,   // Cyrillic
    ISO8859_6,   // Arabic
    ISO8859_7,   // Greek
    ISO8859_8,   // Hebrew
    ISO8859_9,   // Latin-5 (Turkish)
    ISO8859_10,  // Latin-6 (Nordic)
    ISO8859_11,  // Thai
    ISO8859_13,  // Latin-7 (Baltic Rim)
    ISO8859_14,  // Latin-8 (Celtic)
    ISO8859_15,  // Latin-9 (Western European with Euro)
    ISO8859_16,  // Latin-10 (South-Eastern European)

    // Code Pages
    Cp437,       // DOS Latin US
    Cp1250,      // Windows Central European
    Cp1251,      // Windows Cyrillic
    Cp1252,      // Windows Western European
    Cp1256,      // Windows Arabic

    // Asian Encodings
    Shift_JIS,   // Japanese (Shift JIS)
    Big5,        // Traditional Chinese
    GB2312,      // Simplified Chinese
    GB18030,     // Chinese National Standard
    EUC_JP,      // Japanese (EUC-JP)
    EUC_KR,      // Korean (EUC-KR)

    // Unicode
    UTF8,
    UTF16BE,     // UTF-16 Big Endian
    UTF16LE,     // UTF-16 Little Endian
    UTF32BE,     // UTF-32 Big Endian
    UTF32LE,     // UTF-32 Little Endian
    UnicodeBig,  // [Deprecated, use UTF16BE]

    // Special
    BINARY,      // Binary data (no text encoding)

    CharsetCount // Number of character sets
};

Working with Character Sets

Parsing from String

// Parse character set name
ZXing::CharacterSet cs = ZXing::CharacterSetFromString("UTF-8");

if (cs == ZXing::CharacterSet::UTF8) {
    std::cout << "Parsed UTF-8\n";
}

// Various name formats are supported
ZXing::CharacterSet cs1 = ZXing::CharacterSetFromString("UTF8");
ZXing::CharacterSet cs2 = ZXing::CharacterSetFromString("utf-8");
ZXing::CharacterSet cs3 = ZXing::CharacterSetFromString("UTF_8");
// All parse to CharacterSet::UTF8

// Unknown names return Unknown
ZXing::CharacterSet unknown = ZXing::CharacterSetFromString("INVALID");
assert(unknown == ZXing::CharacterSet::Unknown);
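
The tolerance for spelling variants can be pictured as a normalization step that runs before the name lookup. The sketch below is a standalone illustration, not the library's implementation: `normalizeCharsetName` is a hypothetical helper that lowercases the name and drops `-`, `_` and spaces, which is enough to make the three spellings above compare equal.

```cpp
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper (not a ZXing-C++ function): reduce a character set
// name to a canonical key by lowercasing it and dropping '-', '_' and ' '.
std::string normalizeCharsetName(std::string_view name) {
    std::string key;
    key.reserve(name.size());
    for (char ch : name) {
        if (ch == '-' || ch == '_' || ch == ' ')
            continue;  // separators carry no meaning for the comparison
        key += static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    }
    return key;
}
```

With this key, "UTF8", "utf-8" and "UTF_8" all map to the same table entry, so a lookup table only needs one row per encoding.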

Converting to String

ZXing::CharacterSet cs = ZXing::CharacterSet::UTF8;
std::string name = ZXing::ToString(cs);
std::cout << "Encoding: " << name << "\n";  // Prints the library's name for the UTF8 character set

Setting Fallback Encoding

When a barcode does not specify an encoding, set a fallback character set in the reader options:

// Set fallback character set in reader options
auto options = ZXing::ReaderOptions()
    .setCharacterSet(ZXing::CharacterSet::ISO8859_1);

// Or set from string
options.setCharacterSet("ISO-8859-1");

auto barcode = ZXing::ReadBarcode(image, options);

// If barcode has no ECI, fallback encoding is used
std::string text = barcode.text();
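
To make concrete what "decoding with a fallback" means, the sketch below converts ISO-8859-1 bytes to UTF-8 by hand. It is a standalone illustration and not how the library performs the conversion; `latin1ToUtf8` is a hypothetical helper name. ISO-8859-1 is the easy case because each byte value equals its Unicode code point.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not a ZXing-C++ function): interpret each input byte
// as an ISO-8859-1 code point and emit its UTF-8 encoding.
std::string latin1ToUtf8(std::string_view bytes) {
    std::string out;
    out.reserve(bytes.size());
    for (char ch : bytes) {
        unsigned char b = static_cast<unsigned char>(ch);
        if (b < 0x80) {
            out += static_cast<char>(b);                  // ASCII maps to itself
        } else {
            out += static_cast<char>(0xC0 | (b >> 6));    // lead byte 110000xx
            out += static_cast<char>(0x80 | (b & 0x3F));  // continuation 10xxxxxx
        }
    }
    return out;
}
```

The same raw bytes would produce different text under another fallback such as Shift_JIS, which is why the choice matters for barcodes that carry no ECI.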

ECI (Extended Channel Interpretation)

ECI enables encoding switches within barcode data. QR codes and other 2D formats can embed ECI values to specify character encodings.

Checking for ECI

auto barcode = ZXing::ReadBarcode(image);

if (barcode.hasECI()) {
    std::cout << "Barcode uses ECI encoding\n";

    // Get bytes following ECI protocol
    ZXing::ByteArray eciBytes = barcode.bytesECI();

    // Get text with ECI preservation
    std::string eciText = barcode.text(ZXing::TextMode::ECI);
} else {
    std::cout << "No ECI - using default/guessed encoding\n";
}

ECI Text Mode

The ECI text mode preserves encoding information:

// Get text following ECI protocol
std::string eciText = barcode.text(ZXing::TextMode::ECI);

// ECI format: segments preceded by ECI designators
// Illustrative example: "\000026Hello\000020こんにちは"

ECI text format:

  • \nnnnnn designators (a backslash followed by six digits) mark encoding switches
  • nnnnnn is the 6-digit ECI value (e.g., 000026 for UTF-8)
  • A designator applies to the text that follows it, up to the next designator
  • A literal backslash in the data is doubled (\\)
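
The format rules above can be turned into a small splitter. This is a sketch under stated assumptions, not a ZXing-C++ API: it assumes a designator is a backslash followed by exactly six decimal digits and that a literal backslash in the data is doubled, as the ECI protocol defines. `EciSegment` and `splitEciText` are hypothetical names.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper type, not part of ZXing-C++.
struct EciSegment {
    int eci;           // ECI value in effect, or -1 if no designator was seen yet
    std::string text;  // segment content with doubled backslashes collapsed
};

// Split ECI-protocol text into segments. Assumes "\nnnnnn" designators
// (six digits) and "\\" for a literal backslash.
std::vector<EciSegment> splitEciText(std::string_view in) {
    std::vector<EciSegment> out;
    EciSegment cur{-1, {}};
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] != '\\') {  // ordinary character
            cur.text += in[i++];
            continue;
        }
        if (i + 1 < in.size() && in[i + 1] == '\\') {  // escaped backslash
            cur.text += '\\';
            i += 2;
            continue;
        }
        // Try to read six digits after the backslash.
        bool isDesignator = i + 6 < in.size();
        int value = 0;
        for (std::size_t k = 1; isDesignator && k <= 6; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if (std::isdigit(c))
                value = value * 10 + (c - '0');
            else
                isDesignator = false;
        }
        if (!isDesignator) {  // malformed input: keep the backslash as data
            cur.text += in[i++];
            continue;
        }
        if (!cur.text.empty() || cur.eci != -1)
            out.push_back(cur);  // close the segment that just ended
        cur = EciSegment{value, {}};
        i += 7;  // skip the backslash and six digits
    }
    if (!cur.text.empty() || cur.eci != -1)
        out.push_back(cur);
    return out;
}
```

Any text that appears before the first designator ends up in a leading segment with `eci == -1`, so callers can decide how to treat it.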

Common ECI Values

// Common ECI assignments
// ECI 000003 = ISO-8859-1
// ECI 000020 = Shift_JIS (Japanese)
// ECI 000026 = UTF-8
// ECI 000028 = Big5 (Traditional Chinese)
// ECI 000029 = GB2312 (Simplified Chinese)

Text Rendering Modes

Different text modes handle encoding differently:

Plain Mode

// Plain mode: decode the bytes to a Unicode string
std::string text = barcode.text(ZXing::TextMode::Plain);

// Uses ECI if present, otherwise:
// 1. Try UTF-8
// 2. Try fallback encoding from ReaderOptions
// 3. Guess encoding from content

ECI Mode

// ECI mode: keep the ECI designators in the output
std::string eciText = barcode.text(ZXing::TextMode::ECI);

// Includes \nnnnnn designators for encoding switches
// Useful for:
// - Re-encoding the barcode
// - Processing multi-encoding content
// - Debugging encoding issues

HRI Mode (Default)

// HRI mode: Human Readable Interpretation
// This is the mode text() uses when called without arguments,
// unless ReaderOptions::setTextMode() selects another one.
std::string hri = barcode.text(ZXing::TextMode::HRI);

// Interpretation depends on the content type
// May differ from raw content for GS1, structured data, etc.

Hex Mode

// Hex mode: raw bytes as hex string
std::string hex = barcode.text(ZXing::TextMode::Hex);

// Encoding-independent byte representation
// Example: the bytes 48 65 6C 6C 6F for "Hello"
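
For readers who want the same kind of dump from their own byte buffers, here is a minimal stdlib-only sketch. `toHex` is a hypothetical helper, not the library's function, and the separator parameter is an assumption added so the output can be written either packed or spaced.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not a ZXing-C++ function): render bytes as uppercase
// hex pairs, joined by an optional separator.
std::string toHex(const std::vector<std::uint8_t>& bytes, std::string_view sep = "") {
    static constexpr char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0)
            out += sep;                  // separator only between pairs
        out += digits[bytes[i] >> 4];    // high nibble
        out += digits[bytes[i] & 0x0F];  // low nibble
    }
    return out;
}
```

Because the result depends only on the byte values, it is a safe way to log content whose encoding is unknown.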

Escaped Mode

// Escaped mode: escape non-graphical characters
std::string escaped = barcode.text(ZXing::TextMode::Escaped);

// Non-graphical characters are shown as bracketed names such as <GS>
// Example: "Hello<GS>World" for data with a GS (ASCII 29) separator
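
The idea behind this mode can be sketched in a few lines. The function below is a simplified stand-in written for illustration, not the library's escaping routine: `escapeControlChars` is a hypothetical name, and it only covers the ASCII control range (0-31 and 127), using the standard ASCII control-code abbreviations.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for the library's escaping: replace ASCII control
// characters (0-31 and 127) with their bracketed names, e.g. 29 -> "<GS>".
std::string escapeControlChars(std::string_view in) {
    static const char* const names[32] = {
        "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
        "BS",  "HT",  "LF",  "VT",  "FF",  "CR",  "SO",  "SI",
        "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
        "CAN", "EM",  "SUB", "ESC", "FS",  "GS",  "RS",  "US"};
    std::string out;
    for (char ch : in) {
        unsigned char c = static_cast<unsigned char>(ch);
        if (c < 32) {
            out += '<';
            out += names[c];
            out += '>';
        } else if (c == 127) {
            out += "<DEL>";
        } else {
            out += ch;  // printable ASCII and non-ASCII bytes pass through
        }
    }
    return out;
}
```

This is handy for logging GS1 data, where the GS separator is otherwise invisible in a terminal.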

Encoding Detection

Automatic Detection

Without ECI information, ZXing-C++ attempts to detect the encoding:

auto options = ZXing::ReaderOptions();
// No character set specified

auto barcode = ZXing::ReadBarcode(image, options);

// Encoding is automatically detected/guessed
std::string text = barcode.text();

Detection heuristics:

  1. Check for UTF-8 byte patterns
  2. Check for common single-byte encodings
  3. Use statistical analysis for Asian encodings
  4. Fall back to ISO-8859-1
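
Step 1 of the list above can be illustrated with a structural UTF-8 check. This is a simplified sketch, not the library's detector: `looksLikeUtf8` is a hypothetical helper that only verifies lead and continuation byte patterns, so it does not reject every overlong or surrogate form that a strict validator would.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not a ZXing-C++ function): return true if every byte
// sequence has a valid UTF-8 lead byte followed by the right number of
// continuation bytes (10xxxxxx).
bool looksLikeUtf8(std::string_view s) {
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)
            extra = 0;
        else if (lead >= 0xC2 && lead <= 0xDF)
            extra = 1;
        else if (lead >= 0xE0 && lead <= 0xEF)
            extra = 2;
        else if (lead >= 0xF0 && lead <= 0xF4)
            extra = 3;
        else
            return false;  // stray continuation byte or invalid lead byte
        if (extra > s.size() - i - 1)
            return false;  // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80)
                return false;  // not a continuation byte
        }
        i += extra + 1;
    }
    return true;
}
```

Pure ASCII input passes this check as well, so a detector built on it still needs the later steps to tell single-byte encodings apart.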

Explicit Fallback

Specify encoding when automatic detection fails:

// For Japanese content without ECI
auto options = ZXing::ReaderOptions()
    .setCharacterSet(ZXing::CharacterSet::Shift_JIS);

auto barcode = ZXing::ReadBarcode(image, options);
std::string text = barcode.text();  // Decoded as Shift_JIS

Binary Content

Detecting Binary Data

if (barcode.contentType() == ZXing::ContentType::Binary) {
    std::cout << "Binary content detected\n";

    // Get raw bytes
    const ZXing::ByteArray& bytes = barcode.bytes();

    // Process as binary
    processBytes(bytes.data(), bytes.size());
}

Forcing Binary Mode

// Always treat as binary - don't decode to text
const ZXing::ByteArray& bytes = barcode.bytes();

// Or get hex representation
std::string hex = barcode.text(ZXing::TextMode::Hex);

Encoding by Format

QR Code

QR codes support:

  • Full ECI support for encoding switches
  • Default: ISO-8859-1 or Shift_JIS (depending on content)
  • Automatic mode switching for efficiency

if (barcode.format() == ZXing::BarcodeFormat::QRCode) {
    if (barcode.hasECI()) {
        // Encoding explicitly specified
    } else {
        // ISO-8859-1 or Shift_JIS based on content
    }
}

Data Matrix

Data Matrix supports:

  • Full ECI support
  • Default: ISO-8859-1

if (barcode.format() == ZXing::BarcodeFormat::DataMatrix) {
    // Similar ECI handling as QR Code
}

PDF417

PDF417 supports:

  • ECI for encoding specification
  • Default: Cp437 (DOS Latin)

Aztec

Aztec supports:

  • ECI for encoding
  • Default: ISO-8859-1

Linear Barcodes

Linear barcodes typically have:

  • No ECI support
  • Limited character sets (numeric, alphanumeric)
  • ASCII or ISO-8859-1 for alphanumeric formats

// Code 128 can encode full ASCII
if (barcode.format() == ZXing::BarcodeFormat::Code128) {
    // ASCII encoding
    std::string text = barcode.text();
}

// EAN/UPC encode only digits
if (barcode.format() == ZXing::BarcodeFormat::EAN13) {
    // Numeric only
    std::string digits = barcode.text();
}

Multi-Language Content

Single Barcode, Multiple Languages

With ECI, a single barcode can contain multiple languages:

// QR code with English and Japanese
// Content has ECI switches for each language segment

std::string eciText = barcode.text(ZXing::TextMode::ECI);
// Illustrative example: "\000026Hello\000020こんにちは"

// Plain mode attempts unified decoding
std::string plainText = barcode.text(ZXing::TextMode::Plain);
// Example: "Helloこんにちは" (if supported by runtime encoding)

Handling Encoding Errors

auto barcode = ZXing::ReadBarcode(image);

if (barcode.contentType() == ZXing::ContentType::UnknownECI) {
    std::cerr << "Unknown ECI value encountered\n";

    // Try raw bytes
    const ZXing::ByteArray& bytes = barcode.bytes();

    // Or hex representation
    std::string hex = barcode.text(ZXing::TextMode::Hex);
}

Best Practices

UTF-8 as Default

UTF-8 handles most modern use cases:

// Prefer UTF-8 for new applications
auto options = ZXing::ReaderOptions()
    .setCharacterSet(ZXing::CharacterSet::UTF8);

Check Content Type

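// Each case that declares a variable gets its own scope; without the braces
// the later case labels would jump past an initialization and fail to compile.
switch (barcode.contentType()) {
    case ZXing::ContentType::Text: {
        // Safe to use as text
        std::string text = barcode.text();
        break;
    }

    case ZXing::ContentType::Binary: {
        // Treat as binary data
        const ZXing::ByteArray& bytes = barcode.bytes();
        processBytes(bytes.data(), bytes.size());
        break;
    }

    case ZXing::ContentType::GS1: {
        // GS1 formatted data
        std::string gs1 = barcode.text();
        break;
    }

    case ZXing::ContentType::Mixed:
        // Mixed text/binary - be careful
        break;

    case ZXing::ContentType::UnknownECI: {
        // Unknown encoding - handle gracefully
        std::string hex = barcode.text(ZXing::TextMode::Hex);
        break;
    }

    default:
        // Any other content type (e.g. ISO15434)
        break;
}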

Preserve Original Encoding

For re-encoding or processing:

// Get original bytes
const ZXing::ByteArray& originalBytes = barcode.bytes();

// Or ECI bytes if applicable
ZXing::ByteArray eciBytes = barcode.bytesECI();

// Store for later use
storeOriginal(originalBytes);

Handle Non-Text Content

// Check if content is text before processing as string
if (barcode.contentType() != ZXing::ContentType::Binary) {
    std::string text = barcode.text();
    processText(text);
} else {
    const ZXing::ByteArray& bytes = barcode.bytes();
    processBinary(bytes);
}

Creating Encoded Barcodes

When using the experimental writing API:

#ifdef ZXING_EXPERIMENTAL_API
// UTF-8 text (recommended)
ZXing::CreatorOptions options(ZXing::BarcodeFormat::QRCode);
auto barcode = ZXing::CreateBarcodeFromText(u8"Hello 世界", options);

// Binary data
std::vector<uint8_t> data = {0x01, 0x02, 0x03, 0x04};
auto barcode2 = ZXing::CreateBarcodeFromBytes(
    data.data(), data.size(), options);
#endif

Related

  • Barcode Results - Accessing encoded content
  • Barcode Reading - Configuring encoding fallback
  • Reader Options - Setting character set options
  • Errors - Handling encoding errors