tessl/npm-bytebuffer

The swiss army knife for binary data in JavaScript

String Operations

Read and write operations for UTF-8 strings in several framings: raw (no length information), null-terminated (C-style), and length-prefixed with either a fixed 32-bit or a varint32 prefix. Static helpers calculate the UTF-8 byte and character counts of a string.

Capabilities

UTF-8 String Operations

Read and write UTF-8 encoded strings with various length encoding schemes.

/**
 * Write UTF-8 encoded string without length prefix
 * @param {string} str - String to write
 * @param {number} offset - Offset to write at (default: current offset)
 * @returns {ByteBuffer|number} This ByteBuffer for chaining when offset is omitted, otherwise the number of bytes written
 */
writeUTF8String(str, offset);

/**
 * Alias for writeUTF8String
 */
writeString(str, offset);

/**
 * Read UTF-8 encoded string
 * @param {number} length - Number of bytes (if metrics='b') or characters (if metrics='c') to read
 * @param {string} metrics - Length interpretation: 'b' for bytes, 'c' for characters (default: 'c')
 * @param {number} offset - Offset to read from (default: current offset)
 * @returns {string|{string: string, length: number}} String read, or, when offset is given, the string plus the number of bytes read
 */
readUTF8String(length, metrics, offset);

/**
 * Alias for readUTF8String
 */
readString(length, metrics, offset);

Usage Examples:

const ByteBuffer = require("bytebuffer");
const bb = ByteBuffer.allocate(64);

// Write UTF-8 strings
bb.writeUTF8String("Hello World!");
bb.writeString("Café"); // UTF-8 handles accented characters
bb.writeUTF8String("🚀 Emoji"); // UTF-8 handles emoji

// Calculate string sizes for reading
const hello = "Hello World!";
const helloBytes = ByteBuffer.calculateUTF8Bytes(hello);
const helloChars = ByteBuffer.calculateUTF8Chars(hello);

console.log(`"${hello}": ${helloBytes} bytes, ${helloChars} chars`);

// Read back using byte count (metrics passed explicitly)
bb.flip();
const readHello = bb.readUTF8String(12, 'b'); // 12 bytes
const readCafe = bb.readString(5, 'b');       // 5 bytes (é is 2 bytes in UTF-8)
const readEmoji = bb.readUTF8String(10, 'b'); // 10 bytes (🚀 is 4 bytes)

console.log(readHello); // "Hello World!"
console.log(readCafe);  // "Café"
console.log(readEmoji); // "🚀 Emoji"

// Read using character count
bb.clear();
bb.writeUTF8String("Café");
bb.flip();
const cafeByChars = bb.readUTF8String(4, 'c'); // 4 characters
console.log(cafeByChars); // "Café"

C-Style Null-Terminated Strings

Read and write null-terminated strings (C-style strings).

/**
 * Write null-terminated string (C-style string)
 * @param {string} str - String to write (null terminator added automatically)
 * @param {number} offset - Offset to write at (default: current offset)
 * @returns {ByteBuffer|number} This ByteBuffer for chaining when offset is omitted, otherwise the number of bytes written
 */
writeCString(str, offset);

/**
 * Read null-terminated string (C-style string)
 * @param {number} offset - Offset to read from (default: current offset)
 * @returns {string|{string: string, length: number}} String read, or, when offset is given, the string plus the number of bytes read
 */
readCString(offset);

Usage Examples:

const bb = ByteBuffer.allocate(64);

// Write C-style strings (null-terminated)
bb.writeCString("Hello");
bb.writeCString("World");
bb.writeCString(""); // Empty string

// Read back - automatically stops at null terminator
bb.flip();
const str1 = bb.readCString(); // "Hello"
const str2 = bb.readCString(); // "World" 
const str3 = bb.readCString(); // ""

console.log(`Read: "${str1}", "${str2}", "${str3}"`);

// C-strings include null terminator in byte count
bb.clear();
bb.writeCString("Test");
console.log(bb.offset); // 5 (4 chars + 1 null terminator)
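
The framing itself can be illustrated without the library. The following is a minimal sketch on a plain Uint8Array, assuming well-formed UTF-8 input; encodeCString and decodeCString are hypothetical helper names used only for illustration and are not ByteBuffer methods.

```javascript
// Conceptual sketch of null-terminated framing; not ByteBuffer's implementation.
function encodeCString(str) {
    const data = new TextEncoder().encode(str);
    // Uint8Array is zero-filled, so the extra last byte acts as the terminator
    const out = new Uint8Array(data.length + 1);
    out.set(data, 0);
    return out;
}

function decodeCString(bytes, offset = 0) {
    // Scan forward for the first 0 byte starting at offset
    const end = bytes.indexOf(0, offset);
    if (end === -1) throw new RangeError("No null terminator found");
    return {
        string: new TextDecoder().decode(bytes.subarray(offset, end)),
        length: end - offset + 1, // bytes consumed, including the terminator
    };
}

const framed = encodeCString("Test");
console.log(framed.length);                // 5
console.log(decodeCString(framed).string); // "Test"
console.log(decodeCString(framed).length); // 5
```

Because the reader has to scan for the terminator, a string containing an embedded "\0" cannot round-trip through this framing.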

Length-Prefixed Strings

Read and write strings with various length-prefix encodings.

/**
 * Write string with 32-bit unsigned integer length prefix
 * @param {string} str - String to write
 * @param {number} offset - Offset to write at (default: current offset)
 * @returns {ByteBuffer|number} This ByteBuffer for chaining when offset is omitted, otherwise the number of bytes written
 */
writeIString(str, offset);

/**
 * Read string with 32-bit unsigned integer length prefix
 * @param {number} offset - Offset to read from (default: current offset)
 * @returns {string|{string: string, length: number}} String read, or, when offset is given, the string plus the number of bytes read
 */
readIString(offset);

/**
 * Write string with varint32 length prefix
 * @param {string} str - String to write
 * @param {number} offset - Offset to write at (default: current offset)
 * @returns {ByteBuffer|number} This ByteBuffer for chaining when offset is omitted, otherwise the number of bytes written
 */
writeVString(str, offset);

/**
 * Read string with varint32 length prefix
 * @param {number} offset - Offset to read from (default: current offset)
 * @returns {string|{string: string, length: number}} String read, or, when offset is given, the string plus the number of bytes read
 */
readVString(offset);

Usage Examples:

const bb = ByteBuffer.allocate(128);

// IString - uses 4-byte uint32 length prefix
bb.writeIString("Hello World!");
bb.writeIString("Short");
bb.writeIString("");

// VString - uses varint32 length prefix (more space-efficient)
bb.writeVString("Hello World!"); 
bb.writeVString("Short");
bb.writeVString("");

// Read back IStrings
bb.flip();
const istr1 = bb.readIString(); // "Hello World!"
const istr2 = bb.readIString(); // "Short"
const istr3 = bb.readIString(); // ""

// Read back VStrings  
const vstr1 = bb.readVString(); // "Hello World!"
const vstr2 = bb.readVString(); // "Short"
const vstr3 = bb.readVString(); // ""

console.log("IStrings:", istr1, istr2, istr3);
console.log("VStrings:", vstr1, vstr2, vstr3);

// Compare space usage
bb.clear();
bb.writeIString("Hi");  // 4 bytes (length) + 2 bytes (data) = 6 bytes
const iStringSize = bb.offset;

bb.clear();
bb.writeVString("Hi");  // 1 byte (length) + 2 bytes (data) = 3 bytes
const vStringSize = bb.offset;

console.log(`IString: ${iStringSize} bytes, VString: ${vStringSize} bytes`);
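
To make the varint32 prefix concrete, here is a minimal sketch of the wire layout built by hand on a Uint8Array. It assumes the prefix counts UTF-8 bytes and uses the common base-128 varint encoding (7 data bits per byte, high bit as continuation flag). The helper names writeVarint32, encodeVString, and decodeVString are hypothetical and not part of ByteBuffer.

```javascript
// Illustrative sketch of a varint32 length prefix followed by UTF-8 data.
function writeVarint32(value, out) {
    value >>>= 0;
    while (value >= 0x80) {
        out.push((value & 0x7f) | 0x80); // low 7 bits, continuation bit set
        value >>>= 7;
    }
    out.push(value); // final byte, continuation bit clear
}

function encodeVString(str) {
    const data = new TextEncoder().encode(str);
    const out = [];
    writeVarint32(data.length, out); // prefix counts bytes, not characters
    return Uint8Array.from([...out, ...data]);
}

function decodeVString(bytes, offset = 0) {
    let length = 0, shift = 0, pos = offset, b;
    do {
        b = bytes[pos++];
        length |= (b & 0x7f) << shift;
        shift += 7;
    } while (b & 0x80);
    const string = new TextDecoder().decode(bytes.subarray(pos, pos + length));
    return { string, length: pos + length - offset }; // total bytes consumed
}

console.log(Array.from(encodeVString("Hi")));         // [ 2, 72, 105 ]
console.log(encodeVString("x".repeat(1000)).length);  // 1002 (2-byte prefix)
console.log(decodeVString(encodeVString("Café")).string); // "Café"
```

The prefix needs a second byte as soon as the data reaches 128 bytes, which is why a 1000-byte string is framed with 2 prefix bytes rather than 1.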

String Calculation Utilities

Calculate the number of bytes and characters needed for UTF-8 string encoding.

/**
 * Calculate number of bytes required to encode string as UTF-8
 * @param {string} str - String to calculate for
 * @returns {number} Number of bytes required
 */
ByteBuffer.calculateUTF8Bytes(str);

/**
 * Alias for calculateUTF8Bytes
 */
ByteBuffer.calculateString(str);

/**
 * Calculate number of characters (Unicode code points) in a string
 * @param {string} str - String to calculate for
 * @returns {number} Number of Unicode code points
 */
ByteBuffer.calculateUTF8Chars(str);

Usage Examples:

// Test various strings
const testStrings = [
    "Hello",           // ASCII characters
    "Café",            // Latin characters with accents
    "日本語",           // CJK characters  
    "🚀🌟✨",           // Emoji
    "𝓗𝓮𝓵𝓵𝓸",           // Mathematical script characters
];

testStrings.forEach(str => {
    const bytes = ByteBuffer.calculateUTF8Bytes(str);
    const chars = ByteBuffer.calculateUTF8Chars(str);
    const jsLength = str.length; // JavaScript string length (UTF-16 code units)
    
    console.log(`"${str}":
        UTF-8 bytes: ${bytes}
        UTF-8 chars: ${chars} 
        JS length: ${jsLength}`);
});

// Summary of the output, showing how byte count, character count, and JS length differ:
// "Hello": 5 bytes, 5 chars, 5 JS length
// "Café": 5 bytes, 4 chars, 4 JS length
// "日本語": 9 bytes, 3 chars, 3 JS length
// "🚀🌟✨": 11 bytes, 3 chars, 5 JS length (🚀 and 🌟 are surrogate pairs in JS; ✨ is a single UTF-16 unit and 3 UTF-8 bytes)
// "𝓗𝓮𝓵𝓵𝓸": 20 bytes, 5 chars, 10 JS length (math script uses surrogate pairs)
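
These counts can be cross-checked with built-in JavaScript features only. The sketch below relies on TextEncoder (which always produces UTF-8) and on string iteration (which walks code points); utf8Bytes and codePoints are hypothetical helper names, not ByteBuffer APIs.

```javascript
// Cross-check of UTF-8 byte and code point counts without the library.
const encoder = new TextEncoder();

function utf8Bytes(str) {
    // Length of the UTF-8 encoding in bytes
    return encoder.encode(str).length;
}

function codePoints(str) {
    // Spreading a string iterates code points, so a surrogate pair counts once
    return [...str].length;
}

for (const str of ["Hello", "Café", "日本語", "🚀🌟✨", "𝓗𝓮𝓵𝓵𝓸"]) {
    console.log(
        `"${str}": ${utf8Bytes(str)} bytes, ${codePoints(str)} code points, ${str.length} UTF-16 units`
    );
}
```

Only code points above U+FFFF take two UTF-16 units, so a string that mixes astral and BMP symbols has a JS length between its code point count and twice that count.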

String Metrics Constants

Constants for specifying how string lengths should be interpreted.

/**
 * Character-based metrics - interpret length as number of Unicode characters
 */
ByteBuffer.METRICS_CHARS = 'c';

/**
 * Byte-based metrics - interpret length as number of UTF-8 bytes
 */
ByteBuffer.METRICS_BYTES = 'b';

Usage Examples:

const bb = ByteBuffer.allocate(32);
const testString = "Café"; // 4 chars, 5 bytes in UTF-8

bb.writeUTF8String(testString);
bb.flip();

// Read by byte count
const byBytes = bb.readUTF8String(5, ByteBuffer.METRICS_BYTES);
console.log(byBytes); // "Café"

bb.offset = 0; // Reset for second read

// Read by character count  
const byChars = bb.readUTF8String(4, ByteBuffer.METRICS_CHARS);
console.log(byChars); // "Café"

// Demonstrate the difference with emoji
bb.clear();
const emojiString = "Hi🚀"; // 3 chars, 6 bytes
bb.writeUTF8String(emojiString);
bb.flip();

const emojiByBytes = bb.readUTF8String(6, 'b'); // 6 bytes

bb.offset = 0; // Reset before reading the same data again

const emojiByChars = bb.readUTF8String(3, 'c'); // 3 characters

// Both are "Hi🚀", read using different metrics
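
The difference between the two metrics comes down to how the end of the string is located. With byte metrics the end position is known up front. With character metrics the reader has to walk the UTF-8 sequences one by one. The sketch below illustrates that walk under the assumption of well-formed UTF-8; utf8PrefixBytes is a hypothetical helper, not a ByteBuffer method.

```javascript
// Sketch: how many bytes do the first `charCount` code points occupy?
function utf8PrefixBytes(bytes, charCount, offset = 0) {
    let pos = offset;
    for (let i = 0; i < charCount; i++) {
        if (pos >= bytes.length) {
            throw new RangeError(`Truncated data: ${i} of ${charCount} characters`);
        }
        const lead = bytes[pos];
        // The lead byte announces the length of the sequence
        if (lead < 0x80) pos += 1;      // 0xxxxxxx: ASCII
        else if (lead < 0xe0) pos += 2; // 110xxxxx
        else if (lead < 0xf0) pos += 3; // 1110xxxx
        else pos += 4;                  // 11110xxx
    }
    return pos - offset;
}

const hiRocket = new TextEncoder().encode("Hi🚀");
console.log(hiRocket.length);              // 6
console.log(utf8PrefixBytes(hiRocket, 3)); // 6
console.log(utf8PrefixBytes(hiRocket, 2)); // 2
```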

Advanced String Operations

Combining string framings in one buffer, chaining writes, and handling large or empty strings.

Usage Examples:

const bb = ByteBuffer.allocate(128);

// Write mixed content
bb.writeUTF8String("Start: ");
bb.writeIString("Middle part");
bb.writeCString("End");

// Chain string operations
bb.clear()
  .writeVString("First")
  .writeVString("Second") 
  .writeVString("Third");

// Read back in order
bb.flip();
const first = bb.readVString();   // "First"
const second = bb.readVString();  // "Second"
const third = bb.readVString();   // "Third"

// Working with large strings
const largeString = "x".repeat(10000);
bb.clear();
bb.ensureCapacity(ByteBuffer.calculateUTF8Bytes(largeString) + 10);
bb.writeVString(largeString);
bb.flip();
const readLarge = bb.readVString();
console.log(readLarge.length); // 10000

// Empty string handling
bb.clear();
bb.writeCString("");    // Just null terminator
bb.writeIString("");    // 4-byte zero length + no data
bb.writeVString("");    // 1-byte zero length + no data

bb.flip();
console.log(`"${bb.readCString()}"`);  // ""
console.log(`"${bb.readIString()}"`);  // "" 
console.log(`"${bb.readVString()}"`);  // ""

Error Handling

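String operations may encounter the following error conditions:

  • RangeError: When a read runs past the buffer limit or the data is truncated (for example, a C-string with no null terminator before the limit)
  • Error: When invalid UTF-8 sequences are encountered
  • RangeError: When an explicit offset lies outside the buffer; relative writes that exceed the capacity grow the backing buffer rather than throwing
  • TypeError: When non-string values are provided to string write methods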

Example Error Handling:

const bb = ByteBuffer.allocate(16);

try {
    // Relative writes grow the backing buffer as needed, so a large string
    // is not expected to throw here; the catch guards against other failures
    const largeString = "x".repeat(1000);
    bb.writeUTF8String(largeString);
} catch (error) {
    console.error("Write failed:", error.message);
}

try {
    // This throws RangeError: the read starts at the current offset and
    // would run past the end of the buffer
    bb.readUTF8String(100);
} catch (error) {
    console.error("Read beyond buffer:", error.message);
}

try {
    // This will throw TypeError
    bb.writeUTF8String(123); // Number instead of string
} catch (error) {
    console.error("Invalid type:", error.message);
}

// Handle null terminator edge cases
bb.clear();
bb.writeUTF8String("Test\0with\0nulls");
bb.flip();
const withNulls = bb.readCString(); // Stops at first null
console.log(`C-string: "${withNulls}"`); // "Test" (stops at \0)

Performance Considerations

  • UTF-8 encoding/decoding has computational overhead compared to ASCII
  • VString uses a 1-byte prefix for strings under 128 bytes and a 2-byte prefix under 16,384 bytes, versus IString's fixed 4 bytes
  • CString is space-efficient but requires scanning for the null terminator on read
  • Byte metrics are faster than character metrics because they avoid counting UTF-8 characters
  • Large strings may require buffer resizing, which involves memory allocation and copying

Performance Comparison:

// Space efficiency comparison for short strings
const shortString = "Hi";

// Method 1: C-String (3 bytes: 'H', 'i', '\0')
// Method 2: VString (3 bytes: 1-byte length + 'H', 'i') 
// Method 3: IString (6 bytes: 4-byte length + 'H', 'i')

// For longer strings, the length prefix overhead becomes negligible
const longString = "x".repeat(1000);
// C-String: 1001 bytes
// VString: 1002 bytes (2-byte varint length + 1000 data)
// IString: 1004 bytes (4-byte length + 1000 data)
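
The sizes above can be computed for any string with a small helper. This is a sketch under the assumption that length prefixes count UTF-8 bytes and that the varint uses 7 data bits per byte; varint32Size and framedSizes are hypothetical names introduced for illustration.

```javascript
// Sketch: framed size of a string under each of the three schemes.
function varint32Size(n) {
    let size = 1;
    while (n >= 0x80) { // each extra 7 bits of magnitude costs one more byte
        n >>>= 7;
        size++;
    }
    return size;
}

function framedSizes(str) {
    const dataBytes = new TextEncoder().encode(str).length;
    return {
        cString: dataBytes + 1,                       // data + null terminator
        vString: varint32Size(dataBytes) + dataBytes, // varint prefix + data
        iString: 4 + dataBytes,                       // uint32 prefix + data
    };
}

console.log(framedSizes("Hi"));             // { cString: 3, vString: 3, iString: 6 }
console.log(framedSizes("x".repeat(127)));  // { cString: 128, vString: 128, iString: 131 }
console.log(framedSizes("x".repeat(1000))); // { cString: 1001, vString: 1002, iString: 1004 }
```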

Install with Tessl CLI

npx tessl i tessl/npm-bytebuffer
