String and Encoding Operations

This section covers text encoding, decoding, and string conversion across multiple character encodings. These operations underpin text processing, protocol handling, and data interchange.

Capabilities

String Conversion

Convert buffer contents to strings using various character encodings.

/**
 * Convert buffer to string using specified encoding
 * @param encoding - Character encoding (default: 'utf8')
 * @param start - Start offset for conversion (default: 0)
 * @param end - End offset for conversion (default: buffer.length)
 * @returns Decoded string
 */
toString(encoding?: string, start?: number, end?: number): string;

/**
 * Alias for toString() method
 * @param encoding - Character encoding (default: 'utf8')
 * @param start - Start offset for conversion (default: 0)
 * @param end - End offset for conversion (default: buffer.length)
 * @returns Decoded string
 */
toLocaleString(encoding?: string, start?: number, end?: number): string;

Usage Examples:

const buf = Buffer.from([0x48, 0x65, 0x6c, 0x6c, 0x6f]); // "Hello"

// Default UTF-8 conversion
const str1 = buf.toString(); // "Hello"

// Specific encoding
const str2 = buf.toString('ascii'); // "Hello"

// Partial conversion
const str3 = buf.toString('utf8', 0, 3); // "Hel"

// Hex representation
const hexBuf = Buffer.from('Hello');
const hex = hexBuf.toString('hex'); // "48656c6c6f"

// Base64 representation
const base64 = hexBuf.toString('base64'); // "SGVsbG8="
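
The start and end arguments are byte offsets, not character positions, so a range can end partway through a multi-byte UTF-8 sequence. The following is a minimal sketch of that case, assuming the global Buffer available in Node.js (elsewhere it would be imported from the buffer package); the variable names are illustrative only.

```javascript
// Assumption: Buffer is available as a global, as it is in Node.js.
const cafe = Buffer.from('café'); // bytes: 63 61 66 c3 a9

// These ranges end on character boundaries and decode cleanly.
const whole = cafe.toString('utf8', 0, 5);  // "café"
const prefix = cafe.toString('utf8', 0, 3); // "caf"

// This range ends after the first byte of "é", so the dangling
// byte decodes to the replacement character U+FFFD.
const cut = cafe.toString('utf8', 0, 4);    // "caf\uFFFD"

console.log(whole, prefix, JSON.stringify(cut));
```

When slicing text by byte offset, it is usually safer to pick offsets that are known to fall on character boundaries.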

String Writing

Write strings to buffer with encoding support.

/**
 * Write string to buffer at specified offset
 * @param string - String to write
 * @param offset - Byte offset to start writing (default: 0)
 * @param length - Maximum bytes to write (default: remaining buffer)
 * @param encoding - Character encoding (default: 'utf8')
 * @returns Number of bytes written
 */
write(string: string, offset?: number, length?: number, encoding?: string): number;

Usage Examples:

const buf = Buffer.alloc(20);

// Basic string writing
const bytesWritten1 = buf.write('Hello'); // Writes "Hello" at offset 0

// Writing with offset
const bytesWritten2 = buf.write(' World', 5); // Writes " World" at offset 5

// Writing with length limit
const buf2 = Buffer.alloc(3);
const bytesWritten3 = buf2.write('Hello', 0, 3); // Only writes "Hel"

// Writing with encoding
const buf3 = Buffer.alloc(10);
const bytesWritten4 = buf3.write('48656c6c6f', 0, 10, 'hex'); // Writes decoded hex

// Different parameter combinations
buf.write('test');                    // write(string)
buf.write('test', 'utf8');           // write(string, encoding)
buf.write('test', 0);                // write(string, offset)
buf.write('test', 0, 4);             // write(string, offset, length)  
buf.write('test', 0, 4, 'utf8');     // write(string, offset, length, encoding)
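
Because write() returns the number of bytes actually written, the return value can be compared with Buffer.byteLength() to detect truncation. The sketch below uses a hypothetical helper, writeFully, which is not part of the Buffer API, and assumes the Node.js behaviour of never writing a partially encoded character.

```javascript
// Hypothetical helper (not part of the Buffer API): writes text and
// reports whether the whole string fit into the target buffer.
function writeFully(target, text, offset = 0, encoding = 'utf8') {
  const needed = Buffer.byteLength(text, encoding);
  const written = target.write(text, offset, target.length - offset, encoding);
  return { needed, written, complete: written === needed };
}

// "café" needs 5 bytes in UTF-8, but only 4 are available. The 2-byte
// "é" does not fit in the last remaining byte, so only "caf" is written.
const small = Buffer.alloc(4);
const result = writeFully(small, 'café');

console.log(result); // { needed: 5, written: 3, complete: false }
```

Checking the result this way avoids silently storing a shortened string.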

Encoding Utilities

Utility functions for working with character encodings and byte length calculations.

/**
 * Check if encoding is supported
 * @param encoding - Encoding name to check
 * @returns True if encoding is supported
 */
Buffer.isEncoding(encoding: string): boolean;

/**
 * Get byte length of string when encoded
 * @param string - String to measure
 * @param encoding - Character encoding (default: 'utf8')
 * @returns Number of bytes required
 */
Buffer.byteLength(string: string, encoding?: string): number;

Usage Examples:

// Check encoding support
console.log(Buffer.isEncoding('utf8'));    // true
console.log(Buffer.isEncoding('ascii'));   // true
console.log(Buffer.isEncoding('invalid')); // false

// Calculate byte lengths
console.log(Buffer.byteLength('Hello'));           // 5 (UTF-8)
console.log(Buffer.byteLength('Hello', 'ascii'));  // 5 (ASCII)
console.log(Buffer.byteLength('🚀'));              // 4 (UTF-8 emoji)
console.log(Buffer.byteLength('FF', 'hex'));       // 1 (hex decoding)

// Unicode examples
console.log(Buffer.byteLength('café'));            // 5 (é is 2 bytes in UTF-8)
console.log(Buffer.byteLength('café', 'ascii'));   // 4 (é becomes single byte)
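
The two utilities combine naturally: validate the encoding name first, then measure. The sketch below uses a hypothetical wrapper, encodedSize, and also shows that a string's length property (UTF-16 code units) differs from its encoded byte length.

```javascript
// Hypothetical wrapper (not part of the Buffer API): rejects unknown
// encoding names before measuring the encoded size.
function encodedSize(text, encoding = 'utf8') {
  if (!Buffer.isEncoding(encoding)) {
    throw new TypeError(`Unsupported encoding: ${encoding}`);
  }
  return Buffer.byteLength(text, encoding);
}

const sample = 'Hello 🌍';
const sizes = {
  codeUnits: sample.length,               // 8: the emoji is two UTF-16 code units
  utf8: encodedSize(sample),              // 10: the emoji is four UTF-8 bytes
  utf16le: encodedSize(sample, 'utf16le') // 16: two bytes per code unit
};

console.log(sizes);
```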

JSON Serialization

Convert buffer to JSON format for serialization.

/**
 * Convert buffer to JSON representation
 * @returns Object with type and data properties
 */
toJSON(): { type: 'Buffer', data: number[] };

Usage Examples:

const buf = Buffer.from('Hello');
const json = buf.toJSON();
// Result: { type: 'Buffer', data: [72, 101, 108, 108, 111] }

// Serialization roundtrip
const jsonString = JSON.stringify(buf);
const parsed = JSON.parse(jsonString);
const restored = Buffer.from(parsed.data);
console.log(restored.toString()); // "Hello"
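
When buffers are nested inside larger objects, restoring each one by hand gets tedious. One possible approach, sketched below, is a JSON.parse reviver that recognises the { type: 'Buffer', data: [...] } shape. The function name reviveBuffers and the sample message are assumptions made for illustration.

```javascript
// Hypothetical reviver: turns any { type: 'Buffer', data: [...] }
// object produced by toJSON() back into a Buffer.
function reviveBuffers(key, value) {
  if (value && value.type === 'Buffer' && Array.isArray(value.data)) {
    return Buffer.from(value.data);
  }
  return value;
}

// JSON.stringify calls toJSON() on the nested buffer automatically.
const message = { id: 7, payload: Buffer.from('Hello') };
const wire = JSON.stringify(message);
const decoded = JSON.parse(wire, reviveBuffers);

console.log(Buffer.isBuffer(decoded.payload)); // true
console.log(decoded.payload.toString());       // "Hello"
```

Note that the JSON form stores one array element per byte, so it is considerably larger than a base64 string of the same data.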

Supported Encodings

Standard Text Encodings

// ASCII encoding (7-bit, characters 0-127)
'ascii'

// UTF-8 encoding (variable width, 1-4 bytes per character)
'utf8'
'utf-8'

// Latin1/ISO-8859-1 encoding (8-bit, characters 0-255)
'latin1'
'binary' // Alias for latin1

// UTF-16 Little Endian (2 or 4 bytes per character)
'utf16le'
'utf-16le'
'ucs2'    // Alias for utf16le
'ucs-2'   // Alias for utf16le

Binary Encodings

// Hexadecimal encoding (2 hex digits per byte)
'hex'

// Base64 encoding (4 characters per 3 bytes)
'base64'

Encoding Examples:

const text = 'Hello 🌍';

// UTF-8 (default)
const utf8Buf = Buffer.from(text, 'utf8');
console.log(utf8Buf.length); // 10 bytes (emoji is 4 bytes)

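// ASCII (lossy: each UTF-16 code unit is truncated to its low byte; nothing is replaced with "?")
const asciiBuf = Buffer.from(text, 'ascii');
console.log(asciiBuf.toString('ascii')); // "Hello <\r" (the emoji is mangled into two unrelated bytes)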

// Hex encoding
const hexBuf = Buffer.from('48656c6c6f', 'hex');
console.log(hexBuf.toString()); // "Hello"

// Base64 encoding
const base64Buf = Buffer.from('SGVsbG8=', 'base64');
console.log(base64Buf.toString()); // "Hello"

// UTF-16 Little Endian
const utf16Buf = Buffer.from('Hello', 'utf16le');
console.log(utf16Buf); // <Buffer 48 00 65 00 6c 00 6c 00 6f 00>
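
Since every encoding decodes to the same underlying bytes, a buffer can act as the pivot for converting between string representations. The sketch below uses a hypothetical helper, reencode, which is not part of the Buffer API.

```javascript
// Hypothetical helper: decode text from one encoding into bytes,
// then render those bytes in another encoding.
function reencode(text, fromEncoding, toEncoding) {
  return Buffer.from(text, fromEncoding).toString(toEncoding);
}

const asBase64 = reencode('48656c6c6f', 'hex', 'base64'); // "SGVsbG8="
const asHex = reencode('SGVsbG8=', 'base64', 'hex');      // "48656c6c6f"
const asText = reencode('48656c6c6f', 'hex', 'utf8');     // "Hello"

console.log(asBase64, asHex, asText);
```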

Encoding Behavior Details

UTF-8 Handling

UTF-8 is the default encoding and covers the full Unicode range:

// Multi-byte characters
const buf1 = Buffer.from('café'); // [99, 97, 102, 195, 169]
const buf2 = Buffer.from('🚀');   // [240, 159, 154, 128]

// Partial UTF-8 sequences
const partialBuf = Buffer.from([0xC3]); // Incomplete UTF-8
console.log(partialBuf.toString()); // Replacement character �
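
Because invalid sequences decode to the replacement character instead of throwing, bad input can pass unnoticed. One way to detect it, sketched below under the assumption that decoding follows the replacement behaviour shown above, is a round trip: decode, re-encode, and compare bytes. The helper name looksLikeValidUtf8 is hypothetical.

```javascript
// Hypothetical check: valid UTF-8 survives a decode/encode round trip
// unchanged, while invalid bytes come back as EF BF BD (U+FFFD).
function looksLikeValidUtf8(bytes) {
  const roundTripped = Buffer.from(bytes.toString('utf8'), 'utf8');
  return roundTripped.equals(bytes);
}

const validBytes = Buffer.from('café 🚀');
const truncatedBytes = Buffer.from([0x63, 0x61, 0x66, 0xc3]); // "caf" plus half of "é"

console.log(looksLikeValidUtf8(validBytes));     // true
console.log(looksLikeValidUtf8(truncatedBytes)); // false
```

This costs an extra allocation per check, so it suits validation at input boundaries more than hot loops.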

Hex Encoding

Hexadecimal encoding converts each byte to two hex digits:

// String to hex
const buf = Buffer.from('Hello');
console.log(buf.toString('hex')); // "48656c6c6f"

// Hex to buffer
const hexBuf = Buffer.from('48656c6c6f', 'hex');
console.log(hexBuf.toString()); // "Hello"

// Decoding stops at the first invalid hex pair; the rest of the input is silently dropped
const invalidHex = Buffer.from('48656c6c6g', 'hex');
console.log(invalidHex.toString()); // "Hell" (stops at invalid 'g')
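
Silent truncation can hide corrupted input, so callers that need strictness may want to validate before decoding. The following is a minimal sketch; parseHexStrict is a hypothetical helper, not part of the Buffer API.

```javascript
// Hypothetical strict parser: rejects odd-length input and any
// non-hex character instead of silently truncating.
function parseHexStrict(hex) {
  if (hex.length % 2 !== 0 || !/^[0-9a-fA-F]*$/.test(hex)) {
    throw new TypeError('Invalid hex string');
  }
  return Buffer.from(hex, 'hex');
}

const strictOk = parseHexStrict('48656c6c6f').toString(); // "Hello"

let strictError = null;
try {
  parseHexStrict('48656c6c6g'); // contains the invalid character "g"
} catch (error) {
  strictError = error;
}

console.log(strictOk, strictError instanceof TypeError); // Hello true
```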

Base64 Encoding

Base64 output is padded automatically, and missing padding is tolerated on input:

// String to base64
const buf = Buffer.from('Hello World');
console.log(buf.toString('base64')); // "SGVsbG8gV29ybGQ="

// Base64 to buffer
const base64Buf = Buffer.from('SGVsbG8gV29ybGQ=', 'base64');
console.log(base64Buf.toString()); // "Hello World"

// Missing padding is handled automatically
const noPadding = Buffer.from('SGVsbG8', 'base64');
console.log(noPadding.toString()); // "Hello"
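
Padded base64 emits four characters for every three input bytes, rounding up, so the encoded size can be predicted without encoding. The sketch below checks that arithmetic; base64Length is a hypothetical helper name.

```javascript
// Hypothetical helper: length of padded base64 output for a given
// number of input bytes (4 characters per 3-byte group, rounded up).
function base64Length(byteCount) {
  return Math.ceil(byteCount / 3) * 4;
}

const input = Buffer.from('Hello World'); // 11 bytes
const encoded = input.toString('base64'); // "SGVsbG8gV29ybGQ="

console.log(encoded.length, base64Length(input.length)); // 16 16
```

This is roughly a one-third size increase, which is the overhead to budget for when sending binary data as base64 text.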

Error Handling

String and encoding operations may encounter the following errors:

  • TypeError: When encoding parameter is not a string
  • TypeError: When string parameter is not a string
  • RangeError: When offset/length parameters are out of bounds
  • TypeError: For unknown/unsupported encodings

const buf = Buffer.alloc(10);

try {
  buf.write('Hello', -1); // Throws RangeError
} catch (error) {
  console.error('Write error:', error.message);
}

try {
  Buffer.from('Hello', 'invalid-encoding'); // Throws TypeError
} catch (error) {
  console.error('Encoding error:', error.message);
}
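
Callers that want to react differently to each failure can branch on the error class. The sketch below wraps write() in a hypothetical helper, tryWrite, that returns a result object instead of throwing; the reason strings are made up for illustration.

```javascript
// Hypothetical wrapper: converts the documented error classes into
// a plain result object and rethrows anything unexpected.
function tryWrite(target, text, offset) {
  try {
    return { ok: true, written: target.write(text, offset) };
  } catch (error) {
    if (error instanceof RangeError) {
      return { ok: false, reason: 'out-of-bounds' };
    }
    if (error instanceof TypeError) {
      return { ok: false, reason: 'bad-argument' };
    }
    throw error;
  }
}

const target = Buffer.alloc(10);
const good = tryWrite(target, 'Hello', 0); // { ok: true, written: 5 }
const bad = tryWrite(target, 'Hello', -1); // { ok: false, reason: 'out-of-bounds' }

console.log(good, bad);
```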

Performance Considerations

  • UTF-8: Default encoding, well-optimized
  • ASCII: Fastest for 7-bit text
  • Hex: Good for debugging and protocols
  • Base64: Standard for data encoding, moderate overhead
  • Binary/Latin1: Fast for 8-bit data that isn't text

// Performance tip: Pre-calculate byte length for large strings
const largeString = 'very long string...';
const byteLength = Buffer.byteLength(largeString);
const buf = Buffer.allocUnsafe(byteLength);
buf.write(largeString);
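
The same pre-calculation idea extends to several strings: sum their byte lengths, allocate once, and write each string at a running offset. The sketch below is one possible shape for this; packStrings is a hypothetical helper, and no performance figures are claimed for it.

```javascript
// Hypothetical helper: encodes several strings into one buffer
// using a single allocation sized from Buffer.byteLength().
function packStrings(parts, encoding = 'utf8') {
  const total = parts.reduce(
    (sum, part) => sum + Buffer.byteLength(part, encoding),
    0
  );
  // allocUnsafe is acceptable here because every byte is overwritten below.
  const out = Buffer.allocUnsafe(total);
  let offset = 0;
  for (const part of parts) {
    offset += out.write(part, offset, total - offset, encoding);
  }
  return out;
}

const packed = packStrings(['café', ' ', '🚀']);

console.log(packed.length);     // 10 (5 + 1 + 4 bytes)
console.log(packed.toString()); // "café 🚀"
```

Buffer.allocUnsafe() returns uninitialized memory, so this pattern is only safe when the whole buffer is filled before it is read or shared.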