Describes
mavenpkg:maven/org.apache.spark/spark-unsafe_2.13@3.5.x


tessl/maven-org-apache-spark--spark-unsafe_2-13

tessl install tessl/maven-org-apache-spark--spark-unsafe_2-13@3.5.0

Low-level unsafe operations and optimized data structures for Apache Spark's internal memory management and performance-critical operations.


Byte Array Utilities

Static utility methods on org.apache.spark.unsafe.types.ByteArray for copying byte arrays into memory, computing sort prefixes, binary comparison, SQL-style substring extraction, concatenation, and padding. The class is intended for Spark's internal use and relies on Platform for low-level memory access.

Capabilities

Memory Operations

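Copy the contents of a byte array to a memory location identified by a base object and an offset. The destination must already be allocated and large enough to hold every byte of the source.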

/**
 * Writes the content of a byte array into a memory address
 * @param src the source byte array to copy from
 * @param target the base object of the destination, or null when targetOffset is an absolute off-heap address
 * @param targetOffset the offset within target (or the absolute address when target is null) where writing starts
 */
public static void writeToMemory(byte[] src, Object target, long targetOffset);

Usage Example:

import org.apache.spark.unsafe.types.ByteArray;
import org.apache.spark.unsafe.Platform;

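byte[] data = "Hello World".getBytes();
// Allocate off-heap memory large enough for the whole array
long address = Platform.allocateMemory(data.length);
try {
    // A null base object means the offset is an absolute address
    ByteArray.writeToMemory(data, null, address);
} finally {
    // Off-heap memory is not garbage collected, so always free it
    Platform.freeMemory(address);
}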

Prefix Generation

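Compute a 64-bit prefix from the leading bytes of an array. Comparing two prefixes as unsigned longs often decides the sort order without reading the full arrays; equal prefixes are inconclusive and require a full comparison.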

/**
 * Returns a 64-bit integer that can be used as the prefix for sorting
 * @param bytes the byte array to generate prefix from
 * @return 64-bit prefix value optimized for sorting
 */
public static long getPrefix(byte[] bytes);

Usage Example:

byte[] data1 = "apple".getBytes();
byte[] data2 = "banana".getBytes();

long prefix1 = ByteArray.getPrefix(data1);
long prefix2 = ByteArray.getPrefix(data2);

int comparison;
if (prefix1 != prefix2) {
    // Differing prefixes decide the order; compare them as unsigned values
    comparison = Long.compareUnsigned(prefix1, prefix2);
} else {
    // Equal prefixes are inconclusive, so fall back to a full comparison
    comparison = ByteArray.compareBinary(data1, data2);
}
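To illustrate why prefixes are compared as unsigned values, here is a minimal plain-Java sketch that packs up to eight leading bytes into a long, most significant byte first. PrefixSketch and prefixOf are hypothetical names introduced for this illustration. This is not Spark's implementation, and its output is not guaranteed to match ByteArray.getPrefix bit for bit.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; PrefixSketch and prefixOf are hypothetical names,
// not part of Spark.
final class PrefixSketch {

    /** Packs the first (up to) 8 bytes into a long, first byte most significant. */
    static long prefixOf(byte[] bytes) {
        long prefix = 0L;
        int n = Math.min(bytes.length, 8);
        for (int i = 0; i < n; i++) {
            // Mask to an unsigned value, then shift into its big-endian slot.
            prefix |= (bytes[i] & 0xffL) << (56 - 8 * i);
        }
        return prefix; // missing trailing bytes stay zero
    }

    public static void main(String[] args) {
        long apple = prefixOf("apple".getBytes(StandardCharsets.UTF_8));
        long banana = prefixOf("banana".getBytes(StandardCharsets.UTF_8));
        System.out.println(Long.compareUnsigned(apple, banana) < 0); // true

        // A leading byte of 0xff makes the long negative as a signed value,
        // so a signed comparison would order it before 0x01.
        long high = prefixOf(new byte[] {(byte) 0xff});
        long low = prefixOf(new byte[] {0x01});
        System.out.println(Long.compare(high, low) < 0);         // true (wrong order)
        System.out.println(Long.compareUnsigned(high, low) > 0); // true (byte order)
    }
}
```

Arrays that share their first eight bytes produce the same prefix in this sketch, which is why a full comparison is still needed when prefixes are equal.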

Binary Comparison

Lexicographic comparison of two byte arrays. Bytes are compared as unsigned values, and when one array is a prefix of the other, the shorter array sorts first.

/**
 * Compares two byte arrays lexicographically using optimized binary comparison
 * @param leftBase the first byte array to compare
 * @param rightBase the second byte array to compare
 * @return negative integer if left < right, zero if equal, positive if left > right
 */
public static int compareBinary(byte[] leftBase, byte[] rightBase);
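The return contract above can be illustrated with a plain-Java reference comparison. This is a minimal sketch, assuming bytes are ordered by their unsigned values; BinaryCompareSketch and compareUnsignedLex are hypothetical names, and the code is not Spark's implementation.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; BinaryCompareSketch and compareUnsignedLex are
// hypothetical names, not part of Spark.
final class BinaryCompareSketch {

    /** Compares two byte arrays byte by byte, treating each byte as 0..255. */
    static int compareUnsignedLex(byte[] left, byte[] right) {
        int common = Math.min(left.length, right.length);
        for (int i = 0; i < common; i++) {
            int l = left[i] & 0xff; // mask to get the unsigned value
            int r = right[i] & 0xff;
            if (l != r) {
                return l - r;
            }
        }
        // All shared bytes are equal, so the shorter array sorts first.
        return left.length - right.length;
    }

    public static void main(String[] args) {
        byte[] apple = "apple".getBytes(StandardCharsets.UTF_8);
        byte[] banana = "banana".getBytes(StandardCharsets.UTF_8);
        byte[] high = {(byte) 0x80}; // -128 as a signed byte, 128 unsigned
        byte[] low = {0x7f};

        System.out.println(compareUnsignedLex(apple, banana) < 0); // true
        System.out.println(compareUnsignedLex(high, low) > 0);     // true
        System.out.println(compareUnsignedLex(apple, apple) == 0); // true
    }
}
```

Only the sign of the result is meaningful; callers should not rely on its magnitude.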

String Manipulation

SQL-compatible substring extraction with position-based indexing.

/**
 * Extracts a substring from a byte array using SQL-compatible semantics
 * @param bytes the source byte array
 * @param pos the starting position (1-based for positive, end-relative for negative)
 * @param len the maximum length of the substring
 * @return new byte array containing the extracted substring
 */
public static byte[] subStringSQL(byte[] bytes, int pos, int len);

Usage Examples:

byte[] text = "Hello World".getBytes();

// Extract substring starting from position 2, length 5
byte[] result1 = ByteArray.subStringSQL(text, 2, 5); // "ello "

// Extract from end using negative position
byte[] result2 = ByteArray.subStringSQL(text, -5, 5); // "World"

// Convert back to string
String str1 = new String(result1); // "ello "
String str2 = new String(result2); // "World"
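The position rules described above (1-based positive positions, end-relative negative positions, length clamped to the available bytes) can be sketched in plain Java as follows. SubstringSqlSketch and substringSql are hypothetical names; the edge-case handling reflects one reading of the documented semantics and is not guaranteed to match Spark in every corner case.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; SubstringSqlSketch and substringSql are
// hypothetical names, not part of Spark.
final class SubstringSqlSketch {

    static byte[] substringSql(byte[] bytes, int pos, int len) {
        if (pos > bytes.length) {
            return new byte[0]; // start lies beyond the end of the input
        }
        int start = 0;                  // pos == 0 is treated like the first byte
        if (pos > 0) {
            start = pos - 1;            // convert 1-based position to 0-based index
        } else if (pos < 0) {
            start = bytes.length + pos; // count back from the end
        }
        // Clamp the end to the input length (written this way to avoid overflow).
        int end = (bytes.length - start) < len ? bytes.length : start + len;
        start = Math.max(start, 0);     // a negative pos may reach before the start
        if (start >= end) {
            return new byte[0];
        }
        return Arrays.copyOfRange(bytes, start, end);
    }

    public static void main(String[] args) {
        byte[] text = "Hello World".getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(substringSql(text, 2, 5), StandardCharsets.UTF_8));   // "ello "
        System.out.println(new String(substringSql(text, -5, 5), StandardCharsets.UTF_8));  // "World"
        System.out.println(new String(substringSql(text, 7, 100), StandardCharsets.UTF_8)); // "World"
    }
}
```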

Array Concatenation

Concatenate any number of byte arrays into one newly allocated array. If any input is null, the result is null.

/**
 * Concatenates multiple byte arrays into a single array
 * @param inputs variable number of byte arrays to concatenate
 * @return new byte array containing all input arrays concatenated, or null if any input is null
 */
public static byte[] concat(byte[]... inputs);

Usage Example:

byte[] part1 = "Hello".getBytes();
byte[] part2 = " ".getBytes();
byte[] part3 = "World".getBytes();

byte[] result = ByteArray.concat(part1, part2, part3);
String combined = new String(result); // "Hello World"
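The null rule stated in the method's return description can be sketched in plain Java. ConcatSketch and concatOrNull are hypothetical names used only for this illustration; the code is not Spark's implementation.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; ConcatSketch and concatOrNull are hypothetical
// names, not part of Spark.
final class ConcatSketch {

    static byte[] concatOrNull(byte[]... inputs) {
        long total = 0;
        for (byte[] input : inputs) {
            if (input == null) {
                return null; // a single null input makes the whole result null
            }
            total += input.length;
        }
        // Fails with ArithmeticException if the total does not fit in an int.
        byte[] result = new byte[Math.toIntExact(total)];
        int offset = 0;
        for (byte[] input : inputs) {
            System.arraycopy(input, 0, result, offset, input.length);
            offset += input.length;
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] hello = "Hello".getBytes(StandardCharsets.UTF_8);
        byte[] world = " World".getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(concatOrNull(hello, world), StandardCharsets.UTF_8)); // "Hello World"
        System.out.println(concatOrNull(hello, null) == null); // true
    }
}
```

Because a null result is possible, callers that pass nullable inputs should check the return value before using it.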

Padding Operations

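Pad a byte array on the left or right to a target length by repeating a padding pattern. Input that is already longer than the target length is cut to its first len bytes.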

/**
 * Left-pads a byte array to the specified length using a padding pattern
 * @param bytes the input byte array to pad
 * @param len the desired total length of the result
 * @param pad the padding pattern to use for filling
 * @return new byte array with left padding applied
 */
public static byte[] lpad(byte[] bytes, int len, byte[] pad);

/**
 * Right-pads a byte array to the specified length using a padding pattern
 * @param bytes the input byte array to pad  
 * @param len the desired total length of the result
 * @param pad the padding pattern to use for filling
 * @return new byte array with right padding applied
 */
public static byte[] rpad(byte[] bytes, int len, byte[] pad);

Usage Examples:

byte[] text = "abc".getBytes();
byte[] pattern = "xy".getBytes();

// Left pad to length 7 with "xy" pattern
byte[] leftPadded = ByteArray.lpad(text, 7, pattern);
// Result: "xyxyabc"

// Right pad to length 7 with "xy" pattern  
byte[] rightPadded = ByteArray.rpad(text, 7, pattern);
// Result: "abcxyxy"

// Convert results to strings
String left = new String(leftPadded);   // "xyxyabc"
String right = new String(rightPadded); // "abcxyxy"

Special Cases

// Empty pattern: there is nothing to pad with, so the input is cut to len bytes, or copied unchanged if it is shorter
byte[] truncated = ByteArray.lpad("toolong".getBytes(), 3, new byte[0]);
// Result: "too"

// Zero length padding
byte[] empty = ByteArray.lpad("test".getBytes(), 0, "x".getBytes());
// Result: empty byte array
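The padding rules and special cases above can be sketched in plain Java. PadSketch and its methods are hypothetical names for this illustration and are not Spark's implementation; treating a negative length like zero is a choice made by this sketch, not something the source documents.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; PadSketch is a hypothetical name, not part of Spark.
final class PadSketch {

    static byte[] lpad(byte[] bytes, int len, byte[] pad) {
        if (bytes == null || pad == null) return null;
        if (len <= 0) return new byte[0]; // negative len handled like zero (sketch choice)
        if (pad.length == 0) return Arrays.copyOf(bytes, Math.min(bytes.length, len));
        byte[] result = new byte[len];
        int keep = Math.min(len, bytes.length); // bytes of the input that fit
        int fill = len - keep;                  // bytes supplied by the pattern
        for (int i = 0; i < fill; i++) {
            result[i] = pad[i % pad.length];    // repeat the pattern from its start
        }
        System.arraycopy(bytes, 0, result, fill, keep);
        return result;
    }

    static byte[] rpad(byte[] bytes, int len, byte[] pad) {
        if (bytes == null || pad == null) return null;
        if (len <= 0) return new byte[0];
        if (pad.length == 0) return Arrays.copyOf(bytes, Math.min(bytes.length, len));
        byte[] result = Arrays.copyOf(bytes, len); // truncates or leaves zeroed room
        for (int i = bytes.length; i < len; i++) {
            result[i] = pad[(i - bytes.length) % pad.length];
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] text = "abc".getBytes(StandardCharsets.UTF_8);
        byte[] pattern = "xy".getBytes(StandardCharsets.UTF_8);
        System.out.println(new String(lpad(text, 7, pattern), StandardCharsets.UTF_8)); // "xyxyabc"
        System.out.println(new String(rpad(text, 7, pattern), StandardCharsets.UTF_8)); // "abcxyxy"
    }
}
```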

Constants

/**
 * Shared empty byte array constant to avoid repeated allocations
 */
public static final byte[] EMPTY_BYTE;

Architecture Notes

  • Platform Optimized: Uses word-aligned operations for maximum performance
  • Endianness Aware: Handles both little-endian and big-endian architectures
  • Memory Efficient: Uses direct memory copying to minimize allocations
  • SQL Compatible: String operations follow SQL semantics for compatibility
  • Null Safe: concat returns null if any input is null; lpad and rpad return null if the input or the padding pattern is null

Import Requirements

import org.apache.spark.unsafe.types.ByteArray;
import org.apache.spark.unsafe.Platform; // Required for memory operations