Utility Functions

Numerical helpers for linear algebra and statistical computations: a machine-epsilon constant, conversions for packed upper-triangular matrices, a stable log(1 + exp(x)), and a stable softmax. Each is written to avoid overflow or loss of precision in floating-point arithmetic.

Capabilities

Mathematical Constants

Machine precision and numerical stability constants.

object Utils {
  /**
   * Machine epsilon value for numerical tolerance calculations
   * Computed by halving from 1.0 until (1.0 + epsilon / 2) == 1.0
   * @return Machine epsilon for Double precision
   */
  lazy val EPSILON: Double
}

Matrix Utilities

Utilities for working with packed triangular matrix storage formats.

object Utils {
  /**
   * Convert upper triangular packed matrix to full symmetric matrix
   * @param n Order of the n x n matrix
   * @param triangularValues Upper triangular part in packed array (column major)
   * @return Dense matrix representing the full symmetric matrix (column major)
   */
  def unpackUpperTriangular(n: Int, triangularValues: Array[Double]): Array[Double]
  
  /**
   * Get index in packed upper triangular matrix format
   * @param n Order of the n x n matrix
   * @param i Row index (0-based)
   * @param j Column index (0-based)
   * @return Index in packed triangular array
   */
  def indexUpperTriangular(n: Int, i: Int, j: Int): Int
}

Usage Examples:

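// Note: Utils is declared private[spark], so this import compiles only from code in an org.apache.spark package
import org.apache.spark.ml.impl.Utils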
import org.apache.spark.ml.linalg._

// Machine epsilon for numerical comparisons
val someValue = 1e-14 // placeholder for a computed quantity
val tolerance = Utils.EPSILON * 1000
val isZero = math.abs(someValue) < tolerance

// Working with packed triangular matrices
val n = 3
// Upper triangle in column-major order: (0,0), (0,1), (1,1), (0,2), (1,2), (2,2)
val packedMatrix = Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)

// Convert to full symmetric matrix
val fullMatrix = Utils.unpackUpperTriangular(n, packedMatrix)
val matrix = new DenseMatrix(n, n, fullMatrix)

// Access specific element in packed format
val index = Utils.indexUpperTriangular(n, 1, 2) // Element (1,2): 2 * 3 / 2 + 1 = 4
val value = packedMatrix(index) // 5.0

Numerical Stability Functions

Functions for numerically stable mathematical operations.

object Utils {
  /**
   * Numerically stable computation of log(1 + exp(x))
   * Prevents arithmetic overflow for large positive x values
   * @param x Input value
   * @return log(1 + exp(x)) computed in numerically stable way
   */
  def log1pExp(x: Double): Double
}

Usage Examples:

import org.apache.spark.ml.impl.Utils

// Safe computation avoiding overflow
val largeX = 800.0
val result = Utils.log1pExp(largeX) // Naive math.log(1 + math.exp(largeX)) overflows to Infinity

val smallX = -10.0
val result2 = Utils.log1pExp(smallX) // Handles negative values correctly

Softmax Operations

Numerically stable softmax computations for probability distributions.

object Utils {
  /**
   * Perform in-place softmax conversion on array
   * @param array Array to convert (modified in-place)
   */
  def softmax(array: Array[Double]): Unit
  
  /**
   * Perform softmax conversion with flexible indexing
   * @param input Input array
   * @param n Number of elements to process
   * @param offset Starting offset in input array
   * @param step Step size between elements
   * @param output Output array for results
   */
  def softmax(
    input: Array[Double],
    n: Int,
    offset: Int,
    step: Int,
    output: Array[Double]
  ): Unit
}

Usage Examples:

import org.apache.spark.ml.impl.Utils

// Simple in-place softmax
val logits = Array(2.0, 1.0, 0.1)
Utils.softmax(logits) // logits now contains probabilities that sum to 1.0

// Advanced softmax with custom indexing
val input = Array(1.0, 5.0, 2.0, 3.0, 1.0, 2.0)
val output = Array.ofDim[Double](6)

// Process the elements at indices 1, 3, and 5
Utils.softmax(
  input = input,
  n = 3,           // Process 3 elements
  offset = 1,      // Start at index 1
  step = 2,        // Advance two positions between elements
  output = output
)
// output(1), output(3), output(5) contain softmax probabilities that sum to 1.0

Implementation Details

Numerical Stability

The utility functions implement several numerical stability techniques:

  1. Machine Epsilon: Computed at runtime by repeated halving; for IEEE 754 doubles, as used on the JVM, the result is 2^-52 (about 2.22e-16)
  2. Overflow Prevention: log1pExp uses conditional logic to prevent arithmetic overflow
  3. Softmax Stability: Subtracts maximum value before exponentiation to prevent overflow
  4. Infinity Handling: Softmax special-cases positive infinity inputs, which would otherwise produce NaN from ∞ − ∞ after the maximum is subtracted
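
The max-subtraction technique in item 3 can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the library's code: the object and method names are made up for this example, the parameters mirror the strided signature documented above, it assumes n ≥ 1 and step ≥ 1, and it omits the positive-infinity special case from item 4.

```scala
object SoftmaxSketch {
  /** Strided softmax sketch: reads input(offset + k * step) for k in [0, n)
    * and writes the result to the same positions of output.
    * Assumption: all processed inputs are finite. */
  def softmax(
      input: Array[Double],
      n: Int,
      offset: Int,
      step: Int,
      output: Array[Double]): Unit = {
    val end = offset + step * n

    // Pass 1: find the maximum of the processed elements.
    var maxValue = Double.NegativeInfinity
    var i = offset
    while (i < end) {
      if (input(i) > maxValue) maxValue = input(i)
      i += step
    }

    // Pass 2: exponentiate shifted values; every exponent is <= 0,
    // so math.exp cannot overflow.
    var sum = 0.0
    i = offset
    while (i < end) {
      val e = math.exp(input(i) - maxValue)
      output(i) = e
      sum += e
      i += step
    }

    // Pass 3: normalize so the processed entries sum to 1.
    i = offset
    while (i < end) {
      output(i) /= sum
      i += step
    }
  }
}
```

Because each input is read before the matching output slot is written, passing the same array as both input and output gives the in-place behaviour. Shifting by the maximum does not change the result mathematically, since the common factor exp(-max) cancels between numerator and denominator.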

Memory Efficiency

  • In-place Operations: Softmax can operate in-place to minimize memory allocation
  • Flexible Indexing: Advanced softmax supports strided access patterns for efficient memory usage
  • Lazy Evaluation: EPSILON is computed only once using lazy initialization
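
The lazy-initialization point can be illustrated with a small sketch. The object, the counter, and the `cached` value are hypothetical stand-ins for this example only; they show the general behaviour of a Scala `lazy val`, not the library's internals.

```scala
object LazyInitSketch {
  // Counts how many times the initializer block below runs.
  var evaluations = 0

  // Hypothetical stand-in for EPSILON: the block runs on first access,
  // and later reads return the stored value.
  lazy val cached: Double = {
    evaluations += 1
    math.ulp(1.0)
  }
}
```

Reading `LazyInitSketch.cached` twice leaves `evaluations` at 1, so the cost of the computation is paid once and only if the value is actually used.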

Error Handling

// Invalid arguments surface as exceptions:

// Index validation in triangular matrix operations
Utils.indexUpperTriangular(-1, 0, 0) // throws IllegalArgumentException
Utils.indexUpperTriangular(3, 5, 0)  // throws IllegalArgumentException

// Softmax reads n strided elements; here the input holds only one,
// so the read runs past the end of the array
val tooSmall = Array(1.0)
Utils.softmax(tooSmall, 5, 0, 1, Array.ofDim[Double](5)) // likely ArrayIndexOutOfBoundsException
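
A caller who prefers a clear failure over an out-of-bounds read could check the strided range before calling softmax. The helper below is a hypothetical sketch, not part of Utils; its name and parameters are assumptions chosen to mirror the softmax signature.

```scala
object StridedBoundsSketch {
  /** Hypothetical pre-check: true if n elements, starting at offset and
    * spaced step apart, all fall inside an array of the given length. */
  def fits(length: Int, n: Int, offset: Int, step: Int): Boolean = {
    if (n < 0 || offset < 0 || step < 1) false
    else if (n == 0) true
    // Index of the last element touched; Long arithmetic avoids Int overflow.
    else offset.toLong + (n - 1).toLong * step < length
  }
}
```

For the failing call above, `StridedBoundsSketch.fits(1, 5, 0, 1)` is false, while the earlier strided example (length 6, n = 3, offset = 1, step = 2) passes the check. The same check would need to be applied to the output array as well.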

Integration with Other Components

These utilities are used internally throughout Spark MLlib:

  • EPSILON: Used in MultivariateGaussian for singular value tolerance calculations
  • Triangular Operations: Used in symmetric matrix operations and BLAS routines
  • log1pExp: Used in logistic regression and other probabilistic models
  • Softmax: Used in neural networks and multi-class classification algorithms

Mathematical Background

Machine Epsilon

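Machine epsilon (ε) is the gap between 1.0 and the next larger representable floating-point number; for Double it is 2^-52, about 2.22e-16. It's computed iteratively by halving a candidate until adding half of it to 1.0 no longer changes the result: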

var eps = 1.0
while ((1.0 + (eps / 2.0)) != 1.0) {
  eps /= 2.0
}
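
As a quick check of the loop above, the sketch below wraps it in a function and compares the result with `math.ulp(1.0)`, the standard-library spacing at 1.0. The `approxEqual` helper and its `factor` parameter are hypothetical, shown only to illustrate one way a tolerance built from epsilon might be scaled.

```scala
object EpsilonSketch {
  /** Halve eps until adding half of it to 1.0 no longer changes the sum. */
  def machineEpsilon(): Double = {
    var eps = 1.0
    while ((1.0 + (eps / 2.0)) != 1.0) {
      eps /= 2.0
    }
    eps
  }

  /** Hypothetical comparison helper: tolerance is factor * epsilon,
    * scaled by the larger magnitude (at least 1.0). */
  def approxEqual(a: Double, b: Double, factor: Double = 1000.0): Boolean = {
    val scale = math.max(1.0, math.max(math.abs(a), math.abs(b)))
    math.abs(a - b) <= factor * machineEpsilon() * scale
  }
}
```

With this helper, `EpsilonSketch.approxEqual(0.1 + 0.2, 0.3)` holds even though the two sides differ in their last bit, while values that differ by 1e-4 are reported as unequal.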

Packed Triangular Storage

Upper triangular matrices can be stored compactly by keeping only the upper triangle, column by column:

Matrix:     Packed Storage:
[a b c]  →  [a b d c e f]
[0 d e]
[0 0 f]

The index mapping is: index = j * (j + 1) / 2 + i for i ≤ j.
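
The mapping can be exercised with a small standalone sketch. The object and method names are illustrative, and two behaviours here are assumptions of the sketch rather than documented facts: a lookup below the diagonal (i > j) is mirrored to (j, i), and the packed array length is checked up front.

```scala
object PackedTriangularSketch {
  /** Position of element (i, j) in column-major packed upper-triangular storage. */
  def index(n: Int, i: Int, j: Int): Int = {
    require(i >= 0 && i < n, s"Expected 0 <= i < $n, got i = $i.")
    require(j >= 0 && j < n, s"Expected 0 <= j < $n, got j = $j.")
    // Assumption: entries below the diagonal map to their mirror (j, i).
    if (i <= j) j * (j + 1) / 2 + i else i * (i + 1) / 2 + j
  }

  /** Expand packed values into a full symmetric n x n matrix (column major). */
  def unpack(n: Int, packed: Array[Double]): Array[Double] = {
    require(packed.length == n * (n + 1) / 2,
      s"Expected ${n * (n + 1) / 2} packed values, got ${packed.length}.")
    val full = new Array[Double](n * n)
    for (j <- 0 until n; i <- 0 until n) {
      // Column-major position of (i, j) in the dense array is j * n + i.
      full(j * n + i) = packed(index(n, i, j))
    }
    full
  }
}
```

For n = 3 and packed values 1.0 through 6.0, element (1, 2) sits at index 4, and unpacking yields the columns (1, 2, 4), (2, 3, 5), (4, 5, 6). Packed storage holds n(n + 1)/2 values instead of n², which is 6 rather than 9 in this case.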

Log1pExp Stability

For large x, computing log(1 + exp(x)) directly overflows: exp(x) becomes infinite for Double once x exceeds about 709.78. The stable version uses:

log1pExp(x) = x + log1p(exp(-x)) for x > 0
            = log1p(exp(x))        for x ≤ 0
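
The two branches can be compared against the naive formula in a short sketch. The names `naive` and `stable` are illustrative; the branch logic follows the piecewise definition above.

```scala
object Log1pExpSketch {
  /** Direct formula: overflows for large x and loses precision for very negative x. */
  def naive(x: Double): Double = math.log(1.0 + math.exp(x))

  /** Piecewise form: exp is only ever called with a non-positive argument. */
  def stable(x: Double): Double = {
    if (x > 0) {
      x + math.log1p(math.exp(-x))
    } else {
      math.log1p(math.exp(x))
    }
  }
}
```

At x = 800 the naive version returns Infinity while the stable one returns 800.0, because exp(-800) underflows to zero. At x = -40 the naive version returns exactly 0.0, since 1 + exp(-40) rounds to 1, whereas the stable version keeps a small positive value thanks to log1p.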