Local linear algebra operations and utilities for Apache Spark's MLlib machine learning library
npx @tessl/cli install tessl/maven-org-apache-spark--spark-mllib-local-2-12@3.5.0

Spark MLlib Local provides local linear algebra operations and utilities for Apache Spark's machine learning library. This library implements core data structures, including Vector and Matrix types, along with optimized BLAS (Basic Linear Algebra Subprograms) operations for numerical computations in distributed machine learning applications.
Maven:
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-mllib-local_2.12</artifactId>
  <version>3.5.6</version>
</dependency>

SBT:
libraryDependencies += "org.apache.spark" %% "spark-mllib-local" % "3.5.6"

import org.apache.spark.ml.linalg._
import org.apache.spark.ml.stat.distribution.MultivariateGaussian
import org.apache.spark.ml.impl.Utils

import org.apache.spark.ml.linalg._
import org.apache.spark.ml.stat.distribution.MultivariateGaussian
// Create vectors
val denseVec = Vectors.dense(1.0, 2.0, 3.0)
val sparseVec = Vectors.sparse(5, Array(0, 2, 4), Array(1.0, 3.0, 5.0))
// Create matrices
val denseMatrix = DenseMatrix.zeros(3, 3)
val sparseMatrix = SparseMatrix.speye(3) // Identity matrix
// Linear algebra operations
// BLAS routines are used internally; call them through the Vector and Matrix APIs
val dotProduct = denseVec.dot(denseVec)
val l2Norm = Vectors.norm(denseVec, 2.0)
val squaredDist = Vectors.sqdist(denseVec, Vectors.dense(0.0, 0.0, 0.0))
// Statistical distributions
val mean = Vectors.dense(0.0, 0.0)
val cov = DenseMatrix.eye(2)
val mvn = new MultivariateGaussian(mean, cov)
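val density = mvn.pdf(Vectors.dense(1.0, 1.0))

Spark MLlib Local is built around several key components: vector and matrix data structures, BLAS-backed linear algebra operations, multivariate statistical distributions, and numerical utilities.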
Core vector data structures and operations for numerical computing. Supports both dense and sparse representations; compressed returns whichever representation uses less storage.
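The following is a minimal sketch of how the dense and sparse representations interoperate, written in script style (for example for spark-shell or a Scala REPL with the dependency on the classpath). The variable names and sample values are illustrative assumptions, not part of the library.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}

// The same logical vector in both representations.
val dense: Vector = Vectors.dense(1.0, 0.0, 3.0)
val sparse: Vector = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))

// Operations accept any mix of dense and sparse inputs.
val dot = dense.dot(sparse)                // 1*1 + 0*0 + 3*3
val l2 = Vectors.norm(dense, 2.0)          // Euclidean norm
val dist2 = Vectors.sqdist(dense, sparse)  // squared Euclidean distance
val nnz = sparse.numNonzeros

// A mostly-zero vector: compressed may switch it to a sparse representation.
val mostlyZero = Vectors.dense(Array.tabulate(100)(i => if (i == 7) 2.5 else 0.0))
val compact = mostlyZero.compressed
println(compact.getClass.getSimpleName)
```

Whichever representation compressed chooses, the logical contents are unchanged, so code that only uses the Vector interface does not need to care which one it receives.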
sealed trait Vector extends Serializable {
  def size: Int
  def toArray: Array[Double]
  def apply(i: Int): Double
  def copy: Vector
  def dot(v: Vector): Double
  def numNonzeros: Int
  def compressed: Vector
}
object Vectors {
  def dense(values: Array[Double]): Vector
  def sparse(size: Int, indices: Array[Int], values: Array[Double]): Vector
  def zeros(size: Int): Vector
  def norm(vector: Vector, p: Double): Double
  def sqdist(v1: Vector, v2: Vector): Double
}

Matrix data structures and operations for linear algebra computations. Provides both dense and sparse implementations with format conversion capabilities.
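As a rough illustration of the matrix types, the sketch below builds a small dense matrix (values are given in column-major order), a sparse matrix in compressed sparse column form, and applies transpose and multiply. The variable names and sample values are assumptions chosen for the example.

```scala
import org.apache.spark.ml.linalg.{DenseMatrix, Matrices, Matrix, Vectors}

// 2 x 3 dense matrix, values listed column by column:
// | 1.0  3.0  5.0 |
// | 2.0  4.0  6.0 |
val m: Matrix = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))

// Transpose gives a 3 x 2 view of the same data.
val mt = m.transpose

// Matrix-vector product: each entry is a row sum here because v is all ones.
val v = Vectors.dense(1.0, 1.0, 1.0)
val mv = m.multiply(v)

// Matrix-matrix product with the identity leaves m unchanged.
val same = m.multiply(DenseMatrix.eye(3))

// 2 x 3 sparse matrix with two stored entries: (0,0) = 1.0 and (1,2) = 6.0.
// colPtrs has numCols + 1 entries; column 1 is empty.
val sp = Matrices.sparse(2, 3, Array(0, 1, 1, 2), Array(0, 1), Array(1.0, 6.0))
```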
sealed trait Matrix extends Serializable {
  def numRows: Int
  def numCols: Int
  def apply(i: Int, j: Int): Double
  def transpose: Matrix
  def multiply(y: Vector): DenseVector
  def multiply(y: DenseMatrix): DenseMatrix
  def compressed: Matrix
}
object Matrices {
  def dense(numRows: Int, numCols: Int, values: Array[Double]): Matrix
  def sparse(numRows: Int, numCols: Int, colPtrs: Array[Int], rowIndices: Array[Int], values: Array[Double]): Matrix
  def zeros(numRows: Int, numCols: Int): Matrix
  def eye(n: Int): Matrix
}

Optimized linear algebra operations are accessible through the Vector and Matrix APIs. The underlying BLAS implementation uses a native library when one is available at runtime and otherwise falls back to a pure JVM implementation.
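import org.apache.spark.ml.linalg.{DenseMatrix, Vectors}
val vector1 = Vectors.dense(1.0, 2.0, 3.0)
val vector2 = Vectors.dense(4.0, 5.0, 6.0)
val matrix = DenseMatrix.eye(3)
val otherMatrix = DenseMatrix.ones(3, 3)
// Vector operations (accessing optimized BLAS internally)
val dotProduct = vector1.dot(vector2)
// Matrix operations (accessing optimized BLAS internally)
val result = matrix.multiply(vector1)
val product = matrix.multiply(otherMatrix)

Multivariate probability distributions with numerical stability and support for singular covariance matrices.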
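To make the pdf and logpdf methods concrete, here is a small sketch that evaluates a standard bivariate normal and compares it against the closed-form density exp(-||x||^2 / 2) / (2π). The evaluation point and variable names are assumptions for illustration.

```scala
import org.apache.spark.ml.linalg.{DenseMatrix, Vectors}
import org.apache.spark.ml.stat.distribution.MultivariateGaussian

// Standard bivariate normal: zero mean, identity covariance.
val mvn = new MultivariateGaussian(Vectors.dense(0.0, 0.0), DenseMatrix.eye(2))

val x = Vectors.dense(1.0, 1.0)
val density = mvn.pdf(x)
val logDensity = mvn.logpdf(x)

// Closed form for this special case: exp(-(1^2 + 1^2) / 2) / (2 * pi).
val expected = math.exp(-1.0) / (2.0 * math.Pi)
println(s"pdf = $density, closed form = $expected, logpdf = $logDensity")
```

Working with logpdf is usually preferable when multiplying many densities together, since summing logarithms avoids underflow.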
class MultivariateGaussian(mean: Vector, cov: Matrix) extends Serializable {
  def pdf(x: Vector): Double
  def logpdf(x: Vector): Double
}

Numerical utility functions and mathematical helpers for robust computations with numerical stability considerations.
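To show what "numerical stability" means for helpers of this kind, the sketch below re-implements two of them in plain Scala. It is an independent illustration, not the library's code: the names log1pExpSketch and softmaxSketch are hypothetical, and the sketch assumes the input array to softmaxSketch is non-empty.

```scala
// Computes log(1 + exp(x)) without overflowing for large positive x.
// For x > 0 it uses the identity log(1 + e^x) = x + log(1 + e^(-x)).
def log1pExpSketch(x: Double): Double =
  if (x > 0) x + math.log1p(math.exp(-x))
  else math.log1p(math.exp(x))

// In-place softmax. Subtracting the maximum first keeps exp() from overflowing.
// Assumes a non-empty array.
def softmaxSketch(values: Array[Double]): Unit = {
  val max = values.max
  var sum = 0.0
  var i = 0
  while (i < values.length) {
    values(i) = math.exp(values(i) - max)
    sum += values(i)
    i += 1
  }
  i = 0
  while (i < values.length) {
    values(i) /= sum
    i += 1
  }
}

// A naive math.log(1 + math.exp(1000.0)) overflows to Infinity; this stays finite.
val large = log1pExpSketch(1000.0)

val logits = Array(1000.0, 1000.0)
softmaxSketch(logits)
```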
object Utils {
  lazy val EPSILON: Double
  def unpackUpperTriangular(n: Int, triangularValues: Array[Double]): Array[Double]
  def indexUpperTriangular(n: Int, i: Int, j: Int): Int
  def log1pExp(x: Double): Double
  def softmax(array: Array[Double]): Unit
}

The library uses standard Scala exception handling:
IllegalArgumentException: Invalid parameters or dimension mismatches
UnsupportedOperationException: Operations not supported for specific vector/matrix types
IndexOutOfBoundsException: Invalid indices
NoSuchElementException: Attempting to update zero elements in sparse matrices

Operations validate input dimensions and throw descriptive exceptions when given invalid input.
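One way to handle a dimension mismatch is sketched below with scala.util.Try. It assumes that a dot product between vectors of different sizes is rejected with an IllegalArgumentException, in line with the list above; the variable names are illustrative.

```scala
import scala.util.{Failure, Success, Try}
import org.apache.spark.ml.linalg.Vectors

val a = Vectors.dense(1.0, 2.0, 3.0)
val b = Vectors.dense(1.0, 2.0) // deliberately the wrong size

// Wrap the call so a mismatch becomes a value instead of a crash.
val outcome = Try(a.dot(b)) match {
  case Success(value)                       => s"dot = $value"
  case Failure(e: IllegalArgumentException) => s"rejected: ${e.getMessage}"
  case Failure(other)                       => throw other // unexpected, rethrow
}
println(outcome)
```

Checking sizes up front (for example comparing a.size and b.size) avoids the exception entirely and is often clearer in hot code paths.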