tessl/maven-org-apache-spark--spark-mllib-local

Apache Spark ML Local Library provides local implementations of linear algebra data structures and utilities for machine learning without requiring a Spark cluster.

Workspace: tessl
Visibility: Public
Describes: mavenpkg:maven/org.apache.spark/spark-mllib-local_2.12@2.4.x

To install, run

npx @tessl/cli install tessl/maven-org-apache-spark--spark-mllib-local@2.4.0


Apache Spark MLlib Local Library

Apache Spark MLlib Local Library provides local implementations of linear algebra data structures and utilities for machine learning without requiring a Spark cluster. It includes core linear algebra components such as Vector and Matrix implementations, BLAS operations, and statistical distributions like MultivariateGaussian.

Package Information

  • Package Name: spark-mllib-local_2.12
  • Package Type: maven
  • Language: Scala
  • Installation: org.apache.spark:spark-mllib-local_2.12:2.4.8
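For an sbt build, the coordinates listed above could be declared roughly as follows. This is a sketch, not an official build file: the exact Scala patch version is an assumption, and `%%` appends the `_2.12` suffix only when the project itself is on Scala 2.12.

```scala
// build.sbt (sketch): assumes a Scala 2.12 project; the patch version is illustrative
scalaVersion := "2.12.18"

// Resolves to org.apache.spark:spark-mllib-local_2.12:2.4.8
libraryDependencies += "org.apache.spark" %% "spark-mllib-local" % "2.4.8"
```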

Core Imports

import org.apache.spark.ml.linalg._

For statistical distributions:

import org.apache.spark.ml.stat.distribution.MultivariateGaussian

Basic Usage

import org.apache.spark.ml.linalg._

// Create vectors
val denseVec = Vectors.dense(1.0, 2.0, 3.0)
val sparseVec = Vectors.sparse(4, Array(0, 2), Array(1.0, 3.0))

// Create matrices  
val denseMatrix = Matrices.dense(2, 2, Array(1.0, 3.0, 2.0, 4.0))
val sparseMatrix = Matrices.sparse(2, 2, Array(0, 1, 2), Array(0, 1), Array(1.0, 4.0))

// Vector operations
println(s"Dense vector size: ${denseVec.size}")
println(s"Sparse vector nonzeros: ${sparseVec.numNonzeros}")

// Matrix operations (the vector length must equal the matrix's column count)
val result = denseMatrix.multiply(Vectors.dense(1.0, 2.0))
println(s"Matrix-vector product: ${result.toArray.mkString(",")}")

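// BLAS operations
// Note: in Spark 2.4.x the BLAS object is declared private[spark], so this call
// compiles only from code that lives under the org.apache.spark package
val dotProduct = BLAS.dot(denseVec, denseVec)
// Euclidean (L2) norm via the public Vectors API
val norm = Vectors.norm(denseVec, 2.0)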

Architecture

The library is built around a few core abstractions:

  • Vector: Sealed trait with DenseVector/SparseVector implementations for efficient storage of 1D data
  • Matrix: Sealed trait with DenseMatrix/SparseMatrix implementations supporting both column-major and row-major layouts
  • BLAS: Object providing optimized linear algebra routines compatible with netlib-java
  • Factory Objects: Vectors, Matrices objects providing convenient creation methods
  • Statistical Distributions: MultivariateGaussian for probabilistic modeling

The design emphasizes performance through integration with the Breeze linear algebra library, and it supports conversion between dense and sparse formats (toDense, toSparse) based on sparsity patterns.
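As an illustration of format conversion driven by sparsity, the sketch below uses `compressed`, which returns whichever representation needs less storage, alongside the explicit `toSparse` and `toDense` conversions. The variable names are illustrative, and the snippet is meant for a Scala REPL with spark-mllib-local on the classpath.

```scala
import org.apache.spark.ml.linalg.{DenseVector, SparseVector, Vector, Vectors}

// Eight entries, only one of them nonzero, initially stored densely
val mostlyZeros: Vector = Vectors.dense(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0)

// compressed picks the cheaper representation; here that is the sparse one
val compact: Vector = mostlyZeros.compressed
val isSparse: Boolean = compact.isInstanceOf[SparseVector]

// A vector with no zeros stays dense after compression
val full: Vector = Vectors.dense(1.0, 2.0, 3.0)
val stillDense: Boolean = full.compressed.isInstanceOf[DenseVector]

// Explicit conversions preserve the values
val roundTrip: Vector = mostlyZeros.toSparse.toDense
```

Vector equality compares values rather than storage format, so `roundTrip == mostlyZeros` holds here.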

Capabilities

Vector Operations

Dense and sparse vector data structures, with norms, squared distances, and conversions between the two storage formats.

sealed trait Vector {
  def size: Int
  def toArray: Array[Double]
  def apply(i: Int): Double
  def copy: Vector
  def numNonzeros: Int
  def toSparse: SparseVector
  def toDense: DenseVector
}

object Vectors {
  def dense(values: Array[Double]): Vector
  def dense(firstValue: Double, otherValues: Double*): Vector
  def sparse(size: Int, indices: Array[Int], values: Array[Double]): Vector
  def zeros(size: Int): Vector
  def norm(vector: Vector, p: Double): Double
  def sqdist(v1: Vector, v2: Vector): Double
}
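A short sketch of how the factory and utility methods listed above fit together. The values are chosen so the results are easy to check by hand; the variable names are illustrative.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}

val a: Vector = Vectors.dense(3.0, 4.0)
// Sparse vector of size 2 with a single stored entry: (0.0, 4.0)
val b: Vector = Vectors.sparse(2, Array(1), Array(4.0))

val l2: Double = Vectors.norm(a, 2.0)   // sqrt(3^2 + 4^2)
val l1: Double = Vectors.norm(a, 1.0)   // |3| + |4|
val d2: Double = Vectors.sqdist(a, b)   // (3 - 0)^2 + (4 - 4)^2

// Element access works the same way for dense and sparse vectors
val firstOfB: Double = b(0)
val secondOfB: Double = b(1)

// zeros creates a dense vector with no nonzero entries
val z: Vector = Vectors.zeros(3)
```

Note that `sqdist` returns the squared Euclidean distance, so a square root is still needed if the plain distance is wanted.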

Matrix Operations

Dense and sparse matrix data structures, with column-major and row-major layouts, matrix-matrix and matrix-vector multiplication, and conversions between storage formats.

sealed trait Matrix {
  def numRows: Int
  def numCols: Int
  def apply(i: Int, j: Int): Double
  def transpose: Matrix
  def multiply(y: DenseMatrix): DenseMatrix
  def multiply(y: Vector): DenseVector
  def toSparse: SparseMatrix
  def toDense: DenseMatrix
}

object Matrices {
  def dense(numRows: Int, numCols: Int, values: Array[Double]): Matrix
  def sparse(numRows: Int, numCols: Int, colPtrs: Array[Int], rowIndices: Array[Int], values: Array[Double]): Matrix
  def zeros(numRows: Int, numCols: Int): Matrix
  def eye(n: Int): Matrix
}
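The following sketch exercises the signatures above on a small 2x3 matrix. The point most worth noticing is that `Matrices.dense` reads its values in column-major order. Variable names are illustrative.

```scala
import org.apache.spark.ml.linalg.{Matrices, Matrix, Vectors}

// Column-major values describe the 2x3 matrix
//   [1 2 3]
//   [4 5 6]
val m: Matrix = Matrices.dense(2, 3, Array(1.0, 4.0, 2.0, 5.0, 3.0, 6.0))

// Matrix-vector product: the vector length must equal numCols (3 here)
val v = Vectors.dense(1.0, 0.0, 1.0)
val mv = m.multiply(v)            // row 0: 1 + 3, row 1: 4 + 6

// Transposing swaps the dimensions: the result is 3x2
val t: Matrix = m.transpose

// Multiplying by the 3x3 identity leaves the values unchanged
val identity = Matrices.eye(3).toDense
val same = m.multiply(identity)
```

`multiply` takes a `DenseMatrix` on the right-hand side, which is why the identity is converted with `toDense` before use.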

BLAS Operations

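Basic Linear Algebra Subprograms (BLAS) routines for vectors and matrices, backed by netlib-java. Note that in Spark 2.4.x the BLAS object is declared private[spark], so these routines can be called only from code under the org.apache.spark package; application code typically reaches them indirectly, for example through Matrix.multiply.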

object BLAS {
  def axpy(a: Double, x: Vector, y: Vector): Unit
  def dot(x: Vector, y: Vector): Double
  def copy(x: Vector, y: Vector): Unit
  def scal(a: Double, x: Vector): Unit
  def gemv(alpha: Double, A: Matrix, x: Vector, beta: Double, y: DenseVector): Unit
  def gemm(alpha: Double, A: Matrix, B: DenseMatrix, beta: Double, C: DenseMatrix): Unit
}
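Because the `BLAS` object appears to be `private[spark]` in the Spark 2.4.x sources, the sketch below does not call it. Instead it spells out, on plain arrays, what `axpy`, `dot`, and `scal` conventionally compute. The helper names (`axpySketch`, `dotSketch`, `scalSketch`) are hypothetical and are not part of Spark; this is an illustration of the semantics, not Spark's implementation.

```scala
// Hypothetical helpers illustrating standard BLAS level-1 semantics on plain arrays

/** y := a * x + y (updates y in place). */
def axpySketch(a: Double, x: Array[Double], y: Array[Double]): Unit = {
  require(x.length == y.length, "x and y must have the same length")
  var i = 0
  while (i < x.length) {
    y(i) += a * x(i)
    i += 1
  }
}

/** Returns the sum of x(i) * y(i). */
def dotSketch(x: Array[Double], y: Array[Double]): Double = {
  require(x.length == y.length, "x and y must have the same length")
  var sum = 0.0
  var i = 0
  while (i < x.length) {
    sum += x(i) * y(i)
    i += 1
  }
  sum
}

/** x := a * x (updates x in place). */
def scalSketch(a: Double, x: Array[Double]): Unit = {
  var i = 0
  while (i < x.length) {
    x(i) *= a
    i += 1
  }
}

val x = Array(1.0, 2.0, 3.0)
val y = Array(10.0, 20.0, 30.0)

axpySketch(2.0, x, y)        // y becomes (12, 24, 36)
val d = dotSketch(x, x)      // 1 + 4 + 9
scalSketch(0.5, x)           // x becomes (0.5, 1.0, 1.5)
```

As in the signatures above, `axpy` and `scal` return `Unit` and mutate their last argument, so callers that need the original values should copy them first.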

Statistical Distributions

Multivariate statistical distributions for probabilistic modeling and machine learning applications, with support for probability density calculations.

class MultivariateGaussian(mean: Vector, cov: Matrix) {
  def pdf(x: Vector): Double
  def logpdf(x: Vector): Double  
}
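A minimal sketch of evaluating the density, assuming a two-dimensional standard normal (zero mean, identity covariance) so the result can be checked against the closed form 1 / (2π) at the mean. Variable names are illustrative.

```scala
import org.apache.spark.ml.linalg.{Matrices, Vectors}
import org.apache.spark.ml.stat.distribution.MultivariateGaussian

// Zero mean and 2x2 identity covariance
val mean = Vectors.dense(0.0, 0.0)
val cov = Matrices.dense(2, 2, Array(1.0, 0.0, 0.0, 1.0))
val gaussian = new MultivariateGaussian(mean, cov)

// Density and log-density at the mean
val atMean: Double = gaussian.pdf(mean)
val logAtMean: Double = gaussian.logpdf(mean)

// The density falls off away from the mean
val awayFromMean: Double = gaussian.pdf(Vectors.dense(1.0, 1.0))
```

For likelihoods that may underflow, such as products over many points, `logpdf` is usually the safer choice because log-densities can be summed.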
