tessl/maven-org-apache-spark--spark-mllib-local-2-13

Spark Project ML Local Library provides local linear algebra operations for machine learning without requiring a distributed Spark context

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: mavenpkg:maven/org.apache.spark/spark-mllib-local_2.13@3.5.x

To install, run

npx @tessl/cli install tessl/maven-org-apache-spark--spark-mllib-local-2-13@3.5.0

# Spark MLlib Local

Spark MLlib Local provides local linear algebra operations for machine learning without requiring a distributed Spark context. It includes vector and matrix data structures (dense and sparse), statistical distributions, and utility functions that can operate independently of a distributed Spark cluster.

## Package Information

- **Package Name**: spark-mllib-local_2.13
- **Package Type**: maven
- **Language**: Scala
- **Maven**: Add inside a `<dependency>` element in `pom.xml`: `<groupId>org.apache.spark</groupId><artifactId>spark-mllib-local_2.13</artifactId><version>3.5.6</version>`
- **Gradle**: `implementation 'org.apache.spark:spark-mllib-local_2.13:3.5.6'`
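
For sbt builds, the same coordinates can be declared roughly as follows. This is a sketch rather than part of the original package listing: it assumes the build's `scalaVersion` is a 2.13.x release, so that `%%` resolves to the `_2.13` artifact, and it reuses the 3.5.6 version shown above.

```scala
// build.sbt (assumed file name; adjust the version to match your project)
libraryDependencies += "org.apache.spark" %% "spark-mllib-local" % "3.5.6"
```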

## Core Imports

```scala
import org.apache.spark.ml.linalg.{Vector, DenseVector, SparseVector, Vectors}
import org.apache.spark.ml.linalg.{Matrix, DenseMatrix, SparseMatrix, Matrices}
import org.apache.spark.ml.stat.distribution.MultivariateGaussian
```

## Basic Usage

```scala
import org.apache.spark.ml.linalg.{Vectors, Matrices}

// Create vectors
val denseVec = Vectors.dense(1.0, 2.0, 3.0)
// Sparse vector of size 3 with non-zeros at indices 0 and 2: (1.0, 0.0, 3.0).
// It must have the same size as denseVec for the dot product below.
val sparseVec = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))

// Create matrices (dense values are given in column-major order)
val denseMatrix = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
// 3x3 diagonal matrix in compressed sparse column (CSC) form
val sparseMatrix = Matrices.sparse(3, 3, Array(0, 1, 2, 3), Array(0, 1, 2), Array(1.0, 2.0, 3.0))

// Basic operations
val dotProduct = denseVec.dot(sparseVec)            // 1*1 + 3*3 = 10.0
val matVecProduct = denseMatrix.multiply(denseVec)  // (2x3) * (3) = [22.0, 28.0]
val norm = Vectors.norm(denseVec, 2.0)              // sqrt(14), about 3.742
```

## Architecture

Spark MLlib Local is built around several key components:

- **Vector API**: Unified interface for dense and sparse vectors; `compressed` returns whichever representation needs less storage
- **Matrix API**: Matrix operations (element access, multiplication, transposition, format conversion) for both dense and sparse representations
- **Statistical Distributions**: Multivariate probability distributions, such as `MultivariateGaussian`, for machine learning algorithms
- **Type Safety**: `Vector` and `Matrix` are sealed traits, so every instance is one of the dense or sparse implementations
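
The format selection mentioned for the Vector API can be illustrated with a small sketch. The object name `CompressedExample` and the sample values are made up for illustration; the calls used (`Vectors.dense`, `Vectors.sparse`, `compressed`, `toArray`) are the ones listed in this document.

```scala
import org.apache.spark.ml.linalg.{DenseVector, SparseVector, Vectors}

object CompressedExample {
  // Ten entries, one non-zero: a sparse layout needs less storage.
  val mostlyZero = Vectors.dense(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.0)
  // Three entries, all non-zero: a dense layout needs less storage.
  val allNonZero = Vectors.sparse(3, Array(0, 1, 2), Array(1.0, 2.0, 3.0))

  val a = mostlyZero.compressed
  val b = allNonZero.compressed

  def main(args: Array[String]): Unit = {
    println(s"mostlyZero compressed to ${a.getClass.getSimpleName}")
    println(s"allNonZero compressed to ${b.getClass.getSimpleName}")
    // Changing the storage format does not change the logical contents.
    println(a.toArray.sameElements(mostlyZero.toArray))
  }
}
```

Calling `compressed` is optional: `toDense` and `toSparse` force a specific representation when the caller already knows which one is wanted.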

## Capabilities

### Vector Operations

Core vector functionality for dense and sparse representations, including conversion between the two (`toDense`, `toSparse`) and storage-based format selection (`compressed`).

```scala { .api }
// Vector trait and factory methods
trait Vector extends Serializable {
  def size: Int
  def apply(i: Int): Double
  def toArray: Array[Double]
  def dot(v: Vector): Double
}

object Vectors {
  def dense(values: Array[Double]): Vector
  // Varargs overload, as used in Basic Usage: Vectors.dense(1.0, 2.0, 3.0)
  def dense(firstValue: Double, otherValues: Double*): Vector
  def sparse(size: Int, indices: Array[Int], values: Array[Double]): Vector
  def zeros(size: Int): Vector
  def norm(vector: Vector, p: Double): Double
}
```

[Vector Operations](./vectors.md)
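
As a rough illustration of the vector API above, the following sketch combines the factory methods with `norm`, `dot`, and element access. The object name `VectorOpsExample` and the sample values are illustrative assumptions.

```scala
import org.apache.spark.ml.linalg.Vectors

object VectorOpsExample {
  val a = Vectors.dense(3.0, 0.0, 4.0)
  val b = Vectors.sparse(3, Array(1), Array(2.0)) // logically (0.0, 2.0, 0.0)
  val z = Vectors.zeros(3)                        // (0.0, 0.0, 0.0)

  val l1 = Vectors.norm(a, 1.0) // |3| + |0| + |4| = 7.0
  val l2 = Vectors.norm(a, 2.0) // sqrt(9 + 16) = 5.0
  val ab = a.dot(b)             // 0.0: the non-zero positions do not overlap
  val second = b(1)             // element access works the same for sparse vectors

  def main(args: Array[String]): Unit =
    println(s"l1=$l1 l2=$l2 a.b=$ab b(1)=$second zeros=${z.toArray.mkString(",")}")
}
```

Both operands of `dot` must have the same `size`; dense and sparse vectors can be mixed freely.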

### Matrix Operations

Matrix operations supporting dense and sparse formats, with efficient multiplication, transposition, and format conversion.

```scala { .api }
// Matrix trait and factory methods
trait Matrix extends Serializable {
  def numRows: Int
  def numCols: Int
  def apply(i: Int, j: Int): Double
  def multiply(y: Vector): DenseVector
  def transpose: Matrix
}

object Matrices {
  def dense(numRows: Int, numCols: Int, values: Array[Double]): Matrix
  def sparse(numRows: Int, numCols: Int, colPtrs: Array[Int], rowIndices: Array[Int], values: Array[Double]): Matrix
  def zeros(numRows: Int, numCols: Int): Matrix
}
```

[Matrix Operations](./matrices.md)
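
The matrix API above can be exercised with a short sketch like the one below. It assumes the column-major layout for `Matrices.dense` and the compressed sparse column (CSC) layout for `Matrices.sparse`; the object name `MatrixOpsExample` and the values are illustrative.

```scala
import org.apache.spark.ml.linalg.{Matrices, Vectors}

object MatrixOpsExample {
  // Column-major values, so the columns are (1,2), (3,4), (5,6):
  //   m = | 1 3 5 |
  //       | 2 4 6 |
  val m = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
  val t = m.transpose // 3 x 2

  val v = Vectors.dense(1.0, 0.0, 1.0)
  val mv = m.multiply(v) // [1 + 5, 2 + 6] = [6.0, 8.0]

  // CSC form of diag(2, 3): column j holds entries colPtrs(j) until colPtrs(j + 1)
  val s = Matrices.sparse(2, 2, Array(0, 1, 2), Array(0, 1), Array(2.0, 3.0))
  val sv = s.multiply(Vectors.dense(1.0, 1.0)) // [2.0, 3.0]

  def main(args: Array[String]): Unit = {
    println(s"m(0, 2) = ${m(0, 2)}, t(2, 0) = ${t(2, 0)}")
    println(s"m * v = $mv, s * ones = $sv")
  }
}
```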

### Statistical Distributions

Multivariate statistical distributions for machine learning applications with support for probability density functions.

```scala { .api }
// Multivariate Gaussian distribution
class MultivariateGaussian(val mean: Vector, val cov: Matrix) extends Serializable {
  def pdf(x: Vector): Double
  def logpdf(x: Vector): Double
}
```

[Statistical Distributions](./distributions.md)
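
A minimal sketch of evaluating a density, assuming a two-dimensional standard normal (zero mean, identity covariance). The object name `GaussianExample` is illustrative; for this distribution the density at the mean is 1 / (2π).

```scala
import org.apache.spark.ml.linalg.{Matrices, Vectors}
import org.apache.spark.ml.stat.distribution.MultivariateGaussian

object GaussianExample {
  val mean = Vectors.dense(0.0, 0.0)
  // 2x2 identity covariance (column-major values)
  val cov = Matrices.dense(2, 2, Array(1.0, 0.0, 0.0, 1.0))
  val gaussian = new MultivariateGaussian(mean, cov)

  val atMean = gaussian.pdf(mean)                  // 1 / (2π), about 0.1592
  val logAtMean = gaussian.logpdf(mean)            // -ln(2π), about -1.8379
  val away = gaussian.pdf(Vectors.dense(1.0, 1.0)) // exp(-1) / (2π), about 0.0585

  def main(args: Array[String]): Unit =
    println(s"pdf(mean)=$atMean logpdf(mean)=$logAtMean pdf((1,1))=$away")
}
```

Working with `logpdf` rather than `pdf` is usually preferable when many densities are multiplied together, since sums of logarithms are less prone to underflow.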

## Types

```scala { .api }
// Core vector types
trait Vector extends Serializable {
  def size: Int
  def toArray: Array[Double]
  def apply(i: Int): Double
  def copy: Vector
  def foreachActive(f: (Int, Double) => Unit): Unit
  def numActives: Int
  def numNonzeros: Int
  def toSparse: SparseVector
  def toDense: DenseVector
  def compressed: Vector
  def argmax: Int
  def dot(v: Vector): Double
}

class DenseVector(val values: Array[Double]) extends Vector
class SparseVector(override val size: Int, val indices: Array[Int], val values: Array[Double]) extends Vector

// Core matrix types
trait Matrix extends Serializable {
  def numRows: Int
  def numCols: Int
  def apply(i: Int, j: Int): Double
  def copy: Matrix
  def transpose: Matrix
  def multiply(y: DenseMatrix): DenseMatrix
  def multiply(y: Vector): DenseVector
  def foreachActive(f: (Int, Int, Double) => Unit): Unit
  def numNonzeros: Int
  def numActives: Int
  def toSparse: SparseMatrix
  def toDense: DenseMatrix
}

class DenseMatrix(val numRows: Int, val numCols: Int, val values: Array[Double], override val isTransposed: Boolean) extends Matrix
class SparseMatrix(val numRows: Int, val numCols: Int, val colPtrs: Array[Int], val rowIndices: Array[Int], val values: Array[Double], override val isTransposed: Boolean) extends Matrix
```
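
The difference between `numActives` and `numNonzeros` in the types above is easy to miss, so here is a small sketch. It assumes a sparse vector that explicitly stores a zero; the object name `ActiveEntriesExample` and the values are illustrative.

```scala
import org.apache.spark.ml.linalg.Vectors
import scala.collection.mutable.ArrayBuffer

object ActiveEntriesExample {
  // Size 4 with two stored entries; the entry at index 2 is an explicit zero.
  val v = Vectors.sparse(4, Array(0, 2), Array(5.0, 0.0))

  // foreachActive visits only the stored entries, as (index, value) pairs.
  val seen = ArrayBuffer.empty[(Int, Double)]
  v.foreachActive { (i, x) => seen.append((i, x)) }

  val actives = v.numActives   // 2: stored entries, including the explicit zero
  val nonzeros = v.numNonzeros // 1: only values different from zero
  val best = v.argmax          // 0: index of the largest value (5.0)

  def main(args: Array[String]): Unit =
    println(s"seen=${seen.mkString(", ")} actives=$actives nonzeros=$nonzeros argmax=$best")
}
```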