# Spark MLlib Local

Spark MLlib Local provides the local linear algebra operations and utilities behind Apache Spark's machine learning library. It implements the core `Vector` and `Matrix` data structures, along with optimized BLAS (Basic Linear Algebra Subprograms) routines, for the numerical computations that run inside distributed machine learning applications.

## Package Information

- **Package Name**: spark-mllib-local_2.12
- **Package Type**: maven
- **Language**: Scala
- **Installation**: Add to your Maven/SBT dependencies:

Maven:

```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-mllib-local_2.12</artifactId>
  <version>3.5.6</version>
</dependency>
```

SBT:

```scala
libraryDependencies += "org.apache.spark" %% "spark-mllib-local" % "3.5.6"
```

## Core Imports

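```scala
import org.apache.spark.ml.linalg._
import org.apache.spark.ml.stat.distribution.MultivariateGaussian
// Utils is declared private[spark] in the Spark sources, so this import
// only resolves from code that lives inside the org.apache.spark package
import org.apache.spark.ml.impl.Utils
```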

## Basic Usage

```scala
import org.apache.spark.ml.linalg._
import org.apache.spark.ml.stat.distribution.MultivariateGaussian

// Create vectors
val denseVec = Vectors.dense(1.0, 2.0, 3.0)
val sparseVec = Vectors.sparse(5, Array(0, 2, 4), Array(1.0, 3.0, 5.0))

// Create matrices
val denseMatrix = DenseMatrix.zeros(3, 3)
val sparseMatrix = SparseMatrix.speye(3) // Identity matrix

// Linear algebra operations
// The BLAS object is declared private[spark], so application code goes
// through the public Vector and Matrix methods instead
val dotProduct = denseVec.dot(denseVec)
// Copy before converting: toDense on a dense vector can reuse its backing array
val denseCopy = denseVec.copy.toDense
// Out-of-place equivalent of axpy: denseCopy + 2.0 * denseVec
val axpyResult = Vectors.dense(
  denseCopy.values.zip(denseVec.toArray).map { case (y, x) => y + 2.0 * x })

// Statistical distributions
val mean = Vectors.dense(0.0, 0.0)
val cov = DenseMatrix.eye(2)
val mvn = new MultivariateGaussian(mean, cov)
val density = mvn.pdf(Vectors.dense(1.0, 1.0))
```

## Architecture

Spark MLlib Local is built around several key components:

- **Vector Types**: Dense and sparse vector implementations behind a unified API
- **Matrix Types**: Dense and sparse matrix implementations with lazy transposition (transposing flips a flag instead of copying the data)
- **BLAS Operations**: Optimized linear algebra routines that use native acceleration when it is available
- **Statistical Distributions**: Multivariate probability distributions implemented with numerical stability in mind
- **Type Safety**: Sealed trait hierarchies, so pattern matches over vector and matrix types can be checked at compile time
- **Native Integration**: Automatic fallback from native BLAS to pure JVM implementations
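
To make two of the points above concrete, here is a minimal sketch of matching on the sealed `Vector` hierarchy and of lazy transposition. The helper `describe` and the sample values are illustrative assumptions, not part of the library.

```scala
import org.apache.spark.ml.linalg.{DenseMatrix, DenseVector, SparseVector, Vector, Vectors}

// Hypothetical helper: Vector is sealed, so its two concrete
// subtypes cover every case in this match.
def describe(v: Vector): String = v match {
  case dv: DenseVector  => s"dense, ${dv.size} values"
  case sv: SparseVector => s"sparse, ${sv.indices.length} of ${sv.size} stored"
}

val d = describe(Vectors.dense(1.0, 2.0, 3.0))
val s = describe(Vectors.sparse(5, Array(0, 2), Array(1.0, 3.0)))

// Transposing a DenseMatrix flips the isTransposed flag and
// reuses the same backing array rather than copying it.
val m = new DenseMatrix(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)) // column-major
val mt = m.transpose
val sharesStorage = mt.values eq m.values
```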

## Capabilities

### Vector Operations

Core vector data structures and operations for numerical computing. Both dense and sparse representations are supported, and `compressed` returns whichever of the two uses less storage.

```scala { .api }
sealed trait Vector extends Serializable {
  def size: Int
  def toArray: Array[Double]
  def apply(i: Int): Double
  def copy: Vector
  def dot(v: Vector): Double
  def numNonzeros: Int
  def compressed: Vector
}

object Vectors {
  def dense(values: Array[Double]): Vector
  def dense(firstValue: Double, otherValues: Double*): Vector
  def sparse(size: Int, indices: Array[Int], values: Array[Double]): Vector
  def zeros(size: Int): Vector
  def norm(vector: Vector, p: Double): Double
  def sqdist(v1: Vector, v2: Vector): Double
}
```

[Vector Operations](./vectors.md)
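
As a rough usage sketch of the signatures above (the sample vectors are arbitrary values chosen so the arithmetic is easy to follow):

```scala
import org.apache.spark.ml.linalg.{SparseVector, Vectors}

val a = Vectors.dense(3.0, 0.0, 4.0)
val b = Vectors.sparse(3, Array(0, 2), Array(1.0, 2.0)) // (1.0, 0.0, 2.0)

val dot = a.dot(b)               // 3*1 + 0*0 + 4*2
val l2 = Vectors.norm(a, 2.0)    // sqrt(9 + 16)
val dist2 = Vectors.sqdist(a, b) // (3-1)^2 + 0^2 + (4-2)^2
val nnz = a.numNonzeros          // entries that are not exactly 0.0

// An all-zero dense vector takes less storage in sparse form,
// so compressed is expected to return a SparseVector here.
val mostlyZero = Vectors.dense(Array.fill(100)(0.0)).compressed
```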

### Matrix Operations

Matrix data structures and operations for linear algebra computations. Both dense and sparse implementations are provided, with conversion between the two formats.

```scala { .api }
sealed trait Matrix extends Serializable {
  def numRows: Int
  def numCols: Int
  def apply(i: Int, j: Int): Double
  def transpose: Matrix
  def multiply(y: Vector): DenseVector
  def multiply(y: DenseMatrix): DenseMatrix
  def compressed: Matrix
}

object Matrices {
  // values are given in column-major order
  def dense(numRows: Int, numCols: Int, values: Array[Double]): Matrix
  // compressed sparse column (CSC) layout
  def sparse(numRows: Int, numCols: Int, colPtrs: Array[Int], rowIndices: Array[Int], values: Array[Double]): Matrix
  def zeros(numRows: Int, numCols: Int): Matrix
  def eye(n: Int): Matrix
}
```

[Matrix Operations](./matrices.md)
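
The following sketch exercises the matrix API above with a small 2x3 example. The values are arbitrary; the point is to show the column-major layout expected by `Matrices.dense` and the CSC arrays expected by `Matrices.sparse`.

```scala
import org.apache.spark.ml.linalg.{DenseMatrix, Matrices, Vectors}

// 2x3 matrix, values listed column by column:
// | 1.0 3.0 5.0 |
// | 2.0 4.0 6.0 |
val m = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))

val rowSums = m.multiply(Vectors.dense(1.0, 1.0, 1.0)) // matrix-vector product
val same = m.multiply(DenseMatrix.eye(3))              // matrix-matrix product with identity
val t = m.transpose                                     // 3x2 transposed matrix

// 3x3 identity in CSC form: column j holds one entry, at row j
val sparseEye = Matrices.sparse(3, 3, Array(0, 1, 2, 3), Array(0, 1, 2), Array(1.0, 1.0, 1.0))
```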

### Linear Algebra Operations

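Optimized linear algebra operations are accessible through the Vector and Matrix APIs. The underlying BLAS implementation uses native acceleration when it is available and otherwise falls back to a pure JVM implementation.

```scala { .api }
import org.apache.spark.ml.linalg.{DenseMatrix, Vectors}

// Sample inputs (illustrative values)
val vector1 = Vectors.dense(1.0, 2.0)
val vector2 = Vectors.dense(3.0, 4.0)
val matrix = DenseMatrix.eye(2)
val otherMatrix = DenseMatrix.ones(2, 2)

// Vector operations (accessing optimized BLAS internally)
val dotProduct = vector1.dot(vector2)

// Matrix operations (accessing optimized BLAS internally)
val result = matrix.multiply(vector1)
val product = matrix.multiply(otherMatrix)
```

[Linear Algebra Operations](./blas.md)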

### Statistical Distributions

Multivariate probability distributions with numerical stability and support for singular covariance matrices.

```scala { .api }
class MultivariateGaussian(mean: Vector, cov: Matrix) extends Serializable {
  def pdf(x: Vector): Double
  def logpdf(x: Vector): Double
}
```

[Statistical Distributions](./distributions.md)
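
A short sketch of both cases mentioned above: a standard 2D Gaussian, and a rank-1 (singular) covariance. The specific means and covariances are assumptions picked for illustration.

```scala
import org.apache.spark.ml.linalg.{Matrices, Vectors}
import org.apache.spark.ml.stat.distribution.MultivariateGaussian

// Standard 2D Gaussian: zero mean, identity covariance
val standard = new MultivariateGaussian(
  Vectors.dense(0.0, 0.0),
  Matrices.dense(2, 2, Array(1.0, 0.0, 0.0, 1.0)))

val x = Vectors.dense(1.0, 1.0)
val p = standard.pdf(x)       // exp(-1) / (2 * Pi) for this distribution
val logP = standard.logpdf(x) // log of the same density

// Singular covariance (both components perfectly correlated, rank 1);
// construction is still expected to succeed
val degenerate = new MultivariateGaussian(
  Vectors.dense(0.0, 0.0),
  Matrices.dense(2, 2, Array(1.0, 1.0, 1.0, 1.0)))
val pAtMean = degenerate.pdf(Vectors.dense(0.0, 0.0))
```

When densities get very small, `logpdf` is usually the safer choice because it avoids underflow in the exponential.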

### Utility Functions

Numerical utility functions and mathematical helpers for robust computations with numerical stability considerations.

```scala { .api }
object Utils {
  lazy val EPSILON: Double
  def unpackUpperTriangular(n: Int, triangularValues: Array[Double]): Array[Double]
  def indexUpperTriangular(n: Int, i: Int, j: Int): Int
  def log1pExp(x: Double): Double
  def softmax(array: Array[Double]): Unit
}
```

[Utility Functions](./utils.md)
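
`Utils` is declared `private[spark]` in the Spark sources, so application code generally cannot call it directly. The plain-Scala sketch below re-implements three of the helpers to show the behavior their signatures suggest. The function bodies are illustrative assumptions and are not copied from the library.

```scala
// Illustrative stand-ins for the helpers listed above, not the library source.

// log(1 + exp(x)) without overflowing for large positive x
def log1pExp(x: Double): Double =
  if (x > 0) x + math.log1p(math.exp(-x)) else math.log1p(math.exp(x))

// In-place softmax over a non-empty array; subtracting the maximum
// first keeps exp() from overflowing
def softmax(array: Array[Double]): Unit = {
  val max = array.max
  var sum = 0.0
  var i = 0
  while (i < array.length) {
    array(i) = math.exp(array(i) - max)
    sum += array(i)
    i += 1
  }
  i = 0
  while (i < array.length) {
    array(i) /= sum
    i += 1
  }
}

// Position of entry (i, j) in the column-major packed upper triangle
// of a symmetric n x n matrix
def indexUpperTriangular(n: Int, i: Int, j: Int): Int = {
  require(i >= 0 && i < n && j >= 0 && j < n, s"Indices ($i, $j) out of range for n = $n")
  if (i <= j) j * (j + 1) / 2 + i else i * (i + 1) / 2 + j
}

val big = log1pExp(1000.0)                  // stays finite
val naive = math.log(1.0 + math.exp(1000.0)) // overflows to Infinity
val scores = Array(1000.0, 1000.0)
softmax(scores)                              // equal inputs give equal probabilities
```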

## Error Handling

The library signals errors with standard JVM exceptions:

- `IllegalArgumentException`: Invalid parameters or dimension mismatches
- `UnsupportedOperationException`: Operations not supported for specific vector/matrix types
- `IndexOutOfBoundsException`: Invalid indices
- `NoSuchElementException`: Attempting to update zero elements in sparse matrices

Operations validate their input dimensions and throw an exception with a descriptive message when a check fails.
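
A minimal sketch of how some of these failures surface in practice, using `scala.util.Try` to capture the exception instead of letting it propagate. The inputs are deliberately invalid examples chosen for illustration.

```scala
import scala.util.Try
import org.apache.spark.ml.linalg.Vectors

// Dimension mismatch: dot product of a 2-vector and a 3-vector
val mismatch = Try(Vectors.dense(1.0, 2.0).dot(Vectors.dense(1.0, 2.0, 3.0)))

// Invalid parameters: two indices but only one value
val badSparse = Try(Vectors.sparse(3, Array(0, 1), Array(1.0)))

// Invalid index: reading past the end of a 2-element vector
val v = Vectors.dense(1.0, 2.0)
val outOfRange = Try(v(5))

// Each failure carries the exception, whose message describes the problem
mismatch.failed.foreach(e => println(s"${e.getClass.getSimpleName}: ${e.getMessage}"))
```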