# Linear Algebra Operations

Core vector and matrix operations: creation, manipulation, arithmetic, and format conversion. The library provides both dense and sparse implementations, and its `compressed` methods select whichever representation needs less storage.

## Capabilities

### Vector Creation

Factory methods for creating vectors in dense and sparse formats with various initialization patterns.

```scala { .api }
import java.lang.{Double => JavaDouble, Integer => JavaInteger, Iterable => JavaIterable}

object Vectors {
  @varargs
  def dense(firstValue: Double, otherValues: Double*): Vector
  def dense(values: Array[Double]): Vector
  def sparse(size: Int, indices: Array[Int], values: Array[Double]): Vector
  def sparse(size: Int, elements: Seq[(Int, Double)]): Vector
  def sparse(size: Int, elements: JavaIterable[(JavaInteger, JavaDouble)]): Vector
  def zeros(size: Int): Vector
  def norm(vector: Vector, p: Double): Double
  def sqdist(v1: Vector, v2: Vector): Double
}
```

**Usage examples:**

```scala
import org.apache.spark.ml.linalg._

// Dense vector creation
val dense1 = Vectors.dense(1.0, 2.0, 3.0, 4.0)
val dense2 = Vectors.dense(Array(1.0, 2.0, 3.0, 4.0))

// Sparse vector creation
val sparse1 = Vectors.sparse(4, Array(0, 2), Array(1.0, 3.0))
val sparse2 = Vectors.sparse(4, Seq((0, 1.0), (2, 3.0)))

// Zero vector
val zeros = Vectors.zeros(4)
```

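To make the relationship between the two formats concrete, the sketch below builds the same logical vector through each factory method and compares what is stored. It assumes the `spark-mllib-local` artifact is on the classpath; the object name `VectorCreationSketch` is made up for illustration.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}

// Hypothetical wrapper object, used only for this illustration.
object VectorCreationSketch {
  // The same logical vector [1.0, 0.0, 3.0, 0.0], built three ways.
  val dense: Vector = Vectors.dense(1.0, 0.0, 3.0, 0.0)
  val fromArrays: Vector = Vectors.sparse(4, Array(0, 2), Array(1.0, 3.0))
  val fromPairs: Vector = Vectors.sparse(4, Seq((0, 1.0), (2, 3.0)))

  def main(args: Array[String]): Unit = {
    // Expanding a sparse vector fills in the implicit zeros.
    println(fromArrays.toArray.mkString("[", ", ", "]")) // [1.0, 0.0, 3.0, 0.0]

    // A dense vector stores every entry; a sparse one only the listed entries.
    println(s"dense stores ${dense.numActives} values, sparse stores ${fromArrays.numActives}")

    // Both sparse factories describe the same entries.
    println(fromArrays.toArray.sameElements(fromPairs.toArray)) // true
  }
}
```

The index/value arrays and the sequence of pairs are two spellings of the same input, so the choice between them is mostly a matter of what shape the data already has.
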
### Vector Operations

Mathematical operations on vectors including norms, distances, and element access.

```scala { .api }
// Vector trait methods
trait Vector {
  def size: Int
  def toArray: Array[Double]
  def apply(i: Int): Double
  def copy: Vector
  def foreachActive(f: (Int, Double) => Unit): Unit
  def numActives: Int
  def numNonzeros: Int
  def toSparse: SparseVector
  def toDense: DenseVector
  def compressed: Vector
  def argmax: Int
  def dot(v: Vector): Double
}

// Static utility methods
object Vectors {
  def norm(vector: Vector, p: Double): Double
  def sqdist(v1: Vector, v2: Vector): Double
}
```

**Usage examples:**

```scala
import org.apache.spark.ml.linalg._

val v1 = Vectors.dense(1.0, 2.0, 3.0)
val v2 = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))

// Element access
val element = v1(1) // Returns 2.0

// Vector properties
val size = v1.size
val nnz = v1.numNonzeros
val actives = v1.numActives

// Vector operations
val l2Norm = Vectors.norm(v1, 2.0)
val l1Norm = Vectors.norm(v1, 1.0)
val squaredDistance = Vectors.sqdist(v1, v2) // Squared Euclidean distance (no square root)
val dotProduct = v1.dot(v2)                  // 1*1 + 2*0 + 3*3 = 10.0

// Format conversions
val dense = v2.toDense
val sparse = v1.toSparse
val compressed = v1.compressed

// Find maximum element index
val maxIdx = v1.argmax // Returns 2
```

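As a worked check of what `norm` and `sqdist` compute, the sketch below compares the library calls against the formulas written out over plain arrays. The helpers `pNormRef`, `sqdistRef`, and `close` are hypothetical names introduced here, not part of the library.

```scala
import org.apache.spark.ml.linalg.Vectors

// Hypothetical helper object; the reference functions are not library code.
object VectorMathSketch {
  val a = Vectors.dense(1.0, 2.0, 3.0)
  val b = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0)) // [1.0, 0.0, 3.0]

  // p-norm written out from its definition: (sum |x_i|^p)^(1/p)
  def pNormRef(xs: Array[Double], p: Double): Double =
    math.pow(xs.map(x => math.pow(math.abs(x), p)).sum, 1.0 / p)

  // Squared Euclidean distance: sum (x_i - y_i)^2, with no square root taken
  def sqdistRef(xs: Array[Double], ys: Array[Double]): Double =
    xs.zip(ys).map { case (x, y) => (x - y) * (x - y) }.sum

  // Floating-point comparison with a small tolerance
  def close(x: Double, y: Double): Boolean = math.abs(x - y) < 1e-9

  def main(args: Array[String]): Unit = {
    // L1 norm: 1 + 2 + 3 = 6
    println(close(Vectors.norm(a, 1.0), pNormRef(a.toArray, 1.0)))
    // L2 norm: sqrt(1 + 4 + 9) = sqrt(14)
    println(close(Vectors.norm(a, 2.0), pNormRef(a.toArray, 2.0)))
    // Only the middle entry differs: (2 - 0)^2 = 4
    println(close(Vectors.sqdist(a, b), sqdistRef(a.toArray, b.toArray)))
  }
}
```

Note that `sqdist` mixes a dense and a sparse argument here without any explicit conversion.
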
### Matrix Creation

Factory methods for creating matrices in dense and sparse formats with support for various initialization patterns and random generation.

```scala { .api }
import java.util.Random

object Matrices {
  def dense(numRows: Int, numCols: Int, values: Array[Double]): Matrix
  def sparse(numRows: Int, numCols: Int, colPtrs: Array[Int], rowIndices: Array[Int], values: Array[Double]): Matrix
  def zeros(numRows: Int, numCols: Int): Matrix
  def ones(numRows: Int, numCols: Int): Matrix
  def eye(n: Int): Matrix
  def speye(n: Int): Matrix
  def rand(numRows: Int, numCols: Int, rng: Random): Matrix
  def sprand(numRows: Int, numCols: Int, density: Double, rng: Random): Matrix
  def randn(numRows: Int, numCols: Int, rng: Random): Matrix
  def sprandn(numRows: Int, numCols: Int, density: Double, rng: Random): Matrix
  def diag(vector: Vector): Matrix
  def horzcat(matrices: Array[Matrix]): Matrix
  def vertcat(matrices: Array[Matrix]): Matrix
}

object DenseMatrix {
  def zeros(numRows: Int, numCols: Int): DenseMatrix
  def ones(numRows: Int, numCols: Int): DenseMatrix
  def eye(n: Int): DenseMatrix
  def rand(numRows: Int, numCols: Int, rng: Random): DenseMatrix
  def randn(numRows: Int, numCols: Int, rng: Random): DenseMatrix
  def diag(vector: Vector): DenseMatrix
}

object SparseMatrix {
  def fromCOO(numRows: Int, numCols: Int, entries: Iterable[(Int, Int, Double)]): SparseMatrix
  def speye(n: Int): SparseMatrix
  def sprand(numRows: Int, numCols: Int, density: Double, rng: Random): SparseMatrix
  def sprandn(numRows: Int, numCols: Int, density: Double, rng: Random): SparseMatrix
  def spdiag(vector: Vector): SparseMatrix
}
```

**Usage examples:**

```scala
import org.apache.spark.ml.linalg._
import java.util.Random

// Dense matrix creation (values are read in column-major order)
val dense = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
val zeros = Matrices.zeros(3, 3)
val ones = Matrices.ones(2, 2)
val identity = Matrices.eye(3)

// Sparse matrix creation (compressed sparse column arrays)
val sparse = Matrices.sparse(3, 3, Array(0, 1, 2, 3), Array(0, 1, 2), Array(1.0, 2.0, 3.0))
val sparseIdentity = Matrices.speye(3)

// Random matrices
val rng = new Random(42)
val randDense = Matrices.rand(3, 3, rng)
val randSparse = Matrices.sprand(3, 3, 0.3, rng)

// Diagonal matrices
val diagVec = Vectors.dense(1.0, 2.0, 3.0)
val diagMatrix = Matrices.diag(diagVec)

// Matrix concatenation
val m1 = Matrices.ones(2, 2)
val m2 = Matrices.zeros(2, 2)
val horizontal = Matrices.horzcat(Array(m1, m2)) // 2 x 4
val vertical = Matrices.vertcat(Array(m1, m2))   // 4 x 2

// COO format creation from (row, column, value) triples
val entries = Seq((0, 0, 1.0), (1, 1, 2.0), (2, 2, 3.0))
val coo = SparseMatrix.fromCOO(3, 3, entries)
```

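The array arguments to `Matrices.dense` and `Matrices.sparse` are easy to misread, so here is a small sketch of how they map onto matrix entries: dense values are laid out column by column, and the sparse arrays follow the compressed sparse column (CSC) scheme. The object name `MatrixLayoutSketch` is hypothetical.

```scala
import org.apache.spark.ml.linalg.{Matrices, Matrix}

// Hypothetical wrapper object, used only for this illustration.
object MatrixLayoutSketch {
  // Dense values are read column by column, so this is
  //   1.0  3.0  5.0
  //   2.0  4.0  6.0
  val dense: Matrix = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))

  // CSC form of
  //   1.0  0.0
  //   0.0  3.0
  //   2.0  0.0
  // colPtrs(j) until colPtrs(j + 1) is the slice of rowIndices/values for column j.
  val sparse: Matrix = Matrices.sparse(
    3, 2,
    Array(0, 2, 3),       // column 0 owns entries 0 and 1, column 1 owns entry 2
    Array(0, 2, 1),       // row index of each stored value
    Array(1.0, 2.0, 3.0)) // stored values, column by column

  def main(args: Array[String]): Unit = {
    println(dense(0, 1))  // 3.0
    println(sparse(2, 0)) // 2.0
    println(sparse(0, 1)) // 0.0, an entry that is not stored explicitly
    println(sparse.toArray.mkString(", ")) // 1.0, 0.0, 2.0, 0.0, 3.0, 0.0
  }
}
```
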
### Matrix Operations

Operations on matrices including element access, arithmetic operations, transposition, and format conversions.

```scala { .api }
trait Matrix {
  def numRows: Int
  def numCols: Int
  val isTransposed: Boolean
  def toArray: Array[Double]
  def colIter: Iterator[Vector]
  def rowIter: Iterator[Vector]
  def apply(i: Int, j: Int): Double
  def copy: Matrix
  def transpose: Matrix
  def multiply(y: DenseMatrix): DenseMatrix
  def multiply(y: DenseVector): DenseVector
  def multiply(y: Vector): DenseVector
  def toString(maxLines: Int, maxLineWidth: Int): String
  def foreachActive(f: (Int, Int, Double) => Unit): Unit
  def numNonzeros: Int
  def numActives: Int
  def toSparseColMajor: SparseMatrix
  def toSparseRowMajor: SparseMatrix
  def toSparse: SparseMatrix
  def toDense: DenseMatrix
  def toDenseRowMajor: DenseMatrix
  def toDenseColMajor: DenseMatrix
  def compressedColMajor: Matrix
  def compressedRowMajor: Matrix
  def compressed: Matrix
}
```

**Usage examples:**

```scala
import org.apache.spark.ml.linalg._

val matrix = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
val vector = Vectors.dense(1.0, 2.0, 3.0)

// Matrix properties
val rows = matrix.numRows
val cols = matrix.numCols
val isTransposed = matrix.isTransposed

// Element access (row 0, column 1; values are stored column-major)
val element = matrix(0, 1) // Returns 3.0

// Matrix operations
val transposed = matrix.transpose
val copy = matrix.copy

// Matrix-vector multiplication: (2 x 3) * (3) gives a vector of size 2
val result1 = matrix.multiply(vector)

// Matrix-matrix multiplication: (2 x 3) * (3 x 2) gives a 2 x 2 matrix
val other = DenseMatrix.ones(3, 2)
val result2 = matrix.multiply(other)

// Format conversions
val sparse = matrix.toSparse
val denseColMajor = matrix.toDenseColMajor
val compressed = matrix.compressed

// Iteration over columns/rows
matrix.colIter.foreach { col =>
  println(s"Column: ${col.toArray.mkString(", ")}")
}

// Apply function to active elements
matrix.foreachActive { (i, j, value) =>
  println(s"matrix($i, $j) = $value")
}

// Statistics
val nnz = matrix.numNonzeros
val actives = matrix.numActives
```

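The sketch below looks at what `transpose` returns for a `DenseMatrix`. It relies on the behavior, as far as I know it from the Spark source, that the transposed matrix reuses the original `values` array and only flips `isTransposed`; treat the reference-equality check as an assumption to verify against your Spark version. `TransposeSketch` is a hypothetical name.

```scala
import org.apache.spark.ml.linalg.{DenseMatrix, Vectors}

// Hypothetical wrapper object, used only for this illustration.
object TransposeSketch {
  // 2 x 3, column-major:
  //   1.0  3.0  5.0
  //   2.0  4.0  6.0
  val m = new DenseMatrix(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
  val t: DenseMatrix = m.transpose

  def main(args: Array[String]): Unit = {
    // Dimensions swap and the layout flag flips.
    println(s"${t.numRows} x ${t.numCols}, isTransposed = ${t.isTransposed}")
    // Assumed: the backing array is shared rather than copied.
    println(t.values eq m.values)
    // Entry (i, j) of the transpose is entry (j, i) of the original.
    println(t(1, 0) == m(0, 1))
    // Multiplying the 3 x 2 transpose by [1, 1] sums each column of m.
    println(t.multiply(Vectors.dense(1.0, 1.0)).toArray.mkString(", "))
  }
}
```

Because the array is shared, any code that writes into `values` directly would be visible through both matrices.
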
### Vector Types

Dense and sparse vector implementations with different storage characteristics.

```scala { .api }
class DenseVector(val values: Array[Double]) extends Vector {
  // Inherits all Vector methods
  // Direct array access for maximum performance
}

class SparseVector(
  override val size: Int,
  val indices: Array[Int],
  val values: Array[Double]
) extends Vector {
  // Inherits all Vector methods
  // Compressed storage for sparse data
}
```

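Code that receives a plain `Vector` sometimes needs to know which implementation it holds. A minimal sketch of branching on the concrete type follows; `VectorTypesSketch`, `storedValues`, and `describe` are hypothetical names introduced for the example.

```scala
import org.apache.spark.ml.linalg.{DenseVector, SparseVector, Vector, Vectors}

// Hypothetical helper object, used only for this illustration.
object VectorTypesSketch {
  // Number of Doubles physically held by the vector
  def storedValues(v: Vector): Int = v match {
    case dv: DenseVector  => dv.values.length
    case sv: SparseVector => sv.values.length
  }

  // Short human-readable summary of the storage format
  def describe(v: Vector): String = v match {
    case dv: DenseVector  => s"dense(${dv.values.mkString(", ")})"
    case sv: SparseVector => s"sparse(size=${sv.size}, indices=${sv.indices.mkString(", ")})"
  }

  def main(args: Array[String]): Unit = {
    val d = Vectors.dense(1.0, 0.0, 3.0, 0.0)
    val s = Vectors.sparse(4, Array(0, 2), Array(1.0, 3.0))
    println(s"${describe(d)} stores ${storedValues(d)} values")
    println(s"${describe(s)} stores ${storedValues(s)} values")
  }
}
```
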
### Matrix Types

Dense and sparse matrix implementations supporting different storage layouts.

```scala { .api }
class DenseMatrix(
  val numRows: Int,
  val numCols: Int,
  val values: Array[Double],
  override val isTransposed: Boolean
) extends Matrix {
  // Convenience constructor: column-major layout (isTransposed = false)
  def this(numRows: Int, numCols: Int, values: Array[Double]) =
    this(numRows, numCols, values, false)
}

class SparseMatrix(
  val numRows: Int,
  val numCols: Int,
  val colPtrs: Array[Int],
  val rowIndices: Array[Int],
  val values: Array[Double],
  override val isTransposed: Boolean
) extends Matrix {
  // Convenience constructor: compressed sparse column layout (isTransposed = false)
  def this(numRows: Int, numCols: Int, colPtrs: Array[Int],
           rowIndices: Array[Int], values: Array[Double]) =
    this(numRows, numCols, colPtrs, rowIndices, values, false)
}
```

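The `isTransposed` constructor argument decides how the `values` array is interpreted. The following sketch feeds the same array to both layouts of `DenseMatrix`; the object name `MatrixTypesSketch` is hypothetical.

```scala
import org.apache.spark.ml.linalg.DenseMatrix

// Hypothetical wrapper object, used only for this illustration.
object MatrixTypesSketch {
  val data = Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)

  // Column-major: columns are (1, 2), (3, 4), (5, 6)
  val colMajor = new DenseMatrix(2, 3, data)
  // Row-major (isTransposed = true): rows are (1, 2, 3) and (4, 5, 6)
  val rowMajor = new DenseMatrix(2, 3, data, true)

  def main(args: Array[String]): Unit = {
    println(colMajor(0, 1)) // 3.0
    println(rowMajor(0, 1)) // 2.0
    // toDenseColMajor rearranges the values into column-major order.
    println(rowMajor.toDenseColMajor.values.mkString(", ")) // 1.0, 4.0, 2.0, 5.0, 3.0, 6.0
  }
}
```

Both matrices are 2 x 3; only the mapping from array position to (row, column) differs.
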
### BLAS Operations

Basic Linear Algebra Subprograms providing optimized implementations of fundamental linear algebra operations. Most routines return `Unit` and write their result into the last argument. In the Spark source this object is declared `private[spark]`, so the routines below are callable only from code inside the `org.apache.spark` package.

```scala { .api }
object BLAS {
  // Level 1: Vector-Vector operations
  def axpy(a: Double, x: Vector, y: Vector): Unit
  def dot(x: Vector, y: Vector): Double
  def copy(x: Vector, y: Vector): Unit
  def scal(a: Double, x: Vector): Unit

  // Level 2: Matrix-Vector operations
  def gemv(alpha: Double, A: Matrix, x: Vector, beta: Double, y: DenseVector): Unit
  def gemv(alpha: Double, A: Matrix, x: Array[Double], beta: Double, y: Array[Double]): Unit
  def spr(alpha: Double, v: Vector, U: DenseVector): Unit
  def spr(alpha: Double, v: Vector, U: Array[Double]): Unit
  def syr(alpha: Double, x: Vector, A: DenseMatrix): Unit
  def dspmv(n: Int, alpha: Double, A: DenseVector, x: DenseVector, beta: Double, y: DenseVector): Unit

  // Level 3: Matrix-Matrix operations
  def gemm(alpha: Double, A: Matrix, B: DenseMatrix, beta: Double, C: DenseMatrix): Unit
  def gemm(alpha: Double, A: Matrix, B: DenseMatrix, beta: Double, CValues: Array[Double]): Unit
}
```

**Usage examples:**

```scala
import org.apache.spark.ml.linalg._

val x = Vectors.dense(1.0, 2.0, 3.0)
val y = Vectors.dense(4.0, 5.0, 6.0).toDense // DenseVector, updated in place below

// Level 1 operations
val dotProduct = BLAS.dot(x, y) // x · y = 1*4 + 2*5 + 3*6 = 32.0
BLAS.scal(2.0, y)               // y = 2.0 * y, now [8.0, 10.0, 12.0]
BLAS.axpy(1.5, x, y)            // y = 1.5 * x + y, now [9.5, 13.0, 16.5]

// Level 2 operations
val A = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
val result = Vectors.zeros(2).toDense
BLAS.gemv(1.0, A, x, 0.0, result) // result = A * x = [22.0, 28.0]

// Level 3 operations
val B = new DenseMatrix(3, 2, Array(1.0, 0.0, 0.0, 1.0, 1.0, 0.0))
val C = DenseMatrix.zeros(2, 2)
BLAS.gemm(1.0, A, B, 0.0, C) // C = A * B
```

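Since these routines write into an output argument, it can help to see the update rules spelled out. The sketch below is a plain-Scala reference over arrays with no Spark dependency; it is not Spark's implementation, and the names `dotRef`, `scalRef`, `axpyRef`, and `gemvRef` are hypothetical.

```scala
// Reference semantics only; real BLAS backends are far more optimized.
object BlasSemanticsSketch {
  // Returns sum of x(i) * y(i)
  def dotRef(x: Array[Double], y: Array[Double]): Double = {
    require(x.length == y.length, "x and y must have the same length")
    x.indices.map(i => x(i) * y(i)).sum
  }

  // x := a * x, in place
  def scalRef(a: Double, x: Array[Double]): Unit =
    x.indices.foreach(i => x(i) *= a)

  // y := a * x + y, in place
  def axpyRef(a: Double, x: Array[Double], y: Array[Double]): Unit = {
    require(x.length == y.length, "x and y must have the same length")
    x.indices.foreach(i => y(i) += a * x(i))
  }

  // y := alpha * A * x + beta * y, with A given as a column-major numRows x numCols array
  def gemvRef(alpha: Double, numRows: Int, numCols: Int, a: Array[Double],
              x: Array[Double], beta: Double, y: Array[Double]): Unit = {
    require(a.length == numRows * numCols && x.length == numCols && y.length == numRows)
    for (i <- 0 until numRows) {
      val rowDotX = (0 until numCols).map(j => a(i + numRows * j) * x(j)).sum
      y(i) = alpha * rowDotX + beta * y(i)
    }
  }

  def main(args: Array[String]): Unit = {
    val x = Array(1.0, 2.0, 3.0)
    val y = Array(4.0, 5.0, 6.0)
    println(dotRef(x, y))    // 32.0
    scalRef(2.0, y)          // y is now 8.0, 10.0, 12.0
    axpyRef(1.5, x, y)
    println(y.mkString(", ")) // 9.5, 13.0, 16.5

    val out = Array(0.0, 0.0)
    gemvRef(1.0, 2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0), x, 0.0, out)
    println(out.mkString(", ")) // 22.0, 28.0
  }
}
```

The order of calls matters: each in-place update sees the result of the previous one, which is why `axpyRef` above starts from the already-scaled `y`.
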
## Design Notes

- **Automatic Format Selection**: `compressed` returns whichever of the dense or sparse representations needs less storage
- **Lazy Transposition**: Matrix transposition is performed lazily, flipping `isTransposed` without copying data
- **Breeze Integration**: Built on top of Breeze for optimized scientific computing operations
- **Memory Efficiency**: Sparse formats store only the active entries, which pays off when most values are zero
- **Thread Safety**: Vector and matrix types are not modified by their own public methods, so concurrent reads are safe; their backing arrays are mutable, however, and BLAS routines such as `axpy` and `scal` update them in place
- **BLAS Optimization**: BLAS operations leverage optimized native libraries (OpenBLAS, Intel MKL) when available
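
As an illustration of the format selection point above, this sketch calls `compressed` on one vector that is mostly zeros and one that has no zeros at all, and checks which concrete type comes back. The exact threshold is an internal detail that the sketch does not rely on; `CompressedSketch` is a hypothetical name.

```scala
import org.apache.spark.ml.linalg.{DenseVector, SparseVector, Vectors}

// Hypothetical wrapper object, used only for this illustration.
object CompressedSketch {
  // 100 entries with a single non-zero, deliberately stored densely
  val mostlyZero = Vectors.dense(Array.fill(100)(0.0).updated(7, 1.0))
  // 3 entries, all non-zero, deliberately stored sparsely
  val fullyPopulated = Vectors.sparse(3, Array(0, 1, 2), Array(1.0, 2.0, 3.0))

  def main(args: Array[String]): Unit = {
    println(mostlyZero.compressed.isInstanceOf[SparseVector])    // true
    println(fullyPopulated.compressed.isInstanceOf[DenseVector]) // true
    // The logical contents are unchanged by the conversion.
    println(mostlyZero.compressed.toArray.sameElements(mostlyZero.toArray)) // true
  }
}
```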