# Matrix Operations

Matrix data structures and operations for representing and manipulating 2D numerical data. Both dense and sparse representations are supported, in column-major or row-major layout (selected by the `isTransposed` flag), along with transposition, matrix-matrix and matrix-vector multiplication, and conversion between storage formats.

## Capabilities

### Matrix Trait

Core abstraction for all matrix types, providing common accessors, conversions, and linear algebra operations.

```scala { .api }
sealed trait Matrix extends Serializable {
  /** Number of rows */
  def numRows: Int

  /** Number of columns */
  def numCols: Int

  /** Whether the matrix is stored transposed (row-major instead of column-major) */
  val isTransposed: Boolean

  /** Converts to a dense array in column-major order */
  def toArray: Array[Double]

  /** Returns an iterator of column vectors */
  def colIter: Iterator[Vector]

  /** Returns an iterator of row vectors */
  def rowIter: Iterator[Vector]

  /** Gets the (i, j)-th element */
  def apply(i: Int, j: Int): Double

  /** Deep copy of the matrix */
  def copy: Matrix

  /** Transpose operation */
  def transpose: Matrix

  /** Matrix-matrix multiplication */
  def multiply(y: DenseMatrix): DenseMatrix

  /** Matrix-vector multiplication */
  def multiply(y: Vector): DenseVector

  /** Applies a function to all active (explicitly stored) elements */
  def foreachActive(f: (Int, Int, Double) => Unit): Unit

  /** Number of nonzero elements */
  def numNonzeros: Int

  /** Number of active (explicitly stored) elements */
  def numActives: Int

  /** Converts to a sparse matrix */
  def toSparse: SparseMatrix

  /** Converts to a dense matrix */
  def toDense: DenseMatrix

  /** Returns the matrix in whichever format (dense or sparse) uses less storage */
  def compressed: Matrix
}
```
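To make the column-major convention concrete, here is a minimal sketch of what a matrix-vector product such as `multiply(y: Vector)` computes when the matrix is held as a flat column-major array. It uses plain Scala arrays rather than the library types, and `MatVecSketch` is a hypothetical name chosen for illustration; it is not the library's implementation.

```scala
object MatVecSketch {
  /** Computes y = A * x, where A is numRows x numCols and `values` holds A in column-major order. */
  def multiply(numRows: Int, numCols: Int, values: Array[Double], x: Array[Double]): Array[Double] = {
    require(values.length == numRows * numCols, "values must hold numRows * numCols entries")
    require(x.length == numCols, "vector length must equal numCols")
    val y = new Array[Double](numRows)
    var j = 0
    while (j < numCols) {
      var i = 0
      while (i < numRows) {
        // Element (i, j) sits at offset i + numRows * j in column-major storage
        y(i) += values(i + numRows * j) * x(j)
        i += 1
      }
      j += 1
    }
    y
  }
}
```

Walking column by column keeps the reads of `values` sequential, which is one reason column-major storage pairs well with this loop order.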

### Dense Matrix

Dense matrix implementation. Values are stored in a single array, in column-major order by default or in row-major order when `isTransposed` is true.

```scala { .api }
class DenseMatrix(
  val numRows: Int,
  val numCols: Int,
  val values: Array[Double],
  override val isTransposed: Boolean
) extends Matrix {

  /** Alternative constructor without transpose flag */
  def this(numRows: Int, numCols: Int, values: Array[Double])

  override def apply(i: Int, j: Int): Double
  override def copy: DenseMatrix
  override def transpose: DenseMatrix
}
```

Usage example:

```scala
import org.apache.spark.ml.linalg._

// Create a 3x2 dense matrix in column-major order
// Values: [1,3,5,2,4,6] represents:
// 1 2
// 3 4
// 5 6
val dense = new DenseMatrix(3, 2, Array(1.0, 3.0, 5.0, 2.0, 4.0, 6.0))

// Access elements
val element = dense(1, 0) // 3.0

// Matrix operations
val transposed = dense.transpose
val copied = dense.copy

// Matrix-vector multiplication (vector length must equal numCols)
val vector = Vectors.dense(1.0, 2.0)
val result = dense.multiply(vector) // [5.0, 11.0, 17.0]
```
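The element access `dense(1, 0)` above depends on how a `(row, column)` pair maps to a position in the flat `values` array. The following sketch shows that mapping for both layouts using plain Scala; `DenseIndexSketch` and `offset` are hypothetical names for illustration, and the formula is the standard one for column-major and row-major storage rather than a quotation of the library source.

```scala
object DenseIndexSketch {
  /** Offset of element (i, j) in the backing array of a numRows x numCols matrix. */
  def offset(i: Int, j: Int, numRows: Int, numCols: Int, isTransposed: Boolean): Int =
    if (!isTransposed) i + numRows * j // column-major: columns are contiguous
    else j + numCols * i               // row-major: rows are contiguous

  def main(args: Array[String]): Unit = {
    // Same backing array as the matrix built with new DenseMatrix(3, 2, ...)
    val values = Array(1.0, 3.0, 5.0, 2.0, 4.0, 6.0)
    println(values(offset(1, 0, 3, 2, isTransposed = false))) // 3.0
    println(values(offset(2, 1, 3, 2, isTransposed = false))) // 6.0
  }
}
```

Because only the index formula changes, a transpose can be represented by flipping the flag and swapping the dimensions without moving any data.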

### Sparse Matrix

Sparse matrix implementation using Compressed Sparse Column (CSC) format for memory-efficient storage of matrices with many zero elements.

```scala { .api }
class SparseMatrix(
  val numRows: Int,
  val numCols: Int,
  val colPtrs: Array[Int],
  val rowIndices: Array[Int],
  val values: Array[Double],
  override val isTransposed: Boolean
) extends Matrix {

  /** Alternative constructor without transpose flag */
  def this(numRows: Int, numCols: Int, colPtrs: Array[Int], rowIndices: Array[Int], values: Array[Double])

  override def apply(i: Int, j: Int): Double
  override def copy: SparseMatrix
  override def transpose: SparseMatrix
}
```

Usage example:

```scala
import org.apache.spark.ml.linalg._

// Create 3x3 sparse matrix:
// 1.0 0.0 4.0
// 0.0 3.0 5.0
// 2.0 0.0 6.0
val sparse = new SparseMatrix(
  numRows = 3,
  numCols = 3,
  colPtrs = Array(0, 2, 3, 6),                 // Column pointers
  rowIndices = Array(0, 2, 1, 0, 1, 2),        // Row indices
  values = Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0) // Non-zero values
)

// Access elements
val element = sparse(0, 2)     // 4.0
val zeroElement = sparse(0, 1) // 0.0

// Properties
val nnz = sparse.numNonzeros // Number of non-zeros
val colPtrs = sparse.colPtrs
val rowIndices = sparse.rowIndices
```
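To show how the three CSC arrays work together, here is a minimal lookup sketch in plain Scala. It assumes the row indices within each column are sorted in ascending order; `CscLookupSketch` and `get` are hypothetical names, and this is an illustration of the format rather than the library's own code.

```scala
object CscLookupSketch {
  /** Returns element (i, j) of a CSC matrix, or 0.0 if that position is not stored. */
  def get(i: Int, j: Int, colPtrs: Array[Int], rowIndices: Array[Int], values: Array[Double]): Double = {
    // Entries of column j occupy positions colPtrs(j) (inclusive) to colPtrs(j + 1) (exclusive)
    val start = colPtrs(j)
    val end = colPtrs(j + 1)
    // Search only that slice of rowIndices for row i (requires sorted row indices)
    val pos = java.util.Arrays.binarySearch(rowIndices, start, end, i)
    if (pos >= 0) values(pos) else 0.0
  }
}
```

For the 3x3 matrix above, column 2 occupies positions 3 to 5 of `rowIndices` and `values`, so looking up `(0, 2)` finds row 0 at position 3 and returns 4.0, while `(0, 1)` finds no stored entry and returns 0.0.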

### Dense Matrix Factory

Factory methods for creating and manipulating dense matrices.

```scala { .api }
object DenseMatrix {
  /** Creates zero matrix */
  def zeros(numRows: Int, numCols: Int): DenseMatrix

  /** Creates matrix of ones */
  def ones(numRows: Int, numCols: Int): DenseMatrix

  /** Creates identity matrix */
  def eye(n: Int): DenseMatrix

  /** Creates random uniform matrix */
  def rand(numRows: Int, numCols: Int, rng: java.util.Random): DenseMatrix

  /** Creates random Gaussian matrix */
  def randn(numRows: Int, numCols: Int, rng: java.util.Random): DenseMatrix

  /** Creates diagonal matrix from vector */
  def diag(vector: Vector): DenseMatrix
}
```

Usage examples:

```scala
import org.apache.spark.ml.linalg._
import java.util.Random

// Create special matrices
val zeros = DenseMatrix.zeros(3, 3)
val ones = DenseMatrix.ones(2, 4)
val identity = DenseMatrix.eye(3)

// Create random matrices
val rng = new Random(42)
val uniform = DenseMatrix.rand(3, 3, rng)
val gaussian = DenseMatrix.randn(3, 3, rng)

// Create diagonal matrix
val vector = Vectors.dense(1.0, 2.0, 3.0)
val diagonal = DenseMatrix.diag(vector)
```

### Sparse Matrix Factory

Factory methods for creating and manipulating sparse matrices with various initialization patterns.

```scala { .api }
object SparseMatrix {
  /** Creates sparse matrix from COO format */
  def fromCOO(numRows: Int, numCols: Int, entries: Iterable[(Int, Int, Double)]): SparseMatrix

  /** Creates sparse identity matrix */
  def speye(n: Int): SparseMatrix

  /** Creates random sparse uniform matrix */
  def sprand(numRows: Int, numCols: Int, density: Double, rng: java.util.Random): SparseMatrix

  /** Creates random sparse Gaussian matrix */
  def sprandn(numRows: Int, numCols: Int, density: Double, rng: java.util.Random): SparseMatrix

  /** Creates sparse diagonal matrix from vector */
  def spdiag(vector: Vector): SparseMatrix
}
```

Usage examples:

```scala
import org.apache.spark.ml.linalg._
import java.util.Random

// Create from coordinate (COO) format: (row, column, value) triples
val entries = Seq((0, 0, 1.0), (1, 1, 2.0), (2, 2, 3.0))
val fromCOO = SparseMatrix.fromCOO(3, 3, entries)

// Create sparse identity
val sparseIdentity = SparseMatrix.speye(4)

// Create random sparse matrices
val rng = new Random(42)
val sparseUniform = SparseMatrix.sprand(5, 5, 0.3, rng)   // 30% density
val sparseGaussian = SparseMatrix.sprandn(5, 5, 0.2, rng) // 20% density

// Create sparse diagonal
val vector = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))
val sparseDiag = SparseMatrix.spdiag(vector)
```
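The relationship between coordinate (COO) triples and the CSC arrays can be sketched in plain Scala as follows. This is an illustrative conversion under simplifying assumptions (every index is in range and no position appears twice); `CooToCscSketch` and `convert` are hypothetical names, and the library's `fromCOO` may handle such cases differently.

```scala
object CooToCscSketch {
  /** Converts (row, col, value) triples into CSC arrays: (colPtrs, rowIndices, values). */
  def convert(numCols: Int, entries: Seq[(Int, Int, Double)]): (Array[Int], Array[Int], Array[Double]) = {
    // Order by column first, then by row within each column
    val sorted = entries.sortBy { case (row, col, _) => (col, row) }
    val colPtrs = new Array[Int](numCols + 1)
    // Count the entries in each column...
    sorted.foreach { case (_, col, _) => colPtrs(col + 1) += 1 }
    // ...then turn the counts into running offsets
    for (j <- 1 to numCols) colPtrs(j) += colPtrs(j - 1)
    (colPtrs, sorted.map(_._1).toArray, sorted.map(_._3).toArray)
  }
}
```

After the prefix sum, `colPtrs(j)` is the position of the first stored entry of column `j`, and `colPtrs(numCols)` equals the total number of stored entries.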

### General Matrix Factory

General factory object providing unified matrix creation methods that return the Matrix trait.

```scala { .api }
object Matrices {
  /** Creates dense matrix */
  def dense(numRows: Int, numCols: Int, values: Array[Double]): Matrix

  /** Creates sparse matrix in CSC format */
  def sparse(numRows: Int, numCols: Int, colPtrs: Array[Int], rowIndices: Array[Int], values: Array[Double]): Matrix

  /** Creates zero matrix */
  def zeros(numRows: Int, numCols: Int): Matrix

  /** Creates matrix of ones */
  def ones(numRows: Int, numCols: Int): Matrix

  /** Creates dense identity matrix */
  def eye(n: Int): Matrix

  /** Creates sparse identity matrix */
  def speye(n: Int): Matrix

  /** Creates random uniform matrix */
  def rand(numRows: Int, numCols: Int, rng: java.util.Random): Matrix

  /** Creates random sparse uniform matrix */
  def sprand(numRows: Int, numCols: Int, density: Double, rng: java.util.Random): Matrix

  /** Creates diagonal matrix from vector */
  def diag(vector: Vector): Matrix

  /** Horizontally concatenates matrices */
  def horzcat(matrices: Array[Matrix]): Matrix

  /** Vertically concatenates matrices */
  def vertcat(matrices: Array[Matrix]): Matrix
}
```

Usage examples:

```scala
import org.apache.spark.ml.linalg._
import java.util.Random

// Create matrices using unified interface
val dense = Matrices.dense(2, 2, Array(1.0, 2.0, 3.0, 4.0))
val sparse = Matrices.sparse(2, 2, Array(0, 1, 2), Array(0, 1), Array(1.0, 4.0))

// Matrix concatenation
val m1 = Matrices.ones(2, 2)
val m2 = Matrices.zeros(2, 2)
val horizontal = Matrices.horzcat(Array(m1, m2)) // 2x4 matrix
val vertical = Matrices.vertcat(Array(m1, m2))   // 4x2 matrix
```
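For dense column-major data, the two concatenations behave quite differently at the storage level, which the following plain-Scala sketch illustrates. It works on flat arrays rather than `Matrix` objects; `ConcatSketch` and its methods are hypothetical names, and the sketch assumes all blocks are dense, column-major, and have compatible dimensions.

```scala
object ConcatSketch {
  /** Horizontal concatenation of column-major blocks that share the same number of rows. */
  def horzcat(blocks: Seq[Array[Double]]): Array[Double] =
    // Columns are contiguous, so the result is just the blocks laid end to end
    Array.concat(blocks: _*)

  /** Vertical concatenation of column-major blocks, each given as (numRows, values), sharing numCols. */
  def vertcat(numCols: Int, blocks: Seq[(Int, Array[Double])]): Array[Double] = {
    val out = Array.newBuilder[Double]
    // Each output column is the j-th column of every block, stacked top to bottom
    for (j <- 0 until numCols; (rows, values) <- blocks)
      out ++= values.slice(j * rows, (j + 1) * rows)
    out.result()
  }
}
```

With a 2x2 block of ones and a 2x2 block of zeros, `horzcat` yields `[1, 1, 1, 1, 0, 0, 0, 0]` (a 2x4 matrix) and `vertcat` yields `[1, 1, 0, 0, 1, 1, 0, 0]` (a 4x2 matrix), both in column-major order.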

### Matrix Analysis Operations

Operations for analyzing matrix properties and optimizing storage format.

#### Element Counting

Count active and non-zero elements in the matrix.

```scala { .api }
/** Number of nonzero elements */
def numNonzeros: Int

/** Number of active elements (explicitly stored) */
def numActives: Int
```

Usage examples:

```scala
import org.apache.spark.ml.linalg._

// Dense matrix analysis
val dense = Matrices.dense(3, 3, Array(1.0, 0.0, 3.0, 0.0, 5.0, 0.0, 7.0, 0.0, 9.0))
val denseNonzeros = dense.numNonzeros // 5 (counts non-zero values)
val denseActives = dense.numActives   // 9 (all elements stored)

// Sparse matrix analysis
val sparse = Matrices.sparse(3, 3, Array(0, 2, 3, 6), Array(0, 2, 1, 0, 1, 2), Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
val sparseNonzeros = sparse.numNonzeros // 6 (only non-zero values)
val sparseActives = sparse.numActives   // 6 (only stored values)
```
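The two counts only diverge when a stored value happens to be zero. As a rough sketch of the distinction, assuming both counts are taken over the array of stored values (the names `CountSketch`, `numActives`, and `numNonzeros` below are illustrative helpers on plain arrays, not the library methods):

```scala
object CountSketch {
  /** Every stored slot counts as active, whatever its value. */
  def numActives(stored: Array[Double]): Int = stored.length

  /** Only stored slots holding a value other than zero count as nonzero. */
  def numNonzeros(stored: Array[Double]): Int = stored.count(_ != 0.0)
}
```

A sparse matrix whose stored values are `[1.0, 0.0, 3.0]` would therefore have three active elements but only two nonzeros under this reading.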

#### Active Element Iteration

Iterate over all active (explicitly stored) elements in the matrix.

```scala { .api }
/** Apply function to all active elements */
def foreachActive(f: (Int, Int, Double) => Unit): Unit
```

Usage examples:

```scala
import org.apache.spark.ml.linalg._

val matrix = Matrices.sparse(3, 3, Array(0, 2, 3, 5), Array(0, 2, 1, 0, 2), Array(1.0, 2.0, 3.0, 4.0, 5.0))

// Print all active elements with their positions
matrix.foreachActive { (row, col, value) =>
  println(s"matrix($row, $col) = $value")
}

// Compute sum of all active elements
var sum = 0.0
matrix.foreachActive { (_, _, value) =>
  sum += value
}
println(s"Sum of active elements: $sum")

// Find maximum element and its position
var maxValue = Double.NegativeInfinity
var maxRow = -1
var maxCol = -1
matrix.foreachActive { (row, col, value) =>
  if (value > maxValue) {
    maxValue = value
    maxRow = row
    maxCol = col
  }
}
```
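For a CSC matrix, visiting the active elements amounts to walking the stored entries column by column. The sketch below shows that traversal over plain arrays; `CscTraversalSketch` and `foreachStored` are hypothetical names, and the visiting order shown is a property of this sketch, not a documented guarantee of `foreachActive`.

```scala
object CscTraversalSketch {
  /** Calls f(row, col, value) once for every stored entry of a CSC matrix. */
  def foreachStored(colPtrs: Array[Int], rowIndices: Array[Int], values: Array[Double])(
      f: (Int, Int, Double) => Unit): Unit = {
    var j = 0
    while (j < colPtrs.length - 1) {
      // Stored entries of column j live at positions colPtrs(j) until colPtrs(j + 1)
      var k = colPtrs(j)
      while (k < colPtrs(j + 1)) {
        f(rowIndices(k), j, values(k))
        k += 1
      }
      j += 1
    }
  }
}
```

Positions that are not stored are never visited, which is why a sum or maximum computed this way reflects only the active elements.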

#### Optimal Storage Format

Automatically choose the most memory-efficient storage format.

```scala { .api }
/** Optimal storage format based on sparsity */
def compressed: Matrix
```

Usage examples:

```scala
import org.apache.spark.ml.linalg._

// Sparse matrix with few elements stays sparse
// (built from COO triples; a CSC colPtrs array for 100 columns would need 101 entries)
val sparse = SparseMatrix.fromCOO(100, 100, Seq((0, 0, 1.0), (50, 1, 2.0)))
val compressedSparse = sparse.compressed // Remains SparseMatrix

// Dense matrix with many zeros converts to sparse (if beneficial)
val mostlyZeros = Matrices.dense(4, 4, Array(
  1.0, 0.0, 0.0, 0.0,
  0.0, 0.0, 0.0, 0.0,
  0.0, 0.0, 0.0, 0.0,
  0.0, 0.0, 0.0, 2.0
))
val compressedFromDense = mostlyZeros.compressed // May become SparseMatrix

// Dense matrix with many non-zeros stays dense
val mostlyNonZeros = Matrices.dense(3, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0))
val compressedDense = mostlyNonZeros.compressed // Remains DenseMatrix
```
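To get a feel for when each format is smaller, here is a back-of-the-envelope size comparison in plain Scala. It assumes 8 bytes per `Double` and 4 bytes per `Int` and ignores object and array headers; `StorageEstimateSketch` and its methods are hypothetical, and the exact rule used by `compressed` may differ.

```scala
object StorageEstimateSketch {
  /** Dense storage: one double per matrix position. */
  def denseBytes(numRows: Int, numCols: Int): Long =
    8L * numRows * numCols

  /** CSC storage: a double and a row index per stored entry, plus numCols + 1 column pointers. */
  def cscBytes(numCols: Int, numStored: Int): Long =
    12L * numStored + 4L * (numCols + 1)

  /** True when the CSC estimate is smaller than the dense estimate. */
  def sparseIsSmaller(numRows: Int, numCols: Int, numStored: Int): Boolean =
    cscBytes(numCols, numStored) < denseBytes(numRows, numCols)
}
```

Under these assumptions, a 4x4 matrix with 2 stored entries needs about 44 bytes in CSC versus 128 bytes dense, while a fully populated 3x3 matrix needs about 124 bytes in CSC versus 72 bytes dense.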

## Type Definitions

```scala { .api }
sealed trait Matrix extends Serializable

class DenseMatrix(
  val numRows: Int,
  val numCols: Int,
  val values: Array[Double],
  override val isTransposed: Boolean = false
) extends Matrix

class SparseMatrix(
  val numRows: Int,
  val numCols: Int,
  val colPtrs: Array[Int],
  val rowIndices: Array[Int],
  val values: Array[Double],
  override val isTransposed: Boolean = false
) extends Matrix
```