# Matrix Operations

Local matrices come in dense and sparse formats and support multiplication, transposition, and format conversion. Both column-major and row-major layouts are available, and the `compressed` methods select whichever format uses less storage.

## Capabilities

### Matrix Trait

Base trait for all matrix types, providing common operations and conversions.

```scala { .api }
/**
 * Sealed trait for local matrices
 */
sealed trait Matrix extends Serializable {
  /** Number of rows */
  def numRows: Int

  /** Number of columns */
  def numCols: Int

  /** Whether the matrix is transposed */
  val isTransposed: Boolean = false

  /** Gets the (i, j)-th element */
  def apply(i: Int, j: Int): Double

  /** Get a deep copy of the matrix */
  def copy: Matrix

  /** Transpose the matrix (returns new instance sharing data) */
  def transpose: Matrix

  /** Matrix-matrix multiplication */
  def multiply(y: DenseMatrix): DenseMatrix

  /** Matrix-vector multiplication */
  def multiply(y: Vector): DenseVector

  /** Converts to dense array in column major order */
  def toArray: Array[Double]

  /** Returns iterator of column vectors */
  def colIter: Iterator[Vector]

  /** Returns iterator of row vectors */
  def rowIter: Iterator[Vector]

  /** Apply function to all active (stored) elements */
  def foreachActive(f: (Int, Int, Double) => Unit): Unit

  /** Number of non-zero values */
  def numNonzeros: Int

  /** Number of explicitly stored values */
  def numActives: Int

  /** Convert to sparse matrix */
  def toSparse: SparseMatrix

  /** Convert to dense matrix */
  def toDense: DenseMatrix

  /** Returns matrix in optimal format using less storage */
  def compressed: Matrix
}
```

**Usage Examples:**

```scala
import org.apache.spark.ml.linalg.{Matrices, DenseMatrix, Vector, Vectors}

// Matrix creation and basic operations
val matrix = Matrices.dense(3, 2, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
// Represents: [[1.0, 4.0],
//              [2.0, 5.0],
//              [3.0, 6.0]]

println(matrix.numRows) // 3
println(matrix.numCols) // 2
println(matrix(1, 0))   // 2.0

// Matrix operations
val transposed = matrix.transpose
val vector = Vectors.dense(1.0, 2.0)
val matVecResult = matrix.multiply(vector) // Matrix-vector multiplication

// Conversions
val dense = matrix.toDense
val sparse = matrix.toSparse
val compressed = matrix.compressed // Chooses optimal format
```
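
To make concrete what `multiply(y: Vector)` computes, here is a minimal plain-Scala sketch of a matrix-vector product over a column-major array. `MatVecSketch` and its `multiply` helper are hypothetical names used only for illustration; this is not how the library implements the operation.

```scala
object MatVecSketch {
  /** Computes y = A * x, where A is numRows x numCols and `values` is column-major. */
  def multiply(numRows: Int, numCols: Int, values: Array[Double], x: Array[Double]): Array[Double] = {
    require(values.length == numRows * numCols, "values must hold numRows * numCols entries")
    require(x.length == numCols, s"expected a vector of length $numCols, got ${x.length}")
    val y = new Array[Double](numRows)
    var j = 0
    while (j < numCols) {
      var i = 0
      while (i < numRows) {
        // Element (i, j) lives at offset i + numRows * j in column-major order
        y(i) += values(i + numRows * j) * x(j)
        i += 1
      }
      j += 1
    }
    y
  }
}
```

Applied to the 3x2 matrix from the example and the vector `(1.0, 2.0)`, the sketch yields `(9.0, 12.0, 15.0)`: each result entry is the dot product of one matrix row with the vector.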

### Dense Matrix

Column-major dense matrix storing all elements explicitly in a single array.

```scala { .api }
/**
 * Column-major dense matrix
 * @param numRows number of rows
 * @param numCols number of columns
 * @param values matrix entries in column major order
 * @param isTransposed whether the matrix is transposed
 */
class DenseMatrix(
    val numRows: Int,
    val numCols: Int,
    val values: Array[Double],
    override val isTransposed: Boolean) extends Matrix {

  /** Column-major constructor (isTransposed = false) */
  def this(numRows: Int, numCols: Int, values: Array[Double]) =
    this(numRows, numCols, values, false)

  override def apply(i: Int, j: Int): Double = values(index(i, j))
  override def copy: DenseMatrix = new DenseMatrix(numRows, numCols, values.clone())
  override def transpose: DenseMatrix = new DenseMatrix(numCols, numRows, values, !isTransposed)
  override def numActives: Int = values.length
  override def numNonzeros: Int = values.count(_ != 0)
}

object DenseMatrix {
  /** Generate matrix of zeros */
  def zeros(numRows: Int, numCols: Int): DenseMatrix

  /** Generate matrix of ones */
  def ones(numRows: Int, numCols: Int): DenseMatrix

  /** Generate identity matrix */
  def eye(n: Int): DenseMatrix

  /** Generate random matrix with uniform values U(0,1) */
  def rand(numRows: Int, numCols: Int, rng: java.util.Random): DenseMatrix

  /** Generate random matrix with Gaussian values N(0,1) */
  def randn(numRows: Int, numCols: Int, rng: java.util.Random): DenseMatrix

  /** Generate diagonal matrix from vector */
  def diag(vector: Vector): DenseMatrix
}
```

**Usage Examples:**

```scala
import org.apache.spark.ml.linalg.{DenseMatrix, Vectors}
import java.util.Random

// Matrix creation
val matrix = new DenseMatrix(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
// Layout: [[1.0, 3.0, 5.0],
//          [2.0, 4.0, 6.0]]

// Factory methods
val zeros = DenseMatrix.zeros(3, 3)
val ones = DenseMatrix.ones(2, 4)
val identity = DenseMatrix.eye(5)
val random = DenseMatrix.rand(3, 3, new Random(42))
val gaussian = DenseMatrix.randn(2, 2, new Random(42))

// Diagonal matrix from vector
val diag = DenseMatrix.diag(Vectors.dense(1.0, 2.0, 3.0))
// Results in: [[1.0, 0.0, 0.0],
//              [0.0, 2.0, 0.0],
//              [0.0, 0.0, 3.0]]

// Transpose
val transposed = matrix.transpose
println(transposed.numRows) // 3
println(transposed.numCols) // 2
```
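
The `transpose` signature above reuses the same `values` array and only flips `isTransposed`. The following plain-Scala sketch shows one way such a flag can drive element lookup. `TransposeViewSketch.View` is a hypothetical stand-in, and the index arithmetic is an assumption about how a transposed view could work, not a copy of the library source.

```scala
object TransposeViewSketch {
  /** A dense matrix view over a shared array; `isTransposed` selects the index formula. */
  final case class View(numRows: Int, numCols: Int, values: Array[Double], isTransposed: Boolean) {
    def apply(i: Int, j: Int): Double = {
      require(i >= 0 && i < numRows && j >= 0 && j < numCols, s"index ($i, $j) out of range")
      // Column-major when not transposed, row-major when transposed
      if (!isTransposed) values(i + numRows * j) else values(j + numCols * i)
    }

    /** Swaps the dimensions and flips the flag; no data is copied. */
    def transpose: View = View(numCols, numRows, values, !isTransposed)
  }
}
```

With the 2x3 example matrix, `view.transpose(1, 0)` reads the same array slot as `view(0, 1)`, so transposition costs no additional storage for the entries.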

### Sparse Matrix

Column-major sparse matrix in Compressed Sparse Column (CSC) format.

```scala { .api }
/**
 * Column-major sparse matrix in CSC format
 * @param numRows number of rows
 * @param numCols number of columns
 * @param colPtrs column pointers (length numCols + 1)
 * @param rowIndices row indices of non-zero elements
 * @param values non-zero values
 * @param isTransposed whether matrix is transposed
 */
class SparseMatrix(
    val numRows: Int,
    val numCols: Int,
    val colPtrs: Array[Int],
    val rowIndices: Array[Int],
    val values: Array[Double],
    override val isTransposed: Boolean) extends Matrix {

  /** Column-major constructor (isTransposed = false) */
  def this(numRows: Int, numCols: Int, colPtrs: Array[Int], rowIndices: Array[Int], values: Array[Double]) =
    this(numRows, numCols, colPtrs, rowIndices, values, false)

  override def apply(i: Int, j: Int): Double // Binary search in CSC structure
  override def copy: SparseMatrix = new SparseMatrix(numRows, numCols, colPtrs, rowIndices, values.clone())
  override def transpose: SparseMatrix = new SparseMatrix(numCols, numRows, colPtrs, rowIndices, values, !isTransposed)
  override def numActives: Int = values.length
  override def numNonzeros: Int = values.count(_ != 0)
}

object SparseMatrix {
  /** Create from Coordinate List (COO) format */
  def fromCOO(numRows: Int, numCols: Int, entries: Iterable[(Int, Int, Double)]): SparseMatrix

  /** Generate sparse identity matrix */
  def speye(n: Int): SparseMatrix

  /** Generate sparse random matrix with uniform values */
  def sprand(numRows: Int, numCols: Int, density: Double, rng: java.util.Random): SparseMatrix

  /** Generate sparse random matrix with Gaussian values */
  def sprandn(numRows: Int, numCols: Int, density: Double, rng: java.util.Random): SparseMatrix

  /** Generate sparse diagonal matrix from vector */
  def spdiag(vector: Vector): SparseMatrix
}
```

**Usage Examples:**

```scala
import org.apache.spark.ml.linalg.{SparseMatrix, Vectors}
import java.util.Random

// CSC format for matrix:
// [[1.0, 0.0, 4.0],
//  [0.0, 3.0, 5.0],
//  [2.0, 0.0, 6.0]]
val sparse = new SparseMatrix(
  3, 3,                               // 3x3 matrix
  Array(0, 2, 3, 6),                  // colPtrs: col 0 starts at 0, col 1 at 2, col 2 at 3, end at 6
  Array(0, 2, 1, 0, 1, 2),            // rowIndices: which rows have values
  Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0) // values: the actual non-zero values
)

// COO format creation (easier)
val entries = Seq(
  (0, 0, 1.0), (2, 0, 2.0),             // Column 0: (0,0)=1.0, (2,0)=2.0
  (1, 1, 3.0),                          // Column 1: (1,1)=3.0
  (0, 2, 4.0), (1, 2, 5.0), (2, 2, 6.0) // Column 2: values
)
val sparseFromCOO = SparseMatrix.fromCOO(3, 3, entries)

// Factory methods
val identity = SparseMatrix.speye(5) // 5x5 sparse identity
val randomSparse = SparseMatrix.sprand(10, 10, 0.1, new Random(42)) // 10% density
val diagonalSparse = SparseMatrix.spdiag(Vectors.sparse(5, Array(0, 2, 4), Array(1.0, 2.0, 3.0)))

// Access elements
println(sparse(0, 0)) // 1.0
println(sparse(0, 1)) // 0.0 (not stored)
println(sparse(1, 1)) // 3.0
```
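
The API comment on `apply` mentions a binary search in the CSC structure. As a rough illustration of that idea, the sketch below looks up element `(i, j)` directly from the three CSC arrays. `CscLookupSketch.get` is a hypothetical helper, and it assumes the row indices within each column are sorted in ascending order.

```scala
object CscLookupSketch {
  /** Returns element (i, j) of a CSC matrix, or 0.0 if the entry is not stored. */
  def get(colPtrs: Array[Int], rowIndices: Array[Int], values: Array[Double], i: Int, j: Int): Double = {
    // Stored entries of column j occupy positions colPtrs(j) until colPtrs(j + 1)
    val pos = java.util.Arrays.binarySearch(rowIndices, colPtrs(j), colPtrs(j + 1), i)
    if (pos >= 0) values(pos) else 0.0
  }
}
```

For the 3x3 example above, looking up `(0, 1)` searches the single stored entry of column 1 (row 1), finds no match, and returns `0.0`.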

### Matrix Factory

Factory methods for creating matrices with various patterns and formats.

```scala { .api }
/**
 * Factory methods for Matrix creation
 */
object Matrices {
  /** Create column-major dense matrix */
  def dense(numRows: Int, numCols: Int, values: Array[Double]): Matrix

  /** Create sparse matrix in CSC format */
  def sparse(numRows: Int, numCols: Int, colPtrs: Array[Int], rowIndices: Array[Int], values: Array[Double]): Matrix

  /** Generate matrix of zeros (dense) */
  def zeros(numRows: Int, numCols: Int): Matrix

  /** Generate matrix of ones (dense) */
  def ones(numRows: Int, numCols: Int): Matrix

  /** Generate dense identity matrix */
  def eye(n: Int): Matrix

  /** Generate sparse identity matrix */
  def speye(n: Int): Matrix

  /** Generate dense random matrix with uniform values */
  def rand(numRows: Int, numCols: Int, rng: java.util.Random): Matrix

  /** Generate sparse random matrix with uniform values */
  def sprand(numRows: Int, numCols: Int, density: Double, rng: java.util.Random): Matrix

  /** Generate dense random matrix with Gaussian values */
  def randn(numRows: Int, numCols: Int, rng: java.util.Random): Matrix

  /** Generate sparse random matrix with Gaussian values */
  def sprandn(numRows: Int, numCols: Int, density: Double, rng: java.util.Random): Matrix

  /** Generate diagonal matrix from vector (dense) */
  def diag(vector: Vector): Matrix

  /** Horizontally concatenate matrices */
  def horzcat(matrices: Array[Matrix]): Matrix

  /** Vertically concatenate matrices */
  def vertcat(matrices: Array[Matrix]): Matrix
}
```

**Usage Examples:**

```scala
import org.apache.spark.ml.linalg.{Matrices, Vectors}
import java.util.Random

// Basic matrix creation
val dense = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
val sparse = Matrices.sparse(3, 3, Array(0, 1, 2, 3), Array(0, 1, 2), Array(1.0, 2.0, 3.0))

// Special matrices
val zeros = Matrices.zeros(3, 4)
val ones = Matrices.ones(2, 2)
val identity = Matrices.eye(5)
val sparseIdentity = Matrices.speye(5)

// Random matrices
val rng = new Random(42)
val randomDense = Matrices.rand(3, 3, rng)
val randomSparse = Matrices.sprand(5, 5, 0.2, rng) // 20% density
val gaussianDense = Matrices.randn(2, 3, rng)
val gaussianSparse = Matrices.sprandn(4, 4, 0.3, rng)

// Diagonal matrix
val diagonal = Matrices.diag(Vectors.dense(1.0, 2.0, 3.0, 4.0))

// Matrix concatenation
val m1 = Matrices.dense(2, 2, Array(1.0, 2.0, 3.0, 4.0))
val m2 = Matrices.dense(2, 2, Array(5.0, 6.0, 7.0, 8.0))
val horizontal = Matrices.horzcat(Array(m1, m2)) // 2x4 matrix
val vertical = Matrices.vertcat(Array(m1, m2))   // 4x2 matrix
```
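
To show what the two concatenations mean for column-major data, here is a small plain-Scala sketch over `(numRows, numCols, values)` tuples. `ConcatSketch` and its `Dense` alias are hypothetical; the sketch handles exactly two dense inputs and is not the library's implementation.

```scala
object ConcatSketch {
  /** (numRows, numCols, column-major values) */
  type Dense = (Int, Int, Array[Double])

  /** Side by side: in column-major order this is plain array concatenation. */
  def horzcat(a: Dense, b: Dense): Dense = {
    require(a._1 == b._1, "row counts must match")
    (a._1, a._2 + b._2, a._3 ++ b._3)
  }

  /** Stacked: each output column is a's column followed by b's column. */
  def vertcat(a: Dense, b: Dense): Dense = {
    require(a._2 == b._2, "column counts must match")
    val (aRows, bRows, cols) = (a._1, b._1, a._2)
    val out = new Array[Double]((aRows + bRows) * cols)
    for (j <- 0 until cols) {
      Array.copy(a._3, j * aRows, out, j * (aRows + bRows), aRows)
      Array.copy(b._3, j * bRows, out, j * (aRows + bRows) + aRows, bRows)
    }
    (aRows + bRows, cols, out)
  }
}
```

For the `m1` and `m2` values above, the horizontal result stores `1.0` through `8.0` in order, while the vertical result stores `1.0, 2.0, 5.0, 6.0, 3.0, 4.0, 7.0, 8.0` because columns are interleaved.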

### Matrix Operations

Additional `Matrix` methods for layout-aware format conversion, storage optimization, and bounded string output.

```scala { .api }
// Format conversions with layout control
def toSparseColMajor: SparseMatrix // Convert to sparse column-major
def toSparseRowMajor: SparseMatrix // Convert to sparse row-major
def toDenseColMajor: DenseMatrix   // Convert to dense column-major
def toDenseRowMajor: DenseMatrix   // Convert to dense row-major

// Automatic format optimization
def compressed: Matrix         // Optimal format based on sparsity
def compressedColMajor: Matrix // Optimal column-major format
def compressedRowMajor: Matrix // Optimal row-major format

// String representation with limits
def toString(maxLines: Int, maxLineWidth: Int): String
```

**Usage Examples:**

```scala
import org.apache.spark.ml.linalg.Matrices
import java.util.Random

val matrix = Matrices.rand(1000, 1000, new Random(42))

// Format conversions - choose based on usage patterns
val colMajorSparse = matrix.toSparseColMajor // Good for column operations
val rowMajorSparse = matrix.toSparseRowMajor // Good for row operations
val colMajorDense = matrix.toDenseColMajor   // Standard dense format
val rowMajorDense = matrix.toDenseRowMajor   // Transposed layout

// Automatic optimization based on sparsity
val optimal = matrix.compressed            // Chooses dense or sparse
val optimalCol = matrix.compressedColMajor // Best column-major format
val optimalRow = matrix.compressedRowMajor // Best row-major format

// Controlled string output for large matrices
val limitedString = matrix.toString(10, 100) // Max 10 lines, 100 chars width
```
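
The `compressed` methods pick a format by storage size. The sketch below gives a back-of-the-envelope version of that comparison. It assumes 8 bytes per `Double` and 4 bytes per `Int` and ignores array headers, so it is only an approximation; the library's exact rule may differ, and `StorageEstimateSketch` is a hypothetical name.

```scala
object StorageEstimateSketch {
  /** Dense storage: one Double per entry. */
  def denseBytes(numRows: Int, numCols: Int): Long = 8L * numRows * numCols

  /** CSC storage: a Double and an Int row index per stored value, plus numCols + 1 column pointers. */
  def sparseBytes(numCols: Int, nnz: Long): Long = 12L * nnz + 4L * (numCols + 1)

  /** True when the estimated sparse footprint is smaller than the dense one. */
  def preferSparse(numRows: Int, numCols: Int, nnz: Long): Boolean =
    sparseBytes(numCols, nnz) < denseBytes(numRows, numCols)
}
```

Under these assumptions, a 1000x1000 matrix with every entry stored is cheaper in dense form (about 8 MB versus about 12 MB), while the same shape with 10,000 stored values is far cheaper in sparse form.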

## Matrix Storage Formats

### Column-Major Layout (Default)

Elements are stored column by column in a single array:

```
Matrix: [[1, 3, 5],    Array: [1, 2, 3, 4, 5, 6]
         [2, 4, 6]]
```
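
The mapping from a row-by-row matrix to its column-major array can be sketched in a few lines of plain Scala. `ColumnMajorSketch.flatten` is a hypothetical helper for illustration: array position `k` holds element `(k % numRows, k / numRows)`.

```scala
object ColumnMajorSketch {
  /** Flattens a matrix given as a sequence of rows into a column-major array. */
  def flatten(rows: Seq[Seq[Double]]): Array[Double] = {
    val numRows = rows.length
    val numCols = if (numRows == 0) 0 else rows.head.length
    require(rows.forall(_.length == numCols), "all rows must have the same length")
    // Position k corresponds to row k % numRows and column k / numRows
    Array.tabulate(numRows * numCols) { k => rows(k % numRows)(k / numRows) }
  }
}
```

Flattening the two rows `(1, 3, 5)` and `(2, 4, 6)` gives `(1, 2, 3, 4, 5, 6)`, matching the layout shown above.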

### CSC (Compressed Sparse Column) Format

Efficient storage for sparse matrices:

- `values`: Non-zero values in column order
- `rowIndices`: Row index of each value
- `colPtrs`: Start position of each column in `values`, followed by a final entry equal to the number of stored values (length `numCols + 1`)

```scala
// Matrix: [[1, 0, 4],
//          [0, 2, 0],
//          [3, 0, 5]]
val values = Array(1.0, 3.0, 2.0, 4.0, 5.0) // Non-zero values
val rowIndices = Array(0, 2, 1, 0, 2)       // Their row positions
val colPtrs = Array(0, 2, 3, 5)             // Column start positions
```
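
As a rough illustration of how these three arrays relate to a list of `(row, column, value)` entries, the following plain-Scala sketch builds them from coordinate triples. `CooToCscSketch.build` is a hypothetical helper; it assumes there are no duplicate coordinates and is not the library's `fromCOO` implementation.

```scala
object CooToCscSketch {
  /** Builds (colPtrs, rowIndices, values) from (row, col, value) entries. */
  def build(numCols: Int, entries: Seq[(Int, Int, Double)]): (Array[Int], Array[Int], Array[Double]) = {
    // Order entries by column, then by row within each column
    val sorted = entries.sortBy { case (i, j, _) => (j, i) }
    val colPtrs = new Array[Int](numCols + 1)
    // Count the entries in each column, then turn the counts into running offsets
    sorted.foreach { case (_, j, _) => colPtrs(j + 1) += 1 }
    for (j <- 0 until numCols) colPtrs(j + 1) += colPtrs(j)
    (colPtrs, sorted.map(_._1).toArray, sorted.map(_._3).toArray)
  }
}
```

Feeding in the five non-zero entries of the matrix above reproduces `colPtrs = (0, 2, 3, 5)`, `rowIndices = (0, 2, 1, 0, 2)`, and `values = (1.0, 3.0, 2.0, 4.0, 5.0)`.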

## Error Handling

Matrix operations validate dimensions and indices and throw when a check fails:

```scala
import org.apache.spark.ml.linalg.{DenseMatrix, Matrices}

// Dimension validation: m1 is 2x3 and m2 is 2x2, so m1 * m2 is undefined
val m1 = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
val m2 = new DenseMatrix(2, 2, Array(1.0, 2.0, 3.0, 4.0)) // multiply takes a DenseMatrix
m1.multiply(m2) // IllegalArgumentException: dimension mismatch

// Index bounds checking
val matrix = Matrices.eye(3)
matrix(5, 0) // Throws: row index out of range (exception type may vary by Spark version)
```
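
One way to handle such failures without letting them propagate is to wrap the call in `scala.util.Try`. The plain-Scala sketch below mimics a dimension check with `require`, which throws `IllegalArgumentException` on failure. `DimensionCheckSketch` and both of its methods are hypothetical and stand in for a real `multiply` call.

```scala
import scala.util.{Failure, Success, Try}

object DimensionCheckSketch {
  /** Returns the result shape of A * B, or throws if the inner dimensions differ. */
  def checkMultiply(aRows: Int, aCols: Int, bRows: Int, bCols: Int): (Int, Int) = {
    require(aCols == bRows, s"dimension mismatch: A is ${aRows}x$aCols, B is ${bRows}x$bCols")
    (aRows, bCols)
  }

  /** Converts the outcome into a message instead of propagating the exception. */
  def describe(aRows: Int, aCols: Int, bRows: Int, bCols: Int): String =
    Try(checkMultiply(aRows, aCols, bRows, bCols)) match {
      case Success((r, c)) => s"ok: result is ${r}x$c"
      case Failure(e: IllegalArgumentException) => s"rejected: ${e.getMessage}"
      case Failure(e) => throw e // anything unexpected is rethrown
    }
}
```

The same pattern applies to a real matrix call: wrap `m1.multiply(m2)` in `Try` and match on the failure type you expect.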