# Linear Algebra Operations

Optimized Basic Linear Algebra Subprograms (BLAS) provide level-1, level-2, and level-3 operations with automatic native acceleration. The BLAS object itself is internal (`private[spark] object BLAS`), but its operations are accessible through public Vector and Matrix APIs.

**Note**: While the BLAS object is not directly accessible to users, its functionality is exposed through:
- Vector operations: `vector1.dot(vector2)` uses `BLAS.dot(vector1, vector2)`
- Matrix-vector multiplication: `matrix.multiply(vector)` uses `BLAS.gemv`
- Matrix-matrix multiplication: `matrix.multiply(matrix)` uses `BLAS.gemm`

The methods documented here show the underlying BLAS API signatures for reference.

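Since the BLAS object is internal, everyday code goes through the public entry points listed above. A minimal sketch of those public calls (the values are illustrative):

```scala
import org.apache.spark.ml.linalg.{Matrices, Vectors}

val v1 = Vectors.dense(1.0, 2.0, 3.0)
val v2 = Vectors.dense(4.0, 5.0, 6.0)

// Delegates to BLAS.dot internally
val d = v1.dot(v2) // 32.0

// Column-major storage: m = [[1.0, 3.0], [2.0, 4.0]]
val m = Matrices.dense(2, 2, Array(1.0, 2.0, 3.0, 4.0))

// Delegates to BLAS.gemv internally
val mv = m.multiply(Vectors.dense(1.0, 1.0).toDense) // [4.0, 6.0]
```
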
## Capabilities

### Level-1 BLAS (Vector-Vector Operations)

Operations between vectors with O(n) complexity.

```scala { .api }
object BLAS {
  /**
   * Vector addition: y += a * x
   * @param a Scalar multiplier
   * @param x Source vector
   * @param y Target vector (modified in-place, must be dense)
   */
  def axpy(a: Double, x: Vector, y: Vector): Unit

  /**
   * Dot product: x^T * y
   * @param x First vector
   * @param y Second vector (must have same size as x)
   * @return Dot product result
   */
  def dot(x: Vector, y: Vector): Double

  /**
   * Vector copy: y = x
   * @param x Source vector
   * @param y Target vector (modified in-place, must be dense)
   */
  def copy(x: Vector, y: Vector): Unit

  /**
   * Vector scaling: x *= a
   * @param a Scalar multiplier
   * @param x Vector to scale (modified in-place)
   */
  def scal(a: Double, x: Vector): Unit

  /**
   * Matrix addition: Y += a * X (for dense matrices)
   * @param a Scalar multiplier
   * @param X Source matrix
   * @param Y Target matrix (modified in-place)
   */
  private[spark] def axpy(a: Double, X: DenseMatrix, Y: DenseMatrix): Unit
}
```

**Usage Examples:**

```scala
import org.apache.spark.ml.linalg._

// BLAS is private[spark]; these calls compile only from code inside the
// org.apache.spark package and are shown to illustrate the semantics.
val x = Vectors.dense(1.0, 2.0, 3.0)
val y = Vectors.dense(4.0, 5.0, 6.0)
val target = Vectors.dense(0.0, 0.0, 0.0)

// Dot product
val dotProduct = BLAS.dot(x, y) // 32.0

// Vector addition: target += 2.0 * x
BLAS.axpy(2.0, x, target) // target becomes [2.0, 4.0, 6.0]

// Vector copy
BLAS.copy(y, target) // target becomes [4.0, 5.0, 6.0]

// Vector scaling
BLAS.scal(0.5, target) // target becomes [2.0, 2.5, 3.0]
```

### Level-2 BLAS (Matrix-Vector Operations)

Operations between matrices and vectors with O(n²) complexity.

```scala { .api }
object BLAS {
  /**
   * General matrix-vector multiplication: y := alpha * A * x + beta * y
   * @param alpha Scalar multiplier for A * x
   * @param A Matrix operand
   * @param x Vector operand (size must match A.numCols)
   * @param beta Scalar multiplier for y
   * @param y Result vector (modified in-place, size must match A.numRows)
   */
  def gemv(
      alpha: Double,
      A: Matrix,
      x: Vector,
      beta: Double,
      y: DenseVector
  ): Unit

  /**
   * General matrix-vector multiplication with array storage:
   * y[0 until A.numRows] := alpha * A * x[0 until A.numCols] + beta * y[0 until A.numRows]
   * @param alpha Scalar multiplier for A * x
   * @param A Matrix operand
   * @param x Vector values as array (size must be >= A.numCols)
   * @param beta Scalar multiplier for y
   * @param y Result array (modified in-place, size must be >= A.numRows)
   */
  def gemv(
      alpha: Double,
      A: Matrix,
      x: Array[Double],
      beta: Double,
      y: Array[Double]
  ): Unit

  /**
   * Symmetric matrix-vector multiplication: y := alpha * A * x + beta * y,
   * where A is symmetric and stored in packed format
   * @param n Order of matrix A
   * @param alpha Scalar multiplier for A * x
   * @param A Symmetric matrix in packed format (upper triangular)
   * @param x Vector operand
   * @param beta Scalar multiplier for y
   * @param y Result vector (modified in-place)
   */
  def dspmv(
      n: Int,
      alpha: Double,
      A: DenseVector,
      x: DenseVector,
      beta: Double,
      y: DenseVector
  ): Unit

  /**
   * Symmetric rank-1 update: A := alpha * x * x^T + A
   * @param alpha Scalar multiplier
   * @param x Vector operand
   * @param A Symmetric matrix (modified in-place)
   */
  def syr(alpha: Double, x: Vector, A: DenseMatrix): Unit

  /**
   * Symmetric packed rank-1 update: U += alpha * v * v^T,
   * where U is the upper triangular part stored in packed format
   * @param alpha Scalar multiplier
   * @param v Vector operand
   * @param U Upper triangular matrix in packed format (modified in-place)
   */
  def spr(alpha: Double, v: Vector, U: DenseVector): Unit
}
```

**Usage Examples:**

```scala
import org.apache.spark.ml.linalg._

val A = Matrices.dense(3, 3, Array(
  1.0, 0.0, 0.0, // column 1
  0.0, 2.0, 0.0, // column 2
  0.0, 0.0, 3.0  // column 3
))
val x = Vectors.dense(1.0, 2.0, 3.0)
val y = new DenseVector(Array(0.0, 0.0, 0.0))

// Matrix-vector multiplication: y = 1.0 * A * x + 0.0 * y
BLAS.gemv(1.0, A, x, 0.0, y) // y becomes [1.0, 4.0, 9.0]

// Symmetric rank-1 update: symmetric := 1.0 * x * x^T + symmetric
val symmetric = DenseMatrix.eye(3)
BLAS.syr(1.0, x, symmetric)
```

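The packed routines store the upper triangle column by column, so a symmetric n×n matrix occupies n * (n + 1) / 2 entries. A minimal sketch of `dspmv` under that layout (an assumption based on the signature above; the concrete values are illustrative):

```scala
import org.apache.spark.ml.linalg._

// Packed upper-triangular storage, column by column:
// entries (0,0), (0,1), (1,1), (0,2), (1,2), (2,2)
// This packed vector represents the symmetric matrix diag(1.0, 2.0, 3.0).
val packedA = new DenseVector(Array(1.0, 0.0, 2.0, 0.0, 0.0, 3.0))
val x = new DenseVector(Array(1.0, 1.0, 1.0))
val y = new DenseVector(Array(0.0, 0.0, 0.0))

// y := 1.0 * A * x + 0.0 * y
BLAS.dspmv(3, 1.0, packedA, x, 0.0, y) // y becomes [1.0, 2.0, 3.0]
```
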
### Level-3 BLAS (Matrix-Matrix Operations)

Operations between matrices with O(n³) complexity.

```scala { .api }
object BLAS {
  /**
   * General matrix-matrix multiplication: C := alpha * A * B + beta * C
   * @param alpha Scalar multiplier for A * B
   * @param A Left matrix operand
   * @param B Right matrix operand (must be DenseMatrix, A.numCols must equal B.numRows)
   * @param beta Scalar multiplier for C
   * @param C Result matrix (modified in-place, must be DenseMatrix)
   */
  def gemm(
      alpha: Double,
      A: Matrix,
      B: DenseMatrix,
      beta: Double,
      C: DenseMatrix
  ): Unit

  /**
   * General matrix-matrix multiplication with array storage:
   * CValues[0 until A.numRows * B.numCols] := alpha * A * B + beta * CValues[0 until A.numRows * B.numCols]
   * @param alpha Scalar multiplier for A * B
   * @param A Left matrix operand
   * @param B Right matrix operand (must be DenseMatrix, A.numCols must equal B.numRows)
   * @param beta Scalar multiplier for CValues
   * @param CValues Result array (modified in-place, must have length >= A.numRows * B.numCols)
   */
  def gemm(
      alpha: Double,
      A: Matrix,
      B: DenseMatrix,
      beta: Double,
      CValues: Array[Double]
  ): Unit
}
```

**Usage Examples:**

```scala
import org.apache.spark.ml.linalg._

// Column-major storage: A = [[1.0, 3.0, 5.0], [2.0, 4.0, 6.0]]
val A = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
val B = DenseMatrix.ones(3, 2)
val C = DenseMatrix.zeros(2, 2)

// Matrix multiplication: C = 1.0 * A * B + 0.0 * C
BLAS.gemm(1.0, A, B, 0.0, C)
// C becomes [[9.0, 9.0], [12.0, 12.0]] (each entry is a row sum of A)
```

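The array-storage overload writes the product into a caller-supplied buffer in column-major order, avoiding an intermediate matrix object. A minimal sketch reusing the matrices above (values are illustrative):

```scala
import org.apache.spark.ml.linalg._

val A = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
val B = DenseMatrix.ones(3, 2)

// Buffer of length A.numRows * B.numCols, filled in column-major order
val cValues = new Array[Double](A.numRows * B.numCols)
BLAS.gemm(1.0, A, B, 0.0, cValues)
// cValues becomes [9.0, 12.0, 9.0, 12.0], i.e. [[9.0, 9.0], [12.0, 12.0]]
```
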
### Specialized Operations

Additional BLAS operations for specific use cases.

```scala { .api }
object BLAS {
  /**
   * Symmetric packed rank-1 update with array storage
   * @param alpha Scalar multiplier
   * @param v Vector operand
   * @param U Upper triangular matrix in packed array format (modified in-place)
   */
  def spr(alpha: Double, v: Vector, U: Array[Double]): Unit
}
```

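A minimal sketch of the packed rank-1 update, assuming the same column-by-column upper-triangular layout used by `dspmv` (the values are illustrative):

```scala
import org.apache.spark.ml.linalg._

val v = Vectors.dense(1.0, 2.0, 3.0)
// Packed upper triangle of a 3x3 zero matrix: n * (n + 1) / 2 = 6 entries
val U = new Array[Double](6)

// U += 1.0 * v * v^T (upper triangle only)
BLAS.spr(1.0, v, U)
// U becomes [1.0, 2.0, 4.0, 3.0, 6.0, 9.0],
// i.e. entries (0,0), (0,1), (1,1), (0,2), (1,2), (2,2) of v * v^T
```
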
## Implementation Details

### Native Acceleration

The BLAS implementation automatically uses the best available backend:

```scala { .api }
object BLAS {
  /**
   * Get Java-based BLAS implementation
   * @return NetlibBLAS instance using pure Java
   */
  private[spark] def javaBLAS: dev.ludovic.netlib.blas.BLAS

  /**
   * Get native BLAS implementation with fallback
   * @return NetlibBLAS instance using native libraries or Java fallback
   */
  private[spark] def nativeBLAS: dev.ludovic.netlib.blas.BLAS

  /**
   * Choose optimal BLAS implementation based on vector size
   * @param vectorSize Size of vectors in operation
   * @return Appropriate BLAS implementation
   */
  private[spark] def getBLAS(vectorSize: Int): dev.ludovic.netlib.blas.BLAS
}
```

### Backend Selection

1. **Native BLAS**: Uses Intel MKL, OpenBLAS, or other optimized libraries when available
2. **Java BLAS**: Pure JVM implementation as fallback
3. **Automatic selection**: Chooses Java BLAS for small operations (fewer than 256 elements), where the overhead of calling into a native library outweighs its gains (see the sketch below)
4. **Level-1 optimization**: Uses Java BLAS for the `dspmv` operation regardless of size

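The size-based dispatch in `getBLAS` can be pictured roughly as follows. This is a sketch assuming the 256-element threshold described above, not the actual private Spark source:

```scala
import dev.ludovic.netlib.blas.{BLAS => NetlibBLAS, JavaBLAS, NativeBLAS}

// Sketch of size-based backend dispatch; threshold value and fallback
// behavior are assumptions based on the list above.
object BlasDispatchSketch {
  private val nativeL1Threshold = 256

  def javaBLAS: NetlibBLAS = JavaBLAS.getInstance

  def nativeBLAS: NetlibBLAS =
    try NativeBLAS.getInstance          // native library if one is installed...
    catch { case _: Throwable => javaBLAS } // ...else fall back to pure Java

  def getBLAS(vectorSize: Int): NetlibBLAS =
    if (vectorSize < nativeL1Threshold) javaBLAS else nativeBLAS
}
```
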
### Performance Characteristics

- **Level-1 operations**: O(n) complexity, optimized for small to medium vectors
- **Level-2 operations**: O(n²) complexity, benefit significantly from native acceleration
- **Level-3 operations**: O(n³) complexity, require native BLAS for good performance
- **Sparse operations**: Custom implementations optimized for sparse data structures
- **Memory access**: Column-major ordering for compatibility with Fortran BLAS (see the example below)

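Because storage is column-major, the values array passed to `Matrices.dense` fills the first column before the second; a quick illustration:

```scala
import org.apache.spark.ml.linalg.Matrices

// Values fill column 1 first, then column 2, then column 3:
// [[1.0, 3.0, 5.0],
//  [2.0, 4.0, 6.0]]
val m = Matrices.dense(2, 3, Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
println(m(0, 1)) // 3.0: row 0, column 1
```
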
### Supported Vector/Matrix Combinations

```scala
// Supported combinations for different operations:

// axpy: x can be Dense/Sparse, y must be Dense
BLAS.axpy(1.0, denseX, denseY)  // ✓
BLAS.axpy(1.0, sparseX, denseY) // ✓
BLAS.axpy(1.0, denseX, sparseY) // ✗ (throws exception)

// dot: both x and y can be Dense/Sparse
BLAS.dot(denseX, denseY)   // ✓
BLAS.dot(sparseX, denseY)  // ✓
BLAS.dot(denseX, sparseY)  // ✓
BLAS.dot(sparseX, sparseY) // ✓

// gemv: A can be Dense/Sparse, x can be Dense/Sparse, y must be Dense
BLAS.gemv(1.0, denseA, denseX, 0.0, denseY)   // ✓
BLAS.gemv(1.0, sparseA, sparseX, 0.0, denseY) // ✓

// gemm: A can be Dense/Sparse, B and C must be Dense
BLAS.gemm(1.0, sparseA, denseB, 0.0, denseC)  // ✓
BLAS.gemm(1.0, denseA, sparseB, 0.0, denseC)  // ✗ (rejected at compile time: B is typed DenseMatrix)
```

### Error Handling

BLAS operations validate inputs and throw exceptions for invalid parameters:

- `IllegalArgumentException`: Dimension mismatches or invalid parameter values
- `UnsupportedOperationException`: Unsupported vector/matrix type combinations

**Example Error Cases:**

```scala
// Dimension mismatch
val x = Vectors.dense(1.0, 2.0)
val y = Vectors.dense(1.0, 2.0, 3.0)
BLAS.dot(x, y) // throws IllegalArgumentException

// Unsupported type combination
val sparse = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))
BLAS.axpy(1.0, sparse, sparse) // throws IllegalArgumentException (y must be dense)
```

## Integration with Other Components

BLAS operations are used internally by:

- Vector dot product: `vector.dot(other)` calls `BLAS.dot(vector, other)`
- Matrix-vector multiplication: `matrix.multiply(vector)` uses `BLAS.gemv`
- Matrix-matrix multiplication: `matrix.multiply(matrix)` uses `BLAS.gemm`
- Statistical computations in `MultivariateGaussian`
- Optimization algorithms throughout Spark MLlib