SVD provides two categories of model interfaces: the RowMatrix SVD API and the IndexedRowMatrix SVD API.

| Model interface category | Function interface |
|---|---|
| MLlib RowMatrix API | `def computeSVD(k: Int, computeU: Boolean = false, rCond: Double = 1e-9): SingularValueDecomposition[RowMatrix, Matrix]` |
| MLlib IndexedRowMatrix API | `def computeSVD(k: Int, computeU: Boolean = false, rCond: Double = 1e-9): SingularValueDecomposition[IndexedRowMatrix, Matrix]` |

| Param name | Type(s) | Description |
|---|---|---|
| rows | `RDD[Vector]` | The matrix, stored row by row |
| nRows | `Long` | Number of rows |
| nCols | `Int` | Number of columns |
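
| Param name | Type(s) | Default | Description |
|---|---|---|---|
| k | `Int` | - | Number of singular values requested; valid range is [1, n], where n is the number of columns |
| computeU | `Boolean` | false | Whether to compute the left singular matrix |
| rCond | `Double` | 1e-9 | Reciprocal of the matrix condition number; singular values smaller than `rCond*s[0]` are treated as 0, where `s[0]` is the largest singular value |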
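
The `rCond` rule in the table can be illustrated without Spark. The sketch below is a plain-Scala illustration of the thresholding rule only; the object and method names (`RCondSketch`, `effectiveSingularValues`) are hypothetical and not part of MLlib, and it assumes the singular values arrive sorted in descending order.

```scala
// Illustrative sketch only: RCondSketch and effectiveSingularValues are
// hypothetical names, not MLlib APIs.
object RCondSketch {
  // Keeps the leading singular values that are at least rCond * sigmas(0).
  // Assumes `sigmas` is sorted in descending order, so the head is the largest.
  def effectiveSingularValues(sigmas: Array[Double], rCond: Double = 1e-9): Array[Double] =
    sigmas.headOption match {
      case Some(sigma0) => sigmas.takeWhile(_ >= rCond * sigma0)
      case None         => Array.empty[Double]
    }
}
```

With this rule, the returned vector `s` may be shorter than the requested `k` when the matrix is numerically rank-deficient.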
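Parameter and RowMatrix interface code example:

    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    // Create the RowMatrix instance
    val matrix = new RowMatrix(trainingData, params.numRows, params.numCols)
    // Call the RowMatrix computeSVD interface
    val svd = matrix.computeSVD(params.k, computeU = true)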

| Param name | Type(s) | Description |
|---|---|---|
| U | `RowMatrix` | Left singular matrix, of size m×k |
| s | `Vector` | Vector of singular values, of length k |
| V | `Matrix` | Right singular matrix, of size n×k |

    import org.apache.spark.mllib.linalg.Matrix
    import org.apache.spark.mllib.linalg.SingularValueDecomposition
    import org.apache.spark.mllib.linalg.Vector
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val data = Array(
      Vectors.sparse(5, Seq((1, 1.0), (3, 7.0))),
      Vectors.dense(2.0, 0.0, 3.0, 4.0, 5.0),
      Vectors.dense(4.0, 0.0, 0.0, 6.0, 7.0))

    val rows = sc.parallelize(data)

    val mat: RowMatrix = new RowMatrix(rows)

    // Compute the top 5 singular values and corresponding singular vectors.
    val svd: SingularValueDecomposition[RowMatrix, Matrix] = mat.computeSVD(5, computeU = true)
    val U: RowMatrix = svd.U  // The U factor is a RowMatrix.
    val s: Vector = svd.s     // The singular values are stored in a local dense vector.
    val V: Matrix = svd.V     // The V factor is a local dense matrix.
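
To make the shapes of the three factors concrete, the following plain-Scala sketch rebuilds an m×n matrix from U (m×k), s (length k), and V (n×k) as U · diag(s) · Vᵀ. It uses nested arrays instead of Spark types, and `SvdShapes` and `reconstruct` are hypothetical names introduced only for this illustration.

```scala
// Illustrative sketch only: SvdShapes and reconstruct are hypothetical names,
// and plain arrays stand in for RowMatrix / Matrix.
object SvdShapes {
  type Mat = Array[Array[Double]]

  // Rebuilds A (m x n) from U (m x k), s (length k) and V (n x k):
  // A(i)(j) = sum over l of U(i)(l) * s(l) * V(j)(l)
  def reconstruct(u: Mat, s: Array[Double], v: Mat): Mat = {
    val k = s.length
    require(u.forall(_.length == k), "U must have k columns")
    require(v.forall(_.length == k), "V must have k columns")
    u.map { uRow =>
      v.map { vRow =>
        (0 until k).map(l => uRow(l) * s(l) * vRow(l)).sum
      }
    }
  }
}
```

For example, with a 3×2 `u`, a length-2 `s`, and a 2×2 `v`, `SvdShapes.reconstruct(u, s, v)` yields a 3×2 matrix, matching the m×k, k, and n×k sizes listed in the table.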

| Param name | Type(s) | Description |
|---|---|---|
| rows | `RDD[IndexedRow]` | The matrix, stored row by row; each row is an `IndexedRow(index: Long, vector: Vector)` |
| nRows | `Long` | Number of rows |
| nCols | `Int` | Number of columns |
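
| Param name | Type(s) | Default | Description |
|---|---|---|---|
| k | `Int` | - | Number of singular values requested; valid range is [1, n], where n is the number of columns |
| computeU | `Boolean` | false | Whether to compute the left singular matrix |
| rCond | `Double` | 1e-9 | Reciprocal of the matrix condition number; singular values smaller than `rCond*s[0]` are treated as 0, where `s[0]` is the largest singular value |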
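Parameter and IndexedRowMatrix interface code example:

    import org.apache.spark.mllib.linalg.distributed.IndexedRowMatrix

    // Create the IndexedRowMatrix instance
    val indexedMatrix = new IndexedRowMatrix(trainingData, params.numRows, params.numCols)
    // Call the IndexedRowMatrix computeSVD interface
    val svd = indexedMatrix.computeSVD(params.k, computeU = true)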

| Param name | Type(s) | Description |
|---|---|---|
| U | `IndexedRowMatrix` | Left singular matrix, of size m×k |
| s | `Vector` | Vector of singular values, of length k |
| V | `Matrix` | Right singular matrix, of size n×k |

    val svdRes = distMatrix.computeSVD(k)
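
For comparison with the RowMatrix example, here is a minimal end-to-end sketch of the IndexedRowMatrix interface. It assumes Spark MLlib is on the classpath. The object name `IndexedRowMatrixSvdSketch`, the helper `decompose`, the sample data, and the local master setting are illustrative assumptions rather than part of the documented API.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{Matrix, SingularValueDecomposition, Vectors}
import org.apache.spark.mllib.linalg.distributed.{IndexedRow, IndexedRowMatrix}

// Illustrative sketch only: the object, helper, and data are assumptions.
object IndexedRowMatrixSvdSketch {

  // Builds a small 4 x 3 IndexedRowMatrix and computes its top-k SVD.
  def decompose(sc: SparkContext, k: Int): SingularValueDecomposition[IndexedRowMatrix, Matrix] = {
    val indexedRows = sc.parallelize(Seq(
      IndexedRow(0L, Vectors.dense(3.0, 0.0, 0.0)),
      IndexedRow(1L, Vectors.dense(0.0, 2.0, 0.0)),
      IndexedRow(2L, Vectors.dense(0.0, 0.0, 1.0)),
      IndexedRow(3L, Vectors.dense(0.0, 0.0, 0.0))))
    val mat = new IndexedRowMatrix(indexedRows)
    // computeU = true so that the left singular matrix U is returned as well.
    mat.computeSVD(k, computeU = true)
  }

  def main(args: Array[String]): Unit = {
    // A local master is assumed here purely for a quick trial run.
    val conf = new SparkConf().setAppName("IndexedRowMatrixSvdSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val svd = decompose(sc, k = 2)
      println(s"U: ${svd.U.numRows()} x ${svd.U.numCols()}") // U keeps the row indices
      println(s"s: ${svd.s}")                                // local vector of singular values
      println(s"V: ${svd.V.numRows} x ${svd.V.numCols}")     // local matrix
    } finally {
      sc.stop()
    }
  }
}
```

Unlike the RowMatrix variant, the returned `U` is an `IndexedRowMatrix`, so each row of `U` can be matched back to its input row through the row index.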