SVD

The SVD algorithm is available through two MLlib APIs: RowMatrix.computeSVD and IndexedRowMatrix.computeSVD.

| Model API Type | Function API |
|---|---|
| MLlib RowMatrix API | def computeSVD(k: Int, computeU: Boolean = false, rCond: Double = 1e-9): SingularValueDecomposition[RowMatrix, Matrix] |
| MLlib IndexedRowMatrix API | def computeSVD(k: Int, computeU: Boolean = false, rCond: Double = 1e-9): SingularValueDecomposition[IndexedRowMatrix, Matrix] |

MLlib RowMatrix API

  • Function

    Takes a matrix stored as an RDD[Vector] and returns its singular value decomposition.

  • Input and output
    1. Package name: org.apache.spark.mllib.linalg.distributed
    2. Class name: RowMatrix
    3. Method name: computeSVD
    4. Input: matrix (RowMatrix)

      | Parameter | Type | Description |
      |---|---|---|
      | rows | RDD[Vector] | Matrix, stored by row |
      | nRows | Long | Number of rows |
      | nCols | Int | Number of columns |

    5. Algorithm parameters

      | Parameter | Type | Default Value | Description |
      |---|---|---|---|
      | k | Int | - | Number of leading singular values to keep. The value ranges from 1 to n, where n is the number of columns. |
      | computeU | Boolean | false | Whether to compute the left singular matrix U |
      | rCond | Double | 1e-9 | Reciprocal condition number. Singular values smaller than rCond*s[0] are treated as 0, where s[0] is the largest singular value. |

      An example is provided as follows:

      val matrix = new RowMatrix(trainingData, params.numRows, params.numCols) // Row matrix instance
      
      // Call the computeSVD API of the row matrix.
      val svd = matrix.computeSVD(params.k, computeU = true)
      
    6. Output: SVD result of type SingularValueDecomposition[RowMatrix, Matrix]. SingularValueDecomposition is a case class with three fields: U, s, and V. U is null when computeU is false.

      | Parameter | Type | Description |
      |---|---|---|
      | U | RowMatrix | Left singular matrix. The matrix size is m x k. |
      | s | Vector | Singular value vector. The vector length is k. |
      | V | Matrix | Right singular matrix. The matrix size is n x k. |

  • Sample usage
    import org.apache.spark.mllib.linalg.Matrix
    import org.apache.spark.mllib.linalg.SingularValueDecomposition
    import org.apache.spark.mllib.linalg.Vector
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.linalg.distributed.RowMatrix
    
    val data = Array(
      Vectors.sparse(5, Seq((1, 1.0), (3, 7.0))),
      Vectors.dense(2.0, 0.0, 3.0, 4.0, 5.0),
      Vectors.dense(4.0, 0.0, 0.0, 6.0, 7.0))
    
    val rows = sc.parallelize(data)
    
    val mat: RowMatrix = new RowMatrix(rows)
    
    // Compute the top 5 singular values and corresponding singular vectors.
    val svd: SingularValueDecomposition[RowMatrix, Matrix] = mat.computeSVD(5, computeU = true)
    val U: RowMatrix = svd.U  // The U factor is a RowMatrix.
    val s: Vector = svd.s     // The singular values are stored in a local dense vector.
    val V: Matrix = svd.V     // The V factor is a local dense matrix.
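
    The rCond rule from the algorithm parameters can be illustrated without Spark. The following is a minimal sketch, not MLlib's implementation: the object RCondTruncation and its keep method are hypothetical names chosen for illustration, and the input is assumed to be sorted in descending order. It also suggests why computeSVD can return fewer than k singular values when the matrix is rank deficient.

```scala
object RCondTruncation {
  /**
   * Keeps at most k leading singular values, dropping every value that is
   * smaller than rCond * (largest singular value).
   * Assumption: sigmas is sorted in descending order.
   */
  def keep(sigmas: Seq[Double], k: Int, rCond: Double = 1e-9): Seq[Double] = {
    require(k > 0, "k must be positive")
    if (sigmas.isEmpty) {
      Seq.empty
    } else {
      val threshold = rCond * sigmas.head
      // Stop at the first value below the threshold; later values are smaller still.
      sigmas.take(k).takeWhile(_ >= threshold)
    }
  }

  def main(args: Array[String]): Unit = {
    // The threshold is 1e-9 * 10.0 = 1e-8, so 1e-12 is treated as 0 and dropped.
    println(keep(Seq(10.0, 2.0, 1e-12), k = 3))
  }
}
```

    With the default rCond, only the first two values in this made-up input survive even though k = 3 was requested.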
    

MLlib IndexedRowMatrix API

  • Function

    Takes a matrix stored as an RDD[IndexedRow] and returns its singular value decomposition.

  • Input and output
    1. Package name: org.apache.spark.mllib.linalg.distributed
    2. Class name: IndexedRowMatrix
    3. Method name: computeSVD
    4. Input: matrix (IndexedRowMatrix)

      | Parameter | Type | Description |
      |---|---|---|
      | rows | RDD[IndexedRow] | Matrix, stored by row. Each row is an IndexedRow(index: Long, vector: Vector). |
      | nRows | Long | Number of rows |
      | nCols | Int | Number of columns |

    5. Algorithm parameters

      | Parameter | Type | Default Value | Description |
      |---|---|---|---|
      | k | Int | - | Number of leading singular values to keep. The value ranges from 1 to n, where n is the number of columns. |
      | computeU | Boolean | false | Whether to compute the left singular matrix U |
      | rCond | Double | 1e-9 | Reciprocal condition number. Singular values smaller than rCond*s[0] are treated as 0, where s[0] is the largest singular value. |

      An example is provided as follows:

      val indexedMatrix = new IndexedRowMatrix(trainingData, params.numRows, params.numCols) // Indexed row matrix instance
      
      // Call the computeSVD API of the indexed row matrix.
      val svd = indexedMatrix.computeSVD(params.k, computeU = true)
      
    6. Output: SVD result of type SingularValueDecomposition[IndexedRowMatrix, Matrix], a case class with three fields: U, s, and V.

      | Parameter | Type | Description |
      |---|---|---|
      | U | IndexedRowMatrix | Left singular matrix. The matrix size is m x k. |
      | s | Vector | Singular value vector. The vector length is k. |
      | V | Matrix | Right singular matrix. The matrix size is n x k. |

  • Sample usage
    // distMatrix is an IndexedRowMatrix; k is the number of leading singular values to keep.
    val svdRes = distMatrix.computeSVD(k)
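
    The one-line usage above leaves out how distMatrix is built. The following is a minimal end-to-end sketch under stated assumptions: the object name IndexedRowMatrixSvdExample, the run helper, the local[2] master, and the 4 x 3 input matrix are all made up for illustration and do not come from the original sample. The matrix is chosen so that its singular values are 5, 2, and 1, which makes the result easy to check by hand.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.{Matrix, SingularValueDecomposition, Vectors}
import org.apache.spark.mllib.linalg.distributed.{IndexedRow, IndexedRowMatrix}

object IndexedRowMatrixSvdExample {
  /** Builds a small IndexedRowMatrix (illustrative data) and computes its top-k SVD. */
  def run(sc: SparkContext, k: Int): SingularValueDecomposition[IndexedRowMatrix, Matrix] = {
    // Each row carries an explicit Long index next to its vector.
    val rows = sc.parallelize(Seq(
      IndexedRow(0L, Vectors.dense(1.0, 0.0, 0.0)),
      IndexedRow(1L, Vectors.dense(0.0, 2.0, 0.0)),
      IndexedRow(2L, Vectors.dense(0.0, 0.0, 3.0)),
      IndexedRow(3L, Vectors.dense(0.0, 0.0, 4.0))))
    val distMatrix = new IndexedRowMatrix(rows)
    // computeU = true so that the left singular matrix is returned as well.
    distMatrix.computeSVD(k, computeU = true)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master for experimentation; a real job would get it from spark-submit.
    val conf = new SparkConf().setAppName("IndexedRowMatrixSvdExample").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val svd = run(sc, k = 2)
      // The top two singular values of this matrix are approximately 5.0 and 2.0.
      println(s"s = ${svd.s}")
      println(s"V (${svd.V.numRows} x ${svd.V.numCols}):\n${svd.V}")
      // U is distributed; collect it only because this matrix is tiny.
      svd.U.rows.collect().sortBy(_.index).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

    Because U is an IndexedRowMatrix, each row of U keeps the index of the input row it was derived from, so results can be matched back to the original rows after the decomposition.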