
SPCA

The SPCA algorithm provides ML SPCA APIs and MLlib SPCA APIs.

Model API Type    Function API
--------------    ------------
ML SPCA API       def fit(dataset: Dataset[_]): PCAModel
                  def fit(dataset: Dataset[_], paramMaps: Array[ParamMap]): Seq[PCAModel]
                  def fit(dataset: Dataset[_], paramMap: ParamMap): PCAModel
                  def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): PCAModel
MLlib SPCA API    def fit(sources: RDD[Vector]): PCAModel
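
The four ML `fit` overloads listed above differ only in how parameters are supplied. The sketch below shows one way they might be called. It assumes that the library providing `org.apache.spark.ml.feature.SPCA` is on the classpath and that `SPCA` exposes `k` as a `Param` member in the same way as Spark's own `PCA` estimator; that member, the object name `SpcaFitOverloads`, and the column name `"matrix"` are assumptions for illustration only.

```scala
import org.apache.spark.ml.feature.{PCAModel, SPCA}
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.sql.Dataset

// Hypothetical helper that exercises each ML fit overload once.
object SpcaFitOverloads {
  def fitAll(data: Dataset[_]): Seq[PCAModel] = {
    // Assumption: the input vectors live in a column named "matrix".
    val spca = new SPCA().setInputCol("matrix")

    // Assumption: spca.k is a Param, as it is on Spark's PCA estimator.
    val withK1 = ParamMap(spca.k -> 1)
    val withK2 = ParamMap(spca.k -> 2)

    // fit(dataset): uses whatever parameters are set on the estimator.
    val m0 = spca.setK(2).fit(data)
    // fit(dataset, paramMap): overrides parameters for this call only.
    val m1 = spca.fit(data, withK1)
    // fit(dataset, firstParamPair, otherParamPairs*): inline overrides.
    val m2 = spca.fit(data, spca.k -> 2)
    // fit(dataset, paramMaps): one model per ParamMap, in order.
    val ms = spca.fit(data, Array(withK1, withK2))

    Seq(m0, m1, m2) ++ ms
  }
}
```

Passing an array of `ParamMap` values is convenient when comparing several values of `k` on the same dataset, since each map yields its own model.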

ML SPCA API

  • Function description

    Takes a matrix in Dataset form and returns its principal components and their corresponding weights.

  • Input and output
    1. Package name: package org.apache.spark.ml.feature
    2. Class name: SPCA
    3. Method name: fit
    4. Input: matrix (Dataset[_]) and the number of principal components

      Parameter   Value Type        Description
      ---------   ----------        -----------
      dataset     Dataset[Vector]   Matrix, stored by row
      k           Int               Number of principal components

    5. Algorithm parameters

      Setter             Parameter   Default Value   Description
      ------             ---------   -------------   -----------
      setK(value: Int)   k           -               Number of required principal components. The value range is [1, n], where n is the number of columns of the input matrix.

      An example is provided as follows:

      val pcaModel = new SPCA().setK(k).setInputCol("matrix").fit(data)
      
    6. Output: PCAModel, including the principal components and the corresponding weights

      Parameter           Value Type    Description
      ---------           ----------    -----------
      pc                  DenseMatrix   Principal component matrix. Each column is a principal component vector.
      explainedVariance   DenseVector   Weights of the principal components. Each entry corresponds to one principal component.

  • Example
    val pcaModel = new SPCA().setK(k).setInputCol("matrix").fit(data)
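
The one-line call above leaves `data` and `k` undefined. A fuller sketch follows, assuming that the library providing `org.apache.spark.ml.feature.SPCA` is on the classpath and that a local Spark session is acceptable for a quick check. The 4 x 3 input matrix, the object name `MlSpcaExample`, and the helper `fitExample` are hypothetical.

```scala
import org.apache.spark.ml.feature.{PCAModel, SPCA}
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

object MlSpcaExample {
  // Builds a small input matrix and fits SPCA with k principal components.
  def fitExample(spark: SparkSession, k: Int): PCAModel = {
    // Hypothetical data: each Vector is one row of a 4 x 3 matrix.
    val rows = Seq(
      Vectors.dense(2.0, 0.0, 1.0),
      Vectors.dense(1.0, 3.0, 0.0),
      Vectors.dense(0.0, 1.0, 4.0),
      Vectors.dense(5.0, 2.0, 2.0)
    )
    // Wrap each row in a Tuple1 so it becomes a single-column DataFrame.
    val data = spark.createDataFrame(rows.map(Tuple1.apply)).toDF("matrix")

    new SPCA().setK(k).setInputCol("matrix").fit(data)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, for illustration only.
    val spark = SparkSession.builder()
      .appName("MlSpcaExample")
      .master("local[*]")
      .getOrCreate()

    val pcaModel = fitExample(spark, k = 2)

    // pc holds one principal component per column.
    println(pcaModel.pc)
    // explainedVariance holds one weight per principal component.
    println(pcaModel.explainedVariance)

    spark.stop()
  }
}
```

With three columns in the input, `k` may be 1, 2, or 3; the resulting `pc` matrix has `k` columns.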
    

MLlib SPCA API

  • Function description

    Takes a matrix in RDD[Vector] form and returns its principal components and their corresponding weights.

  • Input and output
    1. Package name: package org.apache.spark.mllib.feature
    2. Class name: SPCA
    3. Method name: fit
    4. Input: matrix (RDD[Vector]) and the number of principal components

      Parameter   Value Type    Description
      ---------   ----------    -----------
      sources     RDD[Vector]   Matrix, stored by row
      k           Int           Number of principal components

    5. Algorithm parameters

      Setter             Parameter   Default Value   Description
      ------             ---------   -------------   -----------
      setK(value: Int)   k           -               Number of required principal components. The value range is [1, n], where n is the number of columns of the input matrix.

      An example is provided as follows:

      val pcaModel = new SPCA(k).fit(data)
      
    6. Output: PCAModel, including the principal components and the corresponding weights

      Parameter           Value Type    Description
      ---------           ----------    -----------
      pc                  DenseMatrix   Principal component matrix. Each column is a principal component vector.
      explainedVariance   DenseVector   Weights of the principal components. Each entry corresponds to one principal component.

  • Example
    val pcaModel = new SPCA(k).fit(data)
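
As with the ML API, the call above assumes `data` and `k` already exist. A fuller sketch for the RDD-based API follows, assuming that the library providing `org.apache.spark.mllib.feature.SPCA` is on the classpath. The input rows, the local master setting, the object name `MllibSpcaExample`, and the helper `fitExample` are hypothetical.

```scala
import org.apache.spark.mllib.feature.{PCAModel, SPCA}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object MllibSpcaExample {
  // Builds a small row-wise matrix as an RDD[Vector] and fits SPCA.
  def fitExample(sc: SparkContext, k: Int): PCAModel = {
    // Hypothetical data: each Vector is one row of a 4 x 3 matrix.
    val data: RDD[Vector] = sc.parallelize(Seq(
      Vectors.dense(2.0, 0.0, 1.0),
      Vectors.dense(1.0, 3.0, 0.0),
      Vectors.dense(0.0, 1.0, 4.0),
      Vectors.dense(5.0, 2.0, 2.0)
    ))
    // k is passed to the constructor, as in the example above.
    new SPCA(k).fit(data)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, for illustration only.
    val conf = new SparkConf().setAppName("MllibSpcaExample").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val pcaModel = fitExample(sc, k = 2)

    // One principal component per column of pc.
    println(pcaModel.pc)
    // One weight per principal component.
    println(pcaModel.explainedVariance)

    sc.stop()
  }
}
```

Note that this API uses the `org.apache.spark.mllib.linalg` vector types, not the `org.apache.spark.ml.linalg` types used by the ML API.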