Pearson

Model API Type    Function API
ML API            def corr(dataset: Dataset[_], column: String): DataFrame
                  def corr(dataset: Dataset[_], column: String, method: String): DataFrame

ML API

  • Function

    Takes sample data in Dataset format, calls the corr API, and returns the correlation matrix.

  • Input/Output
    1. Package name: org.apache.spark.ml.stat
    2. Class name: Correlation
    3. Method name: corr
    4. Input: sample data (Dataset[_]). The following fields are mandatory.

      Parameter   Type              Description
      data        Dataset[Vector]   Input matrix, stored row by row (one Vector per row).
      column      String            Name of the vector column on which the correlation matrix is computed.
      method      String            Correlation method. The value can be spearman or pearson (default).

    5. Algorithm parameters

      Parameter   Type     Default Value   Description
      method      String   pearson         Method for computing the correlation matrix. The default value is pearson.

      Code interface example:

      val mat = stat.Correlation.corr(data, "matrix")
      
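    6. Output: Pearson correlation matrix

      Parameter   Type        Description
      df          DataFrame   Correlation matrix (Pearson by default, Spearman if method is spearman). The column name is formed from the method and the input column name (in Apache Spark: method(column), for example pearson(matrix)).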
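
    To make the output concrete: entry (i, j) of a Pearson correlation matrix is the Pearson coefficient between columns i and j of the input. The following is a minimal plain-Scala sketch of that definition, independent of Spark and of this library. The names PearsonSketch, pearson, and corrMatrix are illustrative only and are not part of the API described here.

```scala
object PearsonSketch {
  // Pearson coefficient of two equally long samples:
  // covariance divided by the product of the standard deviations.
  // A constant sample gives a zero denominator, so the result is NaN.
  def pearson(x: Seq[Double], y: Seq[Double]): Double = {
    require(x.nonEmpty && x.length == y.length,
      "samples must be non-empty and equally long")
    val meanX = x.sum / x.length
    val meanY = y.sum / y.length
    val cov  = x.zip(y).map { case (a, b) => (a - meanX) * (b - meanY) }.sum
    val varX = x.map(a => (a - meanX) * (a - meanX)).sum
    val varY = y.map(b => (b - meanY) * (b - meanY)).sum
    cov / math.sqrt(varX * varY)
  }

  // Correlation matrix of row-major data:
  // entry (i, j) compares column i with column j.
  def corrMatrix(rows: Seq[Seq[Double]]): Seq[Seq[Double]] = {
    val cols = rows.transpose
    cols.map(ci => cols.map(cj => pearson(ci, cj)))
  }
}
```

    The matrix is symmetric, and its diagonal is 1 for every non-constant column.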

  • Sample usage
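    import org.apache.spark.ml.stat
    // data: Dataset[_] whose "matrix" column holds Vectors
    val pearsonDefault = stat.Correlation.corr(data, "matrix")
    val pearsonExplicit = stat.Correlation.corr(data, "matrix", "pearson")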
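
    For context, a fuller end-to-end sketch is shown below. It assumes that the API behaves like Apache Spark's org.apache.spark.ml.stat.Correlation, as the package and class names above suggest. The object name CorrelationExample, the sample values, the local[*] master, and the column name "matrix" are illustrative assumptions, not requirements of the library.

```scala
import org.apache.spark.ml.linalg.{Matrix, Vectors}
import org.apache.spark.ml.stat.Correlation
import org.apache.spark.sql.{Row, SparkSession}

object CorrelationExample {
  // Builds a small sample dataset and returns its Pearson correlation matrix.
  def run(spark: SparkSession): Matrix = {
    import spark.implicits._

    // One Vector per row; the column name "matrix" is an arbitrary choice.
    val data = Seq(
      Vectors.dense(1.0, 2.0, 10.0),
      Vectors.dense(2.0, 4.0, 7.0),
      Vectors.dense(3.0, 6.0, 5.0),
      Vectors.dense(4.0, 8.0, 1.0)
    ).map(Tuple1.apply).toDF("matrix")

    // The result DataFrame has a single row and a single column holding the matrix.
    val Row(pearson: Matrix) = Correlation.corr(data, "matrix").head
    pearson
  }

  def main(args: Array[String]): Unit = {
    // Local master for experimentation only; a cluster job would configure this elsewhere.
    val spark = SparkSession.builder()
      .appName("CorrelationExample")
      .master("local[*]")
      .getOrCreate()
    try {
      println(s"Pearson correlation matrix:\n${run(spark)}")
    } finally {
      spark.stop()
    }
  }
}
```

    Passing "spearman" as a third argument to corr requests the Spearman matrix instead. The method name is written in lowercase, matching the values listed in the parameter table.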