Pearson

Model API Type    Function API
ML API            def corr(dataset: Dataset[_], column: String): DataFrame
                  def corr(dataset: Dataset[_], column: String, method: String): DataFrame

ML API

  • Function description

    Output the correlation matrix after you input sample data in the Dataset format and call the corr API.

  • Input and output
    1. Package name: org.apache.spark.ml.stat
    2. Class name: Correlation
    3. Method name: corr
    4. Input: sample data (Dataset[_]). The dataset and column parameters are mandatory; method is optional.

      Parameter   Value Type   Description
      dataset     Dataset[_]   Sample data. The column named by column holds one Vector per row, so the rows together form the matrix (stored by row).
      column      String       Name of the Vector column used to calculate the correlation matrix.
      method      String       Correlation method. The value can be spearman or pearson (default).

    5. Algorithm parameters

      Parameter   Value Type   Default Value   Description
      method      String       pearson         Method for computing the correlation matrix. The default value is pearson.

      Code API example:

      import org.apache.spark.ml.stat
      val mat = stat.Correlation.corr(data, "matrix")
      
    6. Output: Pearson correlation matrix

      Parameter   Value Type   Description
      df          DataFrame    Correlation matrix, returned as a single row with one Matrix column. The column name is method(column), for example pearson(matrix).
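
To make the entries of the output matrix concrete, the following is a minimal plain-Scala sketch of the Pearson coefficient, with no Spark dependency. It is not the library's implementation. The names PearsonSketch, pearson, and corrMatrix are hypothetical and used only for illustration. Entry (i, j) of the matrix is the coefficient between columns i and j of the row-stored input.

```scala
object PearsonSketch {
  // Pearson coefficient of two equally long samples:
  // r = sum((x - meanX) * (y - meanY)) / sqrt(sum((x - meanX)^2) * sum((y - meanY)^2))
  // A constant sample has zero variance, so the result is NaN in that case.
  def pearson(x: Seq[Double], y: Seq[Double]): Double = {
    require(x.nonEmpty && x.length == y.length, "samples must be non-empty and equally long")
    val n = x.length
    val meanX = x.sum / n
    val meanY = y.sum / n
    val cov = x.zip(y).map { case (a, b) => (a - meanX) * (b - meanY) }.sum
    val varX = x.map(a => (a - meanX) * (a - meanX)).sum
    val varY = y.map(b => (b - meanY) * (b - meanY)).sum
    cov / math.sqrt(varX * varY)
  }

  // Correlation matrix of a matrix stored by row:
  // entry (i, j) correlates column i with column j.
  def corrMatrix(rows: Seq[Seq[Double]]): Seq[Seq[Double]] = {
    val cols = rows.transpose
    cols.map(ci => cols.map(cj => pearson(ci, cj)))
  }
}
```

The diagonal is always 1 for non-constant columns, and the matrix is symmetric because pearson(x, y) equals pearson(y, x).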

  • Example
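    import org.apache.spark.ml.stat
    val mat = stat.Correlation.corr(data, "matrix")                    // default method: pearson
    val matPearson = stat.Correlation.corr(data, "matrix", "pearson")  // method name in lowercase, as listed above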
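
The calls above assume that data already exists. As a fuller sketch, the program below builds a small Dataset with a Vector column named matrix, calls corr, and reads the resulting Matrix out of the single-row DataFrame. It assumes the Spark ML library is on the classpath and runs in local mode. The object name CorrExample, the helper correlate, and the sample values are made up for illustration.

```scala
import org.apache.spark.ml.linalg.{Matrix, Vectors}
import org.apache.spark.ml.stat.Correlation
import org.apache.spark.sql.SparkSession

object CorrExample {
  // Hypothetical helper: builds sample data and returns its correlation matrix.
  def correlate(spark: SparkSession, method: String): Matrix = {
    import spark.implicits._

    // Each Vector is one row of the input matrix (illustrative values).
    val data = Seq(
      Vectors.dense(1.0, 2.0, 9.0),
      Vectors.dense(2.0, 4.0, 7.0),
      Vectors.dense(3.0, 6.0, 8.0),
      Vectors.dense(4.0, 8.0, 1.0)
    ).map(Tuple1.apply).toDF("matrix")

    // corr returns a DataFrame with one row and one Matrix column.
    Correlation.corr(data, "matrix", method).head.getAs[Matrix](0)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("CorrExample")
      .master("local[*]") // assumption: local run for demonstration
      .getOrCreate()

    println(s"Pearson correlation matrix:\n${correlate(spark, "pearson")}")
    println(s"Spearman correlation matrix:\n${correlate(spark, "spearman")}")

    spark.stop()
  }
}
```

Reading the matrix by position (index 0) avoids depending on the name of the output column.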