Pearson
| Model API Type | Function API |
|---|---|
| ML API | def corr(dataset: Dataset[_], column: String): DataFrame |
| ML API | def corr(dataset: Dataset[_], column: String, method: String): DataFrame |
ML API
- Input/Output
- Package name: org.apache.spark.ml.stat
- Class name: Correlation
- Method name: corr
- Input: training sample data (Dataset[_]). The following are mandatory fields.
| Parameter | Type | Description |
|---|---|---|
| data | Dataset[Vector] | Input matrix, stored by row (one Vector per row). |
| column | String | Name of the Vector column used to compute the correlation matrix. |
| method | String | Correlation method. The value can be spearman or pearson (default). |
- Algorithm parameters
| Parameter | Type | Default Value | Description |
|---|---|---|---|
| method | String | pearson | Method used to compute the correlation matrix. The default value is pearson. |
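For reference, the standard definition of the Pearson coefficient that each entry of the correlation matrix represents can be written as follows. The symbols are notation introduced here for illustration, not names from the API: x and y are two columns of the input matrix, n is the number of rows, and the barred symbols are the column means.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^{2}} \, \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^{2}}}
```

The coefficient lies in the range [-1, 1], and the diagonal of the matrix is 1 for every non-constant column.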
Code interface example:
val mat = stat.Correlation.corr(data, "matrix")
- Output: Pearson correlation matrix
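| Parameter | Type | Description |
|---|---|---|
| df | DataFrame | Correlation matrix (Pearson by default). The DataFrame has a single row and a single column named method(column), for example pearson(matrix). |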
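To sanity-check values in the returned matrix on a small sample, the Pearson coefficient can be computed directly in plain Scala. The following is a minimal sketch that does not use Spark; the object name PearsonCheck and its methods are hypothetical helpers introduced here for illustration and are not part of the API.

```scala
// Hypothetical helper for illustration only; not part of org.apache.spark.ml.stat.
object PearsonCheck {
  /** Pearson coefficient of two equally sized sequences.
    * Returns NaN if either sequence is constant (zero variance). */
  def pearson(xs: Seq[Double], ys: Seq[Double]): Double = {
    require(xs.nonEmpty && xs.size == ys.size, "inputs must be non-empty and of equal length")
    val n = xs.size
    val meanX = xs.sum / n
    val meanY = ys.sum / n
    // Numerator: sum of products of deviations from the means.
    val cov = xs.zip(ys).map { case (x, y) => (x - meanX) * (y - meanY) }.sum
    // Denominator terms: sums of squared deviations.
    val ssX = xs.map(x => (x - meanX) * (x - meanX)).sum
    val ssY = ys.map(y => (y - meanY) * (y - meanY)).sum
    cov / math.sqrt(ssX * ssY)
  }

  /** Correlation matrix for data stored by row (one inner Seq per sample). */
  def matrix(rows: Seq[Seq[Double]]): Seq[Seq[Double]] = {
    val cols = rows.transpose
    cols.map(a => cols.map(b => pearson(a, b)))
  }
}
```

For example, PearsonCheck.matrix(Seq(Seq(1.0, 2.0), Seq(2.0, 4.0), Seq(3.0, 6.0))) describes two perfectly correlated columns, so every entry should be 1.0 up to floating-point rounding.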
- Sample usage
// Default method (pearson)
val mat = stat.Correlation.corr(data, "matrix")
// Explicit method; the method name is lowercase
val matPearson = stat.Correlation.corr(data, "matrix", "pearson")
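The calls above assume that data already exists as a Dataset with a Vector column named matrix. A minimal end-to-end sketch, assuming a local SparkSession and a small made-up sample, might look like the following. The object name PearsonCorrExample, the run helper, and the sample values are assumptions for illustration.

```scala
import org.apache.spark.ml.linalg.{Matrix, Vectors}
import org.apache.spark.ml.stat.Correlation
import org.apache.spark.sql.{Row, SparkSession}

// Hypothetical example object; names and sample values are illustrative.
object PearsonCorrExample {
  /** Builds a small sample, computes the Pearson matrix, and returns
    * the output column name together with the correlation matrix. */
  def run(spark: SparkSession): (String, Matrix) = {
    import spark.implicits._

    // One Vector per row; the second column is twice the first.
    val data = Seq(
      Vectors.dense(1.0, 2.0, 10.0),
      Vectors.dense(2.0, 4.0, 8.0),
      Vectors.dense(3.0, 6.0, 7.0),
      Vectors.dense(4.0, 8.0, 3.0)
    ).map(Tuple1.apply).toDF("matrix")

    val df = Correlation.corr(data, "matrix", "pearson")
    // The result has a single row holding the matrix in a single column.
    val Row(m: Matrix) = df.head()
    (df.columns.head, m)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("PearsonCorrExample")
      .master("local[*]") // assumption: local run for demonstration
      .getOrCreate()
    try {
      val (columnName, m) = run(spark)
      println(columnName) // name of the single output column
      println(m)          // 3 x 3 correlation matrix
    } finally {
      spark.stop()
    }
  }
}
```

Because the second sample column is an exact multiple of the first, the entry at row 0, column 1 of the matrix should be very close to 1.0.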