Logistic Regression

The Logistic Regression algorithm provides ML classification APIs.

Model API Type	Function API
ML classification API	def fit(dataset: Dataset[_]): LogisticRegressionModel
	def fit(dataset: Dataset[_], paramMap: ParamMap): LogisticRegressionModel
	def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): LogisticRegressionModel
	def fit(dataset: Dataset[_], paramMaps: Array[ParamMap]): Seq[LogisticRegressionModel]

ML Classification API

Function description
Pass the training samples as a Dataset to the fit API. The API returns a trained Logistic Regression model.

Input and output

Package name: package org.apache.spark.ml.classification
Class name: LogisticRegression
Method name: fit
Input: training sample data (Dataset[_]). The following are mandatory fields.
Parameter	Value Type	Default Value	Description
labelCol	Double	label	Label. Each value must be a non-negative integer stored as a Double, that is, label == label.toInt and label >= 0.
featuresCol	Vector	features	Feature vector
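
The labelCol requirements above can be checked before training. The following is a minimal sketch in plain Scala; isValidLabel is a hypothetical helper introduced for illustration and is not part of the Spark API.

```scala
// Hypothetical helper, not part of the Spark API: checks the documented
// labelCol requirements (label == label.toInt and label >= 0).
def isValidLabel(label: Double): Boolean =
  label == label.toInt && label >= 0

// Made-up sample labels: 1.5 is not an integer value and -1.0 is negative.
val labels = Seq(0.0, 1.0, 2.0, 1.5, -1.0)

// Split the labels into those that satisfy the requirements and those that do not.
val (valid, invalid) = labels.partition(isValidLabel)
println(s"valid: $valid, invalid: $invalid")
```

In a real pipeline the same predicate could be applied to the label column of the Dataset before calling fit, so that invalid rows are reported early.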

Parameters optimized based on native algorithms

def setRegParam(value: Double): LogisticRegression.this.type
def setElasticNetParam(value: Double): LogisticRegression.this.type
def setMaxIter(value: Int): LogisticRegression.this.type
def setTol(value: Double): LogisticRegression.this.type
def setFitIntercept(value: Boolean): LogisticRegression.this.type
def setFamily(value: String): LogisticRegression.this.type
def setStandardization(value: Boolean): LogisticRegression.this.type
override def setThreshold(value: Double): LogisticRegression.this.type
def setWeightCol(value: String): LogisticRegression.this.type
override def setThresholds(value: Array[Double]): LogisticRegression.this.type
def setAggregationDepth(value: Int): LogisticRegression.this.type
def setLowerBoundsOnCoefficients(value: Matrix): LogisticRegression.this.type
def setUpperBoundsOnCoefficients(value: Matrix): LogisticRegression.this.type
def setLowerBoundsOnIntercepts(value: Vector): LogisticRegression.this.type
def setUpperBoundsOnIntercepts(value: Vector): LogisticRegression.this.type

An example is provided as follows:

import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.param.{ParamMap, ParamPair}

// trainingData (Dataset[_]) and the values maxIter (Int), regParam (Double),
// and tol (Double) are assumed to be defined by the caller.
val logR = new LogisticRegression()

// Parameter for def fit(dataset: Dataset[_], paramMap: ParamMap).
val paramMap = ParamMap(logR.maxIter -> maxIter)
  .put(logR.regParam, regParam)

// Parameter for def fit(dataset: Dataset[_], paramMaps: Array[ParamMap]).
// The array has two slots, so the loop index must stop before 2.
val paramMaps: Array[ParamMap] = new Array[ParamMap](2)
for (i <- 0 until 2) {
  paramMaps(i) = ParamMap(logR.maxIter -> maxIter)
    .put(logR.regParam, regParam)
}

// Parameters for def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*).
val regParamPair = ParamPair(logR.regParam, regParam)
val maxIterParamPair = ParamPair(logR.maxIter, maxIter)
val tolParamPair = ParamPair(logR.tol, tol)

// Call the fit APIs. Each call returns its own model (or a sequence of models).
val model1 = logR.fit(trainingData)
val model2 = logR.fit(trainingData, paramMap)
val models = logR.fit(trainingData, paramMaps)
val model3 = logR.fit(trainingData, regParamPair, maxIterParamPair, tolParamPair)

Output: LogisticRegressionModel. The following table lists the field that the model outputs during prediction.
Parameter	Value Type	Default Value	Description
predictionCol	Double	prediction	Predicted label
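
As a rough illustration of where the prediction field appears, the sketch below fits a model on a tiny made-up data set and calls transform. It assumes Spark is on the classpath and is meant for a Scala script or spark-shell; the local session settings, the application name, and the sample values are assumptions for illustration only.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

// Local session for illustration only; cluster settings will differ.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("LogisticRegressionPredictionSketch")
  .getOrCreate()

// Tiny made-up data set using the default column names "label" and "features".
val samples = spark.createDataFrame(Seq(
  (0.0, Vectors.dense(0.0, 1.1)),
  (0.0, Vectors.dense(0.2, 1.0)),
  (1.0, Vectors.dense(2.0, -0.5)),
  (1.0, Vectors.dense(2.2, -0.3))
)).toDF("label", "features")

val model = new LogisticRegression().setMaxIter(10).fit(samples)

// transform appends the output columns, including predictionCol ("prediction").
val predictions = model.transform(samples)
predictions.select("label", "prediction").show()
```

The prediction column holds the predicted label as a Double, in the same encoding as the label column.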

Example

import org.apache.spark.ml.classification.LogisticRegression

// Load training data
val training = spark.read.format("libsvm").load("data/mllib/sample_libsvm_data.txt")

val lr = new LogisticRegression()
  .setMaxIter(10)
  .setRegParam(0.3)
  .setElasticNetParam(0.8)

// Fit the model
val lrModel = lr.fit(training)

// Print the coefficients and intercept for logistic regression
println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")

// We can also use the multinomial family for binary classification
val mlr = new LogisticRegression()
  .setMaxIter(10)
  .setRegParam(0.3)
  .setElasticNetParam(0.8)
  .setFamily("multinomial")

val mlrModel = mlr.fit(training)
