SVM

  • Model API

    LinearSVC is an ML classification API.

    Model API Type: ML Classification API

    Function API:

    def fit(dataset: Dataset[_]): LinearSVCModel

    def fit(dataset: Dataset[_], paramMap: ParamMap): LinearSVCModel

    def fit(dataset: Dataset[_], paramMaps: Array[ParamMap]): Seq[LinearSVCModel]

    def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*): LinearSVCModel

  • Input: the fit API accepts the model parameters paramMap, paramMaps, firstParamPair, and otherParamPairs, which are described as follows:

    | Param name | Type(s) | Example | Description |
    |---|---|---|---|
    | paramMap | ParamMap | ParamMap(A.c -> b) | Assigns the value b to the parameter c of model A. |
    | paramMaps | Array[ParamMap] | new Array[ParamMap](n) | Creates an array of n ParamMap model parameter lists. |
    | firstParamPair | ParamPair | ParamPair(A.c, b) | Assigns the value b to the parameter c of model A. |
    | otherParamPairs | ParamPair | ParamPair(A.e, f) | Assigns the value f to the parameter e of model A. |
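
The parameter objects in the table above can be built without a dataset or a SparkSession. The following is a minimal sketch, assuming the stock Spark ML `LinearSVC`, `ParamMap`, and `ParamPair` classes are on the classpath. The object name `ParamObjectsSketch` and the literal values are illustrative and do not come from this document.

```scala
import org.apache.spark.ml.classification.LinearSVC
import org.apache.spark.ml.param.{ParamMap, ParamPair}

// Hypothetical holder object, used only to group the illustrative values.
object ParamObjectsSketch {
  // Creating the estimator does not require a SparkSession.
  val svm = new LinearSVC()

  // paramMap: assigns 0.1 to svm.regParam and 50 to svm.maxIter (illustrative values).
  val paramMap: ParamMap = ParamMap(svm.regParam -> 0.1).put(svm.maxIter, 50)

  // paramMaps: one ParamMap per candidate regParam value.
  val paramMaps: Array[ParamMap] =
    Array(0.01, 0.1).map(r => ParamMap(svm.regParam -> r))

  // firstParamPair / otherParamPairs: each pair binds one value to one parameter.
  val firstParamPair: ParamPair[Double] = ParamPair(svm.regParam, 0.1)
  val otherParamPair: ParamPair[Int] = ParamPair(svm.maxIter, 50)
}
```

Values stored in a `ParamMap` can be read back with `paramMap(svm.regParam)`, which is a convenient way to check what a given `fit` call will use.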

ML Classification API

  • Function

    This type of API takes sample data in Dataset format, calls the fit API, and outputs a LinearSVC classification model.

  • Input and output
    1. Package name: org.apache.spark.ml.classification
    2. Class name: LinearSVC
    3. Method name: fit
    4. Input: training sample data (Dataset[_]). Mandatory fields are as follows:

      | Param name | Type(s) | Default | Description |
      |---|---|---|---|
      | labelCol | Double | "label" | Label column (the class to predict) |
      | featuresCol | Vector | "features" | Feature vector column |

    5. Algorithm parameters

      Algorithm Parameter

      def setRegParam(value: Double): this.type

      def setMaxIter(value: Int): this.type

      def setFitIntercept(value: Boolean): this.type

      def setTol(value: Double): this.type

      def setStandardization(value: Boolean): this.type

      def setWeightCol(value: String): this.type

      def setThreshold(value: Double): this.type

      def setAggregationDepth(value: Int): this.type

    6. Added algorithm parameters

      | Parameter | Description | Type |
      |---|---|---|
      | inertiaCoefficient | Weight of historical direction information in the momentum calculation | Double. The value is a positive real number. |

      An example is provided as follows:

      import org.apache.spark.ml.classification.{LinearSVC, LinearSVCModel}
      import org.apache.spark.ml.param.{ParamMap, ParamPair}
      
      // Assumed to be defined by the caller: trainingData (Dataset[_]), regParam (Double),
      // regParams (Array[Double] with at least two values), and numIterations (Int).
      val svm = new LinearSVC()
      
      // Define the def fit(dataset: Dataset[_], paramMap: ParamMap) API parameter.
      val paramMap = ParamMap(svm.regParam -> regParam)
        .put(svm.maxIter, numIterations)
      
      // Define the def fit(dataset: Dataset[_], paramMaps: Array[ParamMap]) API parameter.
      val paramMaps: Array[ParamMap] = new Array[ParamMap](2)
      for (i <- 0 until 2) { // "until" keeps i inside the two-element array.
        paramMaps(i) = ParamMap(svm.regParam -> regParams(i))
          .put(svm.maxIter, numIterations)
      } // Assign a value to paramMaps.
      
      // Define the def fit(dataset: Dataset[_], firstParamPair: ParamPair[_], otherParamPairs: ParamPair[_]*) API parameter.
      // defaultParamPair and regParamPair set the same parameter; the later pair takes effect.
      val defaultParamPair = ParamPair(svm.regParam, regParam)
      val regParamPair = ParamPair(svm.regParam, regParam)
      val maxIterParamPair = ParamPair(svm.maxIter, numIterations)
      
      // Call the fit APIs.
      var model: LinearSVCModel = svm.fit(trainingData)  // Returns LinearSVCModel.
      model = svm.fit(trainingData, paramMap)  // Returns LinearSVCModel.
      val models = svm.fit(trainingData, paramMaps)  // Returns Seq[LinearSVCModel].
      model = svm.fit(trainingData, defaultParamPair, regParamPair, maxIterParamPair) // Returns LinearSVCModel.
    7. Output: LinearSVC classification model (LinearSVCModel). The following are fields output in model prediction:

      | Param name | Type(s) | Default | Description | Notes |
      |---|---|---|---|---|
      | prediction | Double | "prediction" | Predicted label | - |
      | label | Double | "label" | Label | Classification only |
      | features | Vector | "features" | Features | Classification only |
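
The input columns, algorithm setters, and output columns listed above can be tied together in one small run. This is a sketch under stated assumptions, not the documented workflow: it uses the stock Spark ML `LinearSVC` with a local SparkSession, a four-row toy dataset, and illustrative parameter values. The added `inertiaCoefficient` parameter is left out because its setter is not part of stock Spark. The object name `LinearSvcColumnsSketch` is hypothetical.

```scala
import org.apache.spark.ml.classification.LinearSVC
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

// Hypothetical object name; all data and parameter values are illustrative.
object LinearSvcColumnsSketch {
  def run(): Array[String] = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("linear-svc-columns-sketch")
      .getOrCreate()

    // Toy training data with the two mandatory columns:
    // "label" (Double) and "features" (Vector).
    val training = spark.createDataFrame(Seq(
      (0.0, Vectors.dense(0.0, 0.1)),
      (0.0, Vectors.dense(0.2, 0.0)),
      (1.0, Vectors.dense(1.0, 0.9)),
      (1.0, Vectors.dense(0.8, 1.0))
    )).toDF("label", "features")

    // Each setter returns this.type, so the calls can be chained.
    val svc = new LinearSVC()
      .setRegParam(0.1)
      .setMaxIter(10)
      .setFitIntercept(true)
      .setTol(1e-6)
      .setStandardization(true)
      .setThreshold(0.0)
      .setAggregationDepth(2)

    val model = svc.fit(training)

    // transform keeps the input columns and appends the prediction column.
    val columns = model.transform(training).columns
    spark.stop()
    columns
  }
}
```

Calling `LinearSvcColumnsSketch.run()` returns the column names of the transformed dataset, which makes it easy to confirm that `label`, `features`, and `prediction` are all present.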

  • Sample usage

    fit(dataset: Dataset[_]): LinearSVCModel example:

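    test("train") {
      // sdf is assumed to be a DataFrame with "label" and "features" columns.
      val svc = new LinearSVC().setMaxIter(100).setRegParam(0.1)
      svc.setIc(0.1)
      val msvc = svc.fit(sdf)
      val res = msvc.transform(sdf)
      // Accuracy: fraction of rows whose prediction equals the label.
      val accuracy = res.filter($"label" === $"prediction").count().toDouble / res.count
      println("svmx Accuracy = " + accuracy)
    }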
  • Sample result
    svmx Accuracy = 0.999406629786391
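
The accuracy in the sample is the number of rows whose prediction equals the label, divided by the total row count. The sketch below isolates that calculation on four hand-written (label, prediction) pairs that stand in for the output of `transform`, and compares it with Spark's `MulticlassClassificationEvaluator`. The data, the object name `AccuracySketch`, and the use of the evaluator are assumptions made for illustration; the original sample uses only the filter-and-count form.

```scala
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.sql.SparkSession

// Hypothetical object name; the rows below are made-up stand-ins for model output.
object AccuracySketch {
  def run(): (Double, Double) = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("accuracy-sketch")
      .getOrCreate()
    import spark.implicits._

    // Three of the four rows have prediction == label.
    val res = Seq((1.0, 1.0), (0.0, 0.0), (1.0, 0.0), (0.0, 0.0))
      .toDF("label", "prediction")

    // Same computation as the sample: matching rows divided by all rows.
    val manual = res.filter($"label" === $"prediction").count().toDouble / res.count()

    // The same metric computed by Spark's built-in evaluator.
    val evaluated = new MulticlassClassificationEvaluator()
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName("accuracy")
      .evaluate(res)

    spark.stop()
    (manual, evaluated)
  }
}
```

With three matching rows out of four, both values returned by `AccuracySketch.run()` are 0.75.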