Obtaining Code

  • To run machine learning algorithms, you need to obtain the core JAR files and adaptation JAR files of the algorithm library. The adaptation JAR files can be compiled from the adaptation code of the library or directly obtained. Table 1 shows how to obtain the JAR files and code.
  • To run algorithms other than XGBoost, you need to deploy boostkit-ml-acc_2.11-1.3.0-spark2.3.2.jar, boostkit-ml-core_2.11-1.3.0-spark2.3.2.jar, and boostkit-ml-kernel-2.11-1.3.0-spark2.3.2-aarch64.jar.
  • To run the XGBoost algorithm, you need to deploy libboostkit_xgboost_kernel.so, boostkit-xgboost4j-kernel-2.11-1.3.0-spark2.3.2-aarch64.jar, boostkit-xgboost4j_2.11-1.3.0.jar, and boostkit-xgboost4j-spark2.3.2_2.11-1.3.0.jar.
  • boostkit-ml-kernel-client_2.11-1.3.0-spark2.3.2.jar does not need to be deployed in the Spark cluster. It is a compile-time dependency used only during application development.

Obtaining the Adaptation Code (Spark-ml-algo-lib) of the Machine Learning Algorithm Library

The adaptation code is developed based on the open source software Spark 2.3.2 and 2.4.6 and is used for compiling the machine learning algorithm library.

Download the open source repository code that adapts to Spark 2.3.2 or Spark 2.4.6 for the algorithm library to a specified directory, for example, /opt/, and decompress the package. (The following uses the package that adapts to Spark 2.3.2 as an example.)

cd /opt/
unzip Spark-ml-algo-lib-1.3.0-spark2.3.2.zip

The adaptation code of the machine learning algorithm library is built by incorporating some native code files of Spark 2.3.2, Breeze 0.13.1, and XGBoost 1.1.0 into the patch. For details about how to build the code, see References.

Precompiled adaptation packages are also available for the machine learning algorithm library. Table 1 shows how to obtain them. If you use a precompiled package, save it to a specified directory, for example, /opt/, then skip Compiling Code and go directly to installation and deployment. To build the packages yourself, see Compiling Code.

boostkit-ml-kernel-client_2.11-1.3.0-spark2.3.2.jar and boostkit-ml-kernel-client_2.11-1.3.0-spark2.4.6.jar are dependency libraries for application development. They are needed only at compile time and do not need to be deployed in the Spark cluster.
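
When a build script must choose the kernel-client JAR that matches the Spark version in use, the two file names given above can be derived from the version string. The following is a minimal sketch, not part of the BoostKit tooling: the function name kernel_client_jar is hypothetical, and the /opt/ location is the example directory from this document.

```shell
# Print the kernel-client JAR name for a given Spark version.
# Only the two Spark versions named in this document are accepted.
kernel_client_jar() {
    local spark_version="$1"
    case "$spark_version" in
        2.3.2|2.4.6)
            echo "boostkit-ml-kernel-client_2.11-1.3.0-spark${spark_version}.jar"
            ;;
        *)
            echo "unsupported Spark version: ${spark_version}" >&2
            return 1
            ;;
    esac
}

# Resolve the compile-time dependency, assuming the JAR was saved to /opt/.
CLIENT_JAR="/opt/$(kernel_client_jar 2.3.2)"
echo "$CLIENT_JAR"
```

The resulting path is intended for the compile classpath of the application only. As stated above, this JAR is not deployed to the Spark cluster.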

Obtaining the Core JAR File of the Machine Learning Algorithm Library

BoostKit-ml_1.3.0.zip can be obtained from the Huawei support website (see Obtaining the Software). Decompress the package to obtain boostkit-ml-kernel-2.11-1.3.0-spark2.3.2-aarch64.jar, boostkit-xgboost4j-kernel-2.11-1.3.0-spark2.3.2-aarch64.jar, and libboostkit_xgboost_kernel.so, and save them to the /opt/ directory.

  1. Decompress BoostKit-ml_1.3.0.zip.
    cd /opt/
    unzip BoostKit-ml_1.3.0.zip
    
  2. Copy boostkit-ml-kernel-2.11-1.3.0-spark2.3.2-aarch64.jar, boostkit-xgboost4j-kernel-2.11-1.3.0-spark2.3.2-aarch64.jar, and libboostkit_xgboost_kernel.so to the /opt/ directory.
    cd BoostKit-ml_1.3.0
    cp boostkit-ml-kernel-2.11-1.3.0-spark2.3.2-aarch64.jar /opt/
    cp boostkit-xgboost4j-kernel-2.11-1.3.0-spark2.3.2-aarch64.jar /opt/
    cp libboostkit_xgboost_kernel.so /opt/
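
After the copy step, it can be useful to confirm that all three files actually reached the target directory. The following is a minimal sketch of such a check. The function name check_files and the TARGET_DIR variable are hypothetical helpers introduced for illustration, and the file names are those used in step 2 for the Spark 2.3.2 package.

```shell
# Report which of the listed files are absent from a directory.
# Prints "MISSING: <name>" for each absent file and returns 1 if any is absent.
check_files() {
    local dir="$1"
    shift
    local status=0
    local name
    for name in "$@"; do
        if [ ! -f "${dir}/${name}" ]; then
            echo "MISSING: ${name}"
            status=1
        fi
    done
    return "$status"
}

# TARGET_DIR is a hypothetical override; it defaults to /opt as in this document.
if check_files "${TARGET_DIR:-/opt}" \
    boostkit-ml-kernel-2.11-1.3.0-spark2.3.2-aarch64.jar \
    boostkit-xgboost4j-kernel-2.11-1.3.0-spark2.3.2-aarch64.jar \
    libboostkit_xgboost_kernel.so
then
    echo "All core files are present."
fi
```

The check only tests that each file exists. It does not validate file contents, so it complements the package verification rather than replacing it.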
    

Verify the downloaded software package to confirm that it is identical to the one published on the website. The verification method is as follows:

  1. Obtain the digital certificate and software.

    The software package of the current version is restricted for commercial use. You need to submit an application and wait for approval before downloading the software package.

  2. Obtain the verification tool and method from the following link:
  3. Verify the software package integrity by following the procedure described in the OpenPGP Signature Verification Guide obtained in step 2.
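
The OpenPGP Signature Verification Guide remains the authoritative procedure. As a rough illustration of what such a check typically looks like, the sketch below wraps the standard GnuPG command gpg --verify. It assumes that GnuPG is installed and that the signature is a detached file delivered with the package. The function name verify_package, the key file name KEYS, and the .asc signature file name are assumptions, not names taken from the guide.

```shell
# Verify a detached OpenPGP signature for a downloaded package.
# Returns 2 if an input file is missing; otherwise returns gpg's exit status
# (0 means the signature is good).
verify_package() {
    local package="$1"
    local signature="$2"
    local path
    for path in "$package" "$signature"; do
        if [ ! -f "$path" ]; then
            echo "not found: ${path}" >&2
            return 2
        fi
    done
    gpg --verify "$signature" "$package"
}

# Hypothetical usage (KEYS and the .asc file name are assumptions):
#   gpg --import KEYS
#   verify_package BoostKit-ml_1.3.0.zip BoostKit-ml_1.3.0.zip.asc
```

Checking for the input files first keeps a missing download distinguishable from a failed signature check, which gpg reports through its own non-zero exit status.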

This document uses the BoostKit algorithm package based on Spark 2.3.2 as an example and also applies to the BoostKit algorithm package based on Spark 2.4.6.