Setting Up the Development Environment

Client Environment

Table 1 describes the client environment requirements.

**Table 1** Client environment requirements
Item	Version	Remarks
OS	Windows 7 or later	Prepare it in advance.
JDK	OpenJDK 1.8	See Creating a Project.
Development tool	Eclipse or IntelliJ IDEA is recommended. This document uses IntelliJ IDEA 2018.2 as an example.	Prepare it in advance.
Scala	For Spark 2.3.2 and Spark 2.4.6, the recommended Scala version is 2.11.8. Complete the basic configuration of the Scala environment.	See Creating a Project.
Maven	3.6.3 (recommended). Used to compile the project package.	See Creating a Project.
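
The versions in Table 1 can be checked with a small script before creating a project. The following is a minimal sketch, not part of the product: the helper names `extract_version` and `matches_recommended` are hypothetical, the prefix match (so that any 1.8.x JDK build is accepted) is an assumption, and the Scala value applies to Spark 2.3.2 and Spark 2.4.6 only. The functions work on the text printed by `java -version`, `scala -version`, and `mvn -v`.

```python
import re

# Recommended client versions taken from Table 1. Matching by prefix is an
# assumption of this sketch, so that any 1.8.x JDK build is accepted.
RECOMMENDED = {
    "jdk": "1.8",
    "scala": "2.11.8",  # recommended for Spark 2.3.2 and Spark 2.4.6
    "maven": "3.6.3",
}

# Matches tokens such as 1.8.0_262, 2.11.8, or 3.6.3.
_VERSION = re.compile(r"\d+(?:\.\d+)+(?:_\d+)?")


def extract_version(tool_output):
    """Return the first version-like token in a tool's output, or None."""
    match = _VERSION.search(tool_output)
    return match.group(0) if match else None


def matches_recommended(tool, tool_output):
    """Return True if the reported version equals or extends the recommended one."""
    found = extract_version(tool_output)
    if found is None:
        return False
    wanted = RECOMMENDED[tool]
    return found == wanted or found.startswith(wanted + ".")
```

For example, `matches_recommended("jdk", 'openjdk version "1.8.0_262"')` returns `True`, while a Scala 2.12 version string is rejected for the `scala` entry.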

Obtaining the Software

Table 2 describes how to obtain the library packages of the machine learning algorithm library.

**Table 2** How to obtain the library packages
Applicable Spark Version	Software Package and URL	Remarks
Spark 2.3.2/2.4.5/2.4.6/3.1.1	Huawei technical support websites: Enterprise website, Carrier website	NA
Spark 2.3.2	boostkit-ml-acc_2.11-2.2.0-spark2.3.2.jar boostkit-ml-core_2.11-2.2.0-spark2.3.2.jar boostkit-ml-kernel-client_2.11-2.2.0-spark2.3.2.jar	For details about how to compile the packages, see Compiling the Code in the Big Data Machine Learning Algorithm Library Feature Guide. boostkit-ml-acc_2.XX-XXX-sparkXX.jar: required for software running; must be deployed. boostkit-ml-core_2.XX-XXX-sparkXX.jar: required for software running; must be deployed. boostkit-ml-kernel-client_2.XX-XXX-sparkXX.jar: required for software compilation; does not need to be deployed. boostkit-xgboost4j_XXX.jar: adaptation package required by the XGBoost algorithm, which can be compiled from the open source adaptation code; required for software running and must be deployed.
Spark 2.3.2	boostkit-xgboost4j_2.11-2.2.0.jar boostkit-xgboost4j-spark2.3.2_2.11-2.2.0.jar
Spark 2.4.5/2.4.6	boostkit-ml-acc_2.11-2.2.0-spark2.4.6.jar boostkit-ml-core_2.11-2.2.0-spark2.4.6.jar boostkit-ml-kernel-client_2.11-2.2.0-spark2.4.6.jar
Spark 2.4.5/2.4.6	boostkit-xgboost4j_2.11-2.2.0.jar boostkit-xgboost4j-spark2.4.6_2.11-2.2.0.jar
Spark 3.1.1	boostkit-ml-acc_2.12-2.2.0-spark3.1.1.jar boostkit-ml-core_2.12-2.2.0-spark3.1.1.jar boostkit-ml-kernel-client_2.12-2.2.0-spark3.1.1.jar
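
The package names in Table 2 follow a regular pattern, so the set needed for a given Spark version can be derived programmatically. The following is a minimal sketch under that reading of the table: the names `packages_for` and `is_compile_only` are hypothetical helpers, not part of the library. The XGBoost adaptation packages are returned only for the Spark versions for which Table 2 lists them.

```python
# Library version used in all package names in Table 2.
ML_VERSION = "2.2.0"

# Spark version -> (Scala binary version, Spark suffix used in the jar names).
_BUILDS = {
    "2.3.2": ("2.11", "2.3.2"),
    "2.4.5": ("2.11", "2.4.6"),  # Table 2 lists the spark2.4.6 jars for 2.4.5
    "2.4.6": ("2.11", "2.4.6"),
    "3.1.1": ("2.12", "3.1.1"),
}

# Table 2 lists XGBoost adaptation jars only for these Spark versions.
_XGBOOST_VERSIONS = {"2.3.2", "2.4.5", "2.4.6"}


def packages_for(spark_version):
    """Return the jar names that Table 2 lists for the given Spark version."""
    if spark_version not in _BUILDS:
        raise ValueError("Spark %s is not listed in Table 2" % spark_version)
    scala, suffix = _BUILDS[spark_version]
    jars = [
        f"boostkit-ml-{part}_{scala}-{ML_VERSION}-spark{suffix}.jar"
        for part in ("acc", "core", "kernel-client")
    ]
    if spark_version in _XGBOOST_VERSIONS:
        jars.append(f"boostkit-xgboost4j_{scala}-{ML_VERSION}.jar")
        jars.append(f"boostkit-xgboost4j-spark{suffix}_{scala}-{ML_VERSION}.jar")
    return jars


def is_compile_only(jar_name):
    """The kernel-client jar is needed for compilation only and is not deployed."""
    return "kernel-client" in jar_name
```

Filtering the result with `is_compile_only` gives the packages to deploy. The remarks in Table 2 do not explicitly classify the `boostkit-xgboost4j-spark…` jar, so treating it as a deployable package is an assumption of this sketch.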

After obtaining the BoostKit-ml_2.2.0.zip software package, verify that it matches the package provided on the website.

Verify the software package as follows:

1. Obtain the digital certificate and the software package.
2. Obtain the verification tool and method from the following link: https://support.huawei.com/enterprise/en/tool/pgp-verify-TL1000000054
3. Verify the integrity of the software package by following the procedure described in the OpenPGP Signature Verification Guide obtained from that link.
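
If GnuPG is the OpenPGP tool in use, the final check typically comes down to a `gpg --verify` call on a detached signature. The following is a minimal sketch of building that command, not a replacement for the guide: the helper name `gpg_verify_command` and the default `.asc` signature suffix are assumptions, and the public key is assumed to have been imported as the guide describes.

```python
def gpg_verify_command(package_path, signature_path=None):
    """Build a `gpg --verify` command for a detached signature.

    The `.asc` suffix used by default is an assumption of this sketch; use
    the signature file name given in the verification guide.
    """
    if signature_path is None:
        signature_path = package_path + ".asc"
    # gpg expects the signature file first, then the signed file.
    return ["gpg", "--verify", signature_path, package_path]
```

The returned list can be passed to `subprocess.run(..., check=True)`. gpg exits with a non-zero status when the signature does not match, which then raises `CalledProcessError`.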

Cluster Environment

Prepare the required cluster environment before algorithm development. Table 3 lists the required software versions.

**Table 3** Cluster environment requirements
Item	Requirement
OS	openEuler-20.03-LTS-SP1
JDK	BiSheng JDK 1.8.0_262
ZooKeeper	3.4.9
Hadoop	3.1.1
Spark	Apache Spark 2.3.2, 2.4.5, 2.4.6, or 3.1.1

The Kunpeng algorithm library is compatible with Apache Spark 2.3.2, 2.4.5, 2.4.6, and 3.1.1. Other platforms are not verified. For security purposes, you are advised to use a later version.
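
A cluster inventory can be compared against Table 3 before development starts. The following is a minimal sketch: the helper name `unverified_components` and the dictionary-based inventory format are assumptions, and only the components with plain version numbers in Table 3 are covered.

```python
# Verified component versions taken from Table 3.
VERIFIED = {
    "ZooKeeper": {"3.4.9"},
    "Hadoop": {"3.1.1"},
    "Spark": {"2.3.2", "2.4.5", "2.4.6", "3.1.1"},
}


def unverified_components(installed):
    """Return {component: version} for entries that differ from Table 3.

    `installed` maps component names to version strings. A component that
    is missing from `installed` is reported with the value None.
    """
    problems = {}
    for name, allowed in VERIFIED.items():
        version = installed.get(name)
        if version not in allowed:
            problems[name] = version
    return problems
```

An empty result means that every listed component matches a version in Table 3. For example, an inventory with Hadoop 3.3.1 yields `{"Hadoop": "3.3.1"}`.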
