Overview

You agree to comply with national laws, regulations, and public ethics when using the Kunpeng BoostKit machine learning algorithm library. You shall not use the library to engage in any activities that violate the law, infringe on the rights and interests of others, disrupt social order, undermine social stability, or engage in any activities that endanger or attempt to endanger the computer system and network security.

You acknowledge and confirm that you are responsible for the risks arising from your use of the machine learning algorithm library. The machine learning algorithm library is provided on an "as-is" basis. To the extent permitted by applicable laws, Huawei does not provide any explicit or implicit guarantee for the machine learning algorithm library, including but not limited to its authenticity, applicability, non-infringement, and security.

You agree that Huawei shall assume no liability for any indirect, incidental, special, or punitive damages of any form, or for any loss of profits, revenue, data, or data use.

You acknowledge and agree that you need to download and integrate the open source and third-party software on which the software package of the Kunpeng BoostKit machine learning algorithm library depends. Huawei does not assume any responsibility for vulnerabilities or security issues in that software.

Apache Spark is a unified analytics engine for large-scale data processing. It features scalability and in-memory computing and is widely used as a unified platform for fast, lightweight big data processing. Spark can run applications such as real-time stream processing, machine learning, and interactive queries across a variety of storage systems and operating systems. For more information about Spark, see the official Spark documentation.

The Kunpeng BoostKit machine learning algorithm library (referred to as the machine learning algorithm library) is compatible with native Spark APIs. The exception is KNN, a Huawei-developed algorithm that has no native Spark API. The library provides optimized implementations of machine learning algorithms, greatly improving computing performance in big data algorithm scenarios. It supports the Kunpeng processor architecture, and its latest version is 1.3.0. The library provides the following machine learning algorithms:

Gradient boosted decision tree (GBDT)
Random forest (RF)
Support vector machine (SVM)
K-means clustering (K-means)
Decision tree (DecisionTree)
Linear regression (LinearRegression)
Logistic regression (LogisticRegression)
Principal component analysis (PCA)
Singular value decomposition (SVD)
Latent Dirichlet allocation (LDA)
Prefix-projected pattern growth (PrefixSpan)
Alternating least squares (ALS)
K-nearest neighbors (KNN)
Covariance
Density-based spatial clustering of applications with noise (DBSCAN)
Pearson correlation coefficient (Pearson)
Spearman's rank correlation coefficient (Spearman)
Extreme gradient boosting (XGBoost)
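
To make two of the listed statistics concrete, the following minimal sketch computes the Pearson and Spearman coefficients in plain Python. It is an illustration of the definitions only, not the library's implementation and not its Spark API. The helper names `pearson`, `ranks`, and `spearman` and the sample data are assumptions introduced here for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Assign 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values equal to values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [1.0, 8.0, 27.0, 64.0, 125.0]  # monotonic in x, but not linear
    print("Pearson: ", pearson(x, y))
    print("Spearman:", spearman(x, y))
```

For the sample data, the relationship is monotonic but not linear, so the Spearman coefficient reaches 1 while the Pearson coefficient stays below 1. This is the practical difference between the two measures.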
