Overview

  • You agree to comply with national laws, regulations, and public ethics when using the machine learning algorithm library. You shall not use the library to engage in any activities that violate the law, infringe on the rights and interests of others, disrupt social order, undermine social stability, or endanger or attempt to endanger the security of computer systems and networks.
  • You acknowledge and confirm that you are responsible for the risks arising from the processing of the machine learning algorithm library. The machine learning algorithm library is provided on an "as-is" basis. To the extent permitted by applicable laws, Huawei does not provide any express or implied guarantee for the machine learning algorithm library, including but not limited to its authenticity, applicability, non-infringement, and security.
  • You agree that Huawei shall assume no liability for any indirect, incidental, special, or any form of punitive damages, or any loss of profits, revenue, data, or data use.

Apache Spark is a unified analytics engine for large-scale data processing. It is scalable, computes in memory, and has become a common platform for fast, lightweight big data processing. Spark can run applications such as real-time stream processing, machine learning, and interactive queries on a variety of storage systems and operating systems. For more information about Spark, see the official Spark documentation.

The machine learning algorithm library is compatible with native Spark APIs. It provides optimized implementations of machine learning algorithms that greatly improve computing performance in big data algorithm scenarios. The library supports the Kunpeng processor architecture, and its latest version is 1.2.0. It provides the following machine learning algorithms:

  • Gradient boosted decision tree (GBDT)
  • Random forest (RF)
  • Support vector machine (SVM)
  • K-means clustering (K-means)
  • Decision tree
  • Linear regression
  • Logistic regression
  • Principal component analysis (PCA)
  • Singular value decomposition (SVD)
  • Latent Dirichlet Allocation (LDA)
  • Prefix-projected pattern growth (PrefixSpan)
  • Alternating least squares (ALS)
  • K-nearest neighbors (KNN)
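
Because the library is described as compatible with native Spark APIs, application code written against stock Spark MLlib should in principle not need to change. The following is a minimal sketch of K-means, one of the listed algorithms, using only the standard `pyspark.ml` API. It is an illustration, not this library's documented usage: the sample points, the function name `cluster_sample_points`, and the local Spark session are assumptions made for the example, and this section does not describe how the optimized library is deployed or whether its Python bindings are covered.

```python
# Minimal sketch: K-means through the stock Spark MLlib (pyspark.ml) API.
# Assumption: an API-compatible library would accept this code unchanged;
# how the optimized library is loaded is not covered here.

# Two well-separated groups of 2-D points (made-up illustration data).
SAMPLE_POINTS = [(0.0, 0.0), (1.0, 1.0), (9.0, 8.0), (8.0, 9.0)]


def cluster_sample_points(k=2, seed=1):
    """Fit K-means on SAMPLE_POINTS and return {point: cluster label}."""
    # Imported inside the function so this module also loads on machines
    # where PySpark is not installed.
    from pyspark.ml.clustering import KMeans
    from pyspark.ml.linalg import Vectors
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")          # hypothetical local session for the demo
        .appName("kmeans-sketch")
        .getOrCreate()
    )
    try:
        # Spark ML estimators expect a vector column, "features" by default.
        rows = [(Vectors.dense([x, y]),) for x, y in SAMPLE_POINTS]
        df = spark.createDataFrame(rows, ["features"])

        model = KMeans(k=k, seed=seed).fit(df)

        # transform() appends a "prediction" column with the cluster index.
        result = model.transform(df).select("features", "prediction").collect()
        return {
            tuple(float(v) for v in row["features"]): row["prediction"]
            for row in result
        }
    finally:
        spark.stop()


if __name__ == "__main__":
    for point, label in sorted(cluster_sample_points().items()):
        print(point, "-> cluster", label)
```

The cluster indices themselves are arbitrary. What matters is that the two points near the origin share one label and the two points near (9, 9) share the other.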