What's New

The following tables describe the latest updates to the Kunpeng BoostKit for SRA documentation. New features are released after they have been verified.

December 2025

No.

Update

Description

Document

1

Updated Kunpeng BoostKit for SRA Technical White Paper.

Added the Kunpeng retrieval library, KNewPfordelta, and TensorFlow Serving ANNC features.

Kunpeng BoostKit for SRA Technical White Paper

2

Updated the Kunpeng recall algorithm library document.

  • Open-sourced the KScaNN and KBest code.
  • Added the Elasticsearch and hnswlib algorithms, and open-sourced the code.

Kunpeng BoostKit for SRA Kunpeng Recall Algorithm Library Feature Documentation

3

Updated the Kunpeng Retrieval Library documentation.

Added support for the Kunpeng 920 7592C processor.

Kunpeng BoostKit for SRA Kunpeng Retrieval Library Feature Documentation

4

Updated the Kunpeng artificial intelligence library document.

  • Renamed the Kunpeng AI library to Kunpeng AI Operator Library (KDNN).
  • Renamed the KAIL_DNN sublibrary to KDNN.
  • Renamed the KAIL_DNN_EXT sublibrary to KDNN_EXT.
  • Added the NEON implementation of MatMul.
  • Added a custom thread pool mode for MatMul.
  • Added support for Group Normalization and SparseGemm deep neural network operators on the Kunpeng platform.

Kunpeng BoostKit for SRA Kunpeng Artificial Intelligence Library Feature Documentation

5

Updated the Search and Recommendation Ranking Model Inference Benchmark Test Guide.

  • Added the procedure for using TensorFlow Serving as the inference server.
  • Added the procedure for enabling system-level optimization, ANNC graph compilation optimization, and KDNN operator library optimization on Kunpeng servers to achieve optimal inference performance.

Search and Recommendation Ranking Model Inference Benchmark Test Guide

September 2025

No.

Update

Description

Document

1

Updated Kunpeng BoostKit for SRA Technical White Paper.

Added the Kunpeng retrieval library, KNewPfordelta, and TensorFlow Serving ANNC features.

Kunpeng BoostKit for SRA Technical White Paper

2

Updated the Kunpeng recall algorithm library document.

  • Open-sourced the KVecTurbo sublibrary code.
  • Added the KNewPfordelta sublibrary and open-sourced its code.

Kunpeng BoostKit for SRA Kunpeng Recall Algorithm Library Feature Documentation

3

Added the Kunpeng retrieval library.

This library is optimized for the Kunpeng platform to accelerate vector retrieval. At the low level, it is tuned for the instruction set architecture and memory access mechanisms of the Kunpeng processor. By combining low-precision quantization with high-precision reranking, the library significantly improves the computational efficiency and throughput of recall algorithms without compromising accuracy. These optimizations make it suitable for high-concurrency recall scenarios.

Kunpeng BoostKit for SRA Kunpeng Retrieval Library Feature Documentation

4

Added the TensorFlow Serving ANNC feature.

This feature is an extended acceleration suite built on open source OpenXLA and hosted in the ANNC repository maintained by the openEuler community. The suite includes optimizations tailored for the Kunpeng platform, such as TensorFlow graph fusion, Accelerated Linear Algebra (XLA) graph fusion, and operator optimization.

Kunpeng BoostKit for SRA TensorFlow Serving ANNC Feature Document
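
The Kunpeng retrieval library entry in the table above mentions combining low-precision quantization with high-precision reranking. The general idea can be sketched in plain Python as follows. This is only an illustration of the technique, not the library's implementation or API: the function names (`quantize`, `search`, `exact_search`), the choice of symmetric int8 scalar quantization, and the inner-product metric are all assumptions made for the example.

```python
def quantize(vec, scale):
    """Map each float to a clamped signed 8-bit integer (hypothetical low-precision code)."""
    return [max(-127, min(127, round(x / scale))) for x in vec]


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def exact_search(query, database, k):
    """Reference top-k search using full-precision inner products only."""
    order = sorted(range(len(database)),
                   key=lambda i: dot(query, database[i]), reverse=True)
    return order[:k]


def search(query, database, k, shortlist_size):
    """Two-stage top-k search: coarse scan on int8 codes, then exact reranking."""
    # Assumption: one shared scale so quantized scores are comparable across vectors.
    max_abs = max(abs(x) for vec in database + [query] for x in vec) or 1.0
    scale = max_abs / 127.0

    q_query = quantize(query, scale)
    # In a real index these codes would be computed once at build time.
    q_db = [quantize(vec, scale) for vec in database]

    # Stage 1: cheap approximate scores on the low-precision codes.
    coarse = sorted(range(len(database)),
                    key=lambda i: dot(q_query, q_db[i]), reverse=True)
    shortlist = coarse[:shortlist_size]

    # Stage 2: rerank only the shortlist with full-precision vectors.
    reranked = sorted(shortlist,
                      key=lambda i: dot(query, database[i]), reverse=True)
    return reranked[:k]


if __name__ == "__main__":
    db = [[float(i), float((i * 7) % 11)] for i in range(20)]
    q = [1.0, 0.3]
    print("two-stage:", search(q, db, k=5, shortlist_size=10))
    print("exact:    ", exact_search(q, db, k=5))
```

The shortlist size is the knob in this sketch: a larger shortlist costs more full-precision work in the second stage but makes it less likely that quantization error drops a true top-k result in the first stage.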

June 2025

No.

Update

Description

Document

1

Updated Kunpeng BoostKit for SRA Technical White Paper.

Updated the descriptions of KScaNN, KBest, and their integration with a vector database, and added TensorFlow Serving thread scheduling optimization.

Kunpeng BoostKit for SRA Technical White Paper

2

Updated the Kunpeng recall algorithm library document.

  • Added the SetEarlyStoppingParams API, Add API, and KBest constructor to the KBest library.
  • Integrated the KScaNN library with Milvus.

Kunpeng BoostKit for SRA Kunpeng Recall Algorithm Library Feature Documentation

3

Updated the Kunpeng inference acceleration kit document.

Added Kunpeng ONNX Runtime (KONNX), an efficient ONNX Runtime library developed by Kunpeng.

Kunpeng BoostKit for SRA Kunpeng Inference Acceleration Kit Feature Documentation

4

Updated the Kunpeng artificial intelligence library document.

  • Added support in the KAIL_DNN library for deep neural network operators on the Kunpeng platform, including Pool, Batch Normalization, Local Response Normalization, Reduction, PReLU, Binary, and RNN.
  • Added support in the KAIL_DNN library for Kunpeng 920 72F8 processors.

Kunpeng BoostKit for SRA Kunpeng Artificial Intelligence Library Feature Documentation

5

Added the TensorFlow Serving thread scheduling optimization document.

Kunpeng BoostKit provides a thread scheduling optimization solution that improves TensorFlow Serving inference performance.

Kunpeng BoostKit for SRA TensorFlow Serving Thread Scheduling Optimization Feature Documentation

6

Added the ONNX Runtime Porting Guide.

This document describes how to install, compile, and verify ONNX Runtime 1.19.2 on openEuler 22.03 LTS SP3 running on the Kunpeng 920 series processor.

ONNX Runtime Porting Guide

March 2025

No.

Update

Description

Document

1

Updated the Kunpeng recall algorithm library document.

  • Added the SaveGraph, LoadGraph, Serialize, Deserialize, BuildSearcher, GetNTotal, and GetDim APIs to the KBest library, and modified the Add, Save, and Load APIs. These APIs are now integrated with Milvus.
  • Added KVecTurbo, a proprietary vector retrieval acceleration component of the Kunpeng recall algorithm library. KVecTurbo is integrated with the openGauss vector database.

Kunpeng BoostKit for SRA Kunpeng Recall Algorithm Library Feature Documentation

2

Added the TVM Porting Guide.

This document describes how to compile and install TVM 0.9.0 and optimize its performance on openEuler 22.03 LTS SP3 running on the Kunpeng processor.

TVM Porting Guide

3

Added the Search and Recommendation Ranking Model Inference Benchmark Test Guide.

This document describes the entire process of deploying the ModelZoo search and recommendation ranking models on openEuler 22.03 LTS SP3 running on the Kunpeng processor for inference performance tests. The process includes setting up the server and client test environments and testing performance in the inference phase.

Search and Recommendation Ranking Model Inference Benchmark Test Guide

4

Added the Kunpeng BoostKit for SRA Technical White Paper.

This document describes the solution, architecture, specifications, application scenarios, and typical configurations of Kunpeng BoostKit for SRA.

Kunpeng BoostKit for SRA Technical White Paper

December 2024

No.

Update

Description

Document

1

Added the Kunpeng recall algorithm library.

  • Added KBest, a proprietary, efficient graph search algorithm.
  • Added KScaNN, a proprietary algorithm tailored to Kunpeng, which optimizes the index layout and algorithm process based on the Kunpeng architecture.

Kunpeng BoostKit for SRA Kunpeng Recall Algorithm Library Feature Documentation

2

Added SRA_Inference, the Kunpeng inference acceleration kit.

Added the Kunpeng Inference Acceleration Kit Developer Guide and its release notes, which provide the SRA_Inference version mapping, installation guide, API description, and sample code to help you get started quickly.

Kunpeng BoostKit for SRA Kunpeng Inference Acceleration Kit Feature Documentation

3

Added the Kunpeng inference AI operator library.

  • Added support for deep neural network operators on the Kunpeng platform.
  • Added support for the random_choice and softmax operators on the Kunpeng platform.

Kunpeng BoostKit for SRA Kunpeng Artificial Intelligence Library Feature Documentation

4

Added the TensorFlow Serving inference deployment test guide.

This document describes how to port the TensorFlow Serving inference deployment framework to openEuler 22.03 LTS SP3 running on the Kunpeng processor and how to deploy test models for stress tests.

TensorFlow Serving Inference Deployment Test Guide

June 2024

No.

Update

Description

Document

1

Added a porting guide.

Added a porting guide for the deep learning recommendation model (DLRM). The guide describes how to train, run, and verify DLRM based on the Kunpeng processor.

DLRM Porting Guide

March 2024

No.

Update

Description

Document

1

Added the porting guides for open source components.

Added the porting guides for open source components: oneDNN, PyTorch, TensorFlow, and ScaNN. The guides describe how to install, configure, and verify those components based on the Kunpeng processor.