What's New

The following tables list the latest updates to the Kunpeng BoostKit for SRA documents, newest first. New features are released only after they have been verified.

June 2026

No.	Update	Description	Document
1	Updated the Kunpeng BoostKit for SRA Technical White Paper.	Added the Kunpeng DiskANN, Embedding Lookup, and Kunpeng TensorRT-LLM features. Added SVE2 interfaces to KRL. Added the constant folding optimization feature to ANNC. Added the fused operator feature to TensorFlow.	Kunpeng BoostKit for SRA Technical White Paper
2	Added the Elasticsearch Porting Guide.	Split from the original Kunpeng Recall Algorithm Library Developer Guide. Describes how to install, compile, and verify Elasticsearch based on Kunpeng processors.	Elasticsearch Porting Guide
3	Updated the ONNX Runtime Porting Guide.	Modified the matrix partitioning method for matrix computation in ONNX Runtime.	ONNX Runtime Porting Guide

May 2026

No.	Update	Description	Document
1	Updated the Kunpeng BoostKit for SRA Technical White Paper.	Added the RaBitQ feature. Added FP16 support for the HNSW algorithm in Faiss and optimized IVFPQ. Added KDNN-specific top 5+ operators, which can be used together with some TensorFlow matrix operators.	Kunpeng BoostKit for SRA Technical White Paper
2	Migrated documents.	Migrated the KAIL documents from the original acceleration feature section to the BoostCore basic acceleration documentation site. Migrated the feature documents of Kunpeng Recall Algorithm Library, Kunpeng Inference Acceleration Kit, Kunpeng Retrieval Library, TensorFlow Serving thread scheduling optimization, and TensorFlow Serving ANNC from the original acceleration feature section to the BoostSRA documentation site.	KAIL documents BoostKit BoostSRA Before You Start
3	Added the BoostCore basic acceleration section and the Boost-X application acceleration section.	Added the BoostCore basic acceleration section and linked it to the KAIL documentation site. Added the Boost-X application acceleration section and linked it to the BoostSRA documentation site.	KAIL documents BoostKit BoostSRA Before You Start

December 2025

No.	Update	Description	Document
1	Updated the Kunpeng BoostKit for SRA Technical White Paper.	Added the Kunpeng retrieval library, KNewPfordelta, and TensorFlow Serving ANNC features.	Kunpeng BoostKit for SRA Technical White Paper
2	Updated the Kunpeng recall algorithm library document.	Released the open-source code for KScaNN and KBest. Added support for Kunpeng 950 processors for KScaNN and KBest. Added the Elasticsearch and hnswlib algorithms, and open-sourced the code.	Kunpeng Recall Algorithm Library Feature Documentation
3	Updated the Kunpeng retrieval library document.	Updated and released the software package. Added support for the Kunpeng 950 processor.	Kunpeng Retrieval Library Feature Documentation
4	Updated the Kunpeng artificial intelligence library document.	Renamed KAIL_DNN to KDNN. Renamed KAIL_DNN_EXT to KDNN_EXT. Added the NEON implementation of MatMul. Added the custom thread pool mode for MatMul. Added support for Group Normalization and SparseGemm deep neural network operators on the Kunpeng platform.	Kunpeng Artificial Intelligence Library Feature Documentation
5	Updated the Search and Recommendation Ranking Model Inference Benchmark Test Guide.	Added support for the Kunpeng 950 processor. Added the procedure for using TensorFlow Serving as the inference server. Added the procedure for enabling system-level optimization, ANNC graph compilation optimization, and KDNN operator library optimization on Kunpeng servers to obtain the optimal inference performance.	Search and Recommendation Ranking Model Inference Benchmark Test Guide

September 2025

No.	Update	Description	Document
1	Updated the Kunpeng BoostKit for SRA Technical White Paper.	Added the Kunpeng retrieval library, KNewPfordelta, and TensorFlow Serving ANNC features.	Kunpeng BoostKit for SRA Technical White Paper
2	Updated the Kunpeng recall algorithm library document.	Open-sourced the KVecTurbo sub-library code. Added the KNewPfordelta sub-library and open-sourced its code.	Kunpeng Recall Algorithm Library Feature Documentation
3	Added the Kunpeng retrieval library.	This library is optimized for the Kunpeng platform to accelerate vector retrieval. At the low level, it is tuned for the instruction set architecture and memory access mechanism of the Kunpeng processor. By combining low-precision quantization with high-precision reranking, the library significantly improves the computational efficiency and throughput of recall algorithms without compromising accuracy. These optimizations make it suitable for high-concurrency recall scenarios.	Kunpeng Retrieval Library Feature Documentation
4	Added the TensorFlow Serving ANNC feature.	This feature is an extended acceleration suite built on open-source OpenXLA and hosted in the ANNC repository maintained by the openEuler community. The suite includes optimizations tailored for the Kunpeng platform, such as TensorFlow graph fusion, Accelerated Linear Algebra (XLA) graph fusion, and operator optimization.	TensorFlow Serving ANNC Feature Document

June 2025

No.	Update	Description	Document
1	Updated the Kunpeng BoostKit for SRA Technical White Paper.	Updated KScaNN, KBest, and their connection to a vector database, and added TensorFlow Serving thread scheduling optimization.	Kunpeng BoostKit for SRA Technical White Paper
2	Updated the Kunpeng recall algorithm library document.	Added the SetEarlyStoppingParams API, Add API, and KBest constructor to the KBest library. Connected the KScaNN library to Milvus.	Kunpeng Recall Algorithm Library Feature Documentation
3	Updated the Kunpeng inference acceleration kit document.	Added the proprietary, efficient ONNX Runtime library (Kunpeng ONNX Runtime, KONNX).	Kunpeng Inference Acceleration Kit Feature Documentation
4	Updated the Kunpeng artificial intelligence library document.	Added support in the KAIL_DNN library for deep neural network operators on the Kunpeng platform, including Pool, Batch Normalization, Local Response Normalization, Reduction, PReLU, Binary, and RNN. Added support in the KAIL_DNN library for the new Kunpeng 920 processor model.	Kunpeng Artificial Intelligence Library Feature Documentation
5	Added the TensorFlow Serving thread scheduling optimization feature.	Kunpeng BoostKit developed a thread scheduling optimization solution to enhance TF Serving inference performance.	TensorFlow Serving Thread Scheduling Optimization Feature Documentation
6	Added the ONNX Runtime Porting Guide.	This document describes how to install, compile, and verify ONNX Runtime 1.19.2 on openEuler 22.03 LTS SP3 running on the Kunpeng 920 series processor.	ONNX Runtime Porting Guide

March 2025

No.	Update	Description	Document
1	Updated the Kunpeng recall algorithm library document.	Added the SaveGraph, LoadGraph, Serialize, Deserialize, BuildSearcher, GetNTotal, and GetDim APIs to the KBest library, and modified the Add, Save, and Load APIs. Currently, the APIs have been connected to Milvus. Added KVecturbo, a proprietary vector retrieval acceleration component of the Kunpeng recall algorithm library. KVecturbo works with the openGauss vector database.	Kunpeng Recall Algorithm Library Feature Documentation
2	Added the TVM Porting Guide.	This document describes how to compile and install TVM 0.9.0 and optimize performance on the openEuler 22.03 LTS SP3 OS based on the Kunpeng processor.	TVM Porting Guide
3	Added the Search and Recommendation Ranking Model Inference Benchmark Test Guide.	Based on the Kunpeng processor, this document describes the entire process of deploying the search and recommendation ranking models of ModelZoo on the openEuler 22.03 LTS SP3 OS for inference performance tests. The process includes setting up the test environment for the server and client and testing the performance in the inference phase.	Search and Recommendation Ranking Model Inference Benchmark Test Guide
4	Added the Kunpeng BoostKit for SRA Technical White Paper.	This document describes the solution, architecture, specifications, application scenarios, and typical configurations of Kunpeng BoostKit for SRA.	Kunpeng BoostKit for SRA Technical White Paper

December 2024

No.	Update	Description	Document
1	Added the Kunpeng recall algorithm library.	Added KBest, a proprietary, efficient graph search algorithm. Added KScaNN, a proprietary algorithm with Kunpeng affinity, which optimizes the index layout and algorithm process based on the Kunpeng architecture.	Kunpeng Recall Algorithm Library Feature Documentation
2	Added SRA_Inference, the Kunpeng inference acceleration kit.	Added the Kunpeng Inference Acceleration Kit Developer Guide and its release notes, which provide the SRA_Inference version mapping, installation guide, API description, and sample code to help you get started quickly.	Kunpeng Inference Acceleration Kit Feature Documentation
3	Added the Kunpeng inference AI operator library.	Added support for deep neural network operators on the Kunpeng platform. Added support for the random_choice and softmax operators on the Kunpeng platform.	Kunpeng Artificial Intelligence Library Feature Documentation
4	Added the TensorFlow Serving Inference Deployment Test Guide.	Based on the Kunpeng processor, this document describes how to port the inference deployment framework TensorFlow Serving to openEuler 22.03 LTS SP3 and deploy test models for stress tests.	TensorFlow Serving Inference Deployment Test Guide

June 2024

No.	Update	Description	Document
1	Added a porting guide.	Added a porting guide for the deep learning recommendation model (DLRM). The guide describes how to train, run, and verify DLRM based on the Kunpeng processor.	DLRM Porting Guide

March 2024

No.	Update	Description	Document
1	Added the porting guides for open-source components.	Added the porting guides for open-source components: oneDNN, PyTorch, TensorFlow, and ScaNN. The guides describe how to install, configure, and verify those components based on the Kunpeng processor.	oneDNN Porting Guide, PyTorch Porting Guide, TensorFlow Porting Guide, ScaNN Porting Guide