Solution Architecture

Figure 1 shows where the Kunpeng BoostKit for SRA components sit in the solution. Each component has its own acceleration solution and modification policy. In general, the system architecture of the original base software is left unchanged, although the extent of modification depends on actual requirements. Table 1 describes each component.

Figure 1 Architecture of Kunpeng BoostKit for SRA
Table 1 Kunpeng BoostKit for SRA components

Algorithm Type

Component

Description

Recall algorithm

KScaNN

Kunpeng Scalable Nearest Neighbors (KScaNN) is a vector retrieval algorithm built on an inverted file (IVF) index. It is optimized for the Kunpeng architecture at three levels (index layout, algorithm flow, and computation) to make fuller use of the processor's capabilities.

KBest

Kunpeng Blazing-fast embedding similarity search thruster (KBest) is an efficient, Huawei-developed graph search algorithm for approximate nearest neighbor search over multi-dimensional vectors. It improves the performance and precision of the search by using methods such as quantization and NUMA scheduling.

KVecTurbo

Kunpeng Vector Turbo (KVecTurbo) is a vector retrieval acceleration component developed by Kunpeng that can be used together with the openGauss vector database. It quantizes and compresses high-dimensional vectors to quickly obtain the nearest neighbors of a query. In addition, KVecTurbo uses SIMD instructions to accelerate distance calculation in multi-dimensional vector nearest neighbor search.

KRL

Kunpeng Retrieval Library (KRL) is an operator library optimized for the Kunpeng platform to accelerate vector retrieval. KRL can accelerate Faiss-supported algorithms such as HNSW, PQFS, IVFPQ, and IVFPQFS by replacing operators.

KNewPfordelta

Kunpeng New PForDelta (KNewPfordelta) is an efficient IVF decompression algorithm. It accelerates the retrieval stage by leveraging vector instructions and other optimizations.

hnswlib

The open source hnswlib algorithm library has been deeply optimized for the Kunpeng platform. It adds vectorized FP16 support and applies optimizations such as prefetching and instruction rescheduling.

Faiss

The open source Faiss algorithm library has been deeply optimized using key technologies such as vectorization, dimension-interleaved lookup and accumulation, and vector filtering and compression. These enhancements significantly improve the similarity search and clustering efficiency across IVFFlat, IVFPQ, HNSW, PQFS, and IVFPQFS indexing algorithms.

Ranking-focused AI inference operator library

KDNN

KDNN is a deep neural network (DNN) operator library that optimizes the performance of AI operators based on the microarchitecture features of the Kunpeng processor and software optimization methods. It is integrated into the open source oneDNN software as an operator library plugin.

KDNN_EXT

KDNN_EXT is the extension library of KDNN. It optimizes operators such as softmax and random_choice and exposes them through a Python interface for users to call.

KTFOP

Kunpeng TensorFlow Operator (KTFOP) is an efficient, Huawei-developed TensorFlow operator library. It uses single instruction, multiple data (SIMD) instructions and multi-core scheduling to accelerate operator processing on CPUs and reduce CPU resource usage, thereby increasing the end-to-end throughput of online inference.

ANNC

Accelerated Neural Network Compiler (ANNC) is a compiler that TensorFlow uses to perform graph-level optimizations, improving inference performance in recommendation systems. Its optimization techniques include computational graph optimization and the generation and integration of high-performance fused operators.
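
Several recall components in Table 1 (KScaNN, and the IVFFlat and IVFPQ indexes in Faiss) build on an inverted file (IVF) index. The following is a minimal pure-Python sketch of the general IVF idea only: vectors are grouped under their nearest centroid, and a query scans just the lists of its closest centroids. It is not the KScaNN or Faiss implementation. The function names `build_ivf` and `search_ivf`, the sample data, and the hand-picked centroids are assumptions made for illustration; production libraries typically learn centroids by clustering and add quantization and SIMD-accelerated distance computation.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, centroids):
    """Assign each vector ID to the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda c: squared_l2(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists


def search_ivf(query, vectors, centroids, lists, k=1, nprobe=1):
    """Return the IDs of the k closest vectors found in the nprobe nearest lists."""
    # Coarse step: rank centroids by distance to the query and keep nprobe of them.
    probed = sorted(range(len(centroids)),
                    key=lambda c: squared_l2(query, centroids[c]))[:nprobe]
    # Fine step: scan only the vectors stored in the probed lists.
    candidates = [vid for c in probed for vid in lists[c]]
    candidates.sort(key=lambda vid: squared_l2(query, vectors[vid]))
    return candidates[:k]


# Hypothetical toy data: five 2-D vectors and three hand-picked centroids.
vectors = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (9.0, 0.5)]
centroids = [(0.0, 0.0), (5.0, 5.0), (9.0, 0.0)]

lists = build_ivf(vectors, centroids)
result = search_ivf((5.1, 5.0), vectors, centroids, lists, k=2, nprobe=1)
print(result)  # [2, 3]
```

With `nprobe=1` the query touches only two of the five vectors. Raising `nprobe` scans more lists, which generally trades speed for recall.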