Recall Scenario

Scenario Architecture

Figure 1 Context connection in the Kunpeng recall scenario

Vector Retrieval Algorithm KScaNN

KScaNN is a vector retrieval algorithm built on the inverted file (IVF) index. It tailors the index layout, algorithm flow, and computation to the Kunpeng architecture to make full use of the chip. Building on the open source ScaNN algorithm, Kunpeng provides five types of optimization: query awareness, the PQ distance algorithm, the reranking algorithm, an efficient compiler for built-in and NEON instructions, and system-level tuning (thread count, prefetching, and batch size). KScaNN is integrated into the Milvus vector database.
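To make the IVF idea concrete, the following is a minimal sketch of a generic IVF search in plain Python. It is not the KScaNN implementation: the function names, the `nprobe` parameter, and the hand-picked centroids are assumptions made for illustration, and the sketch uses exact distances instead of PQ.

```python
def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_ivf(vectors, centroids):
    """Assign every vector id to the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists

def ivf_search(query, vectors, centroids, lists, k=2, nprobe=1):
    """Scan only the nprobe closest inverted lists and return the top-k ids."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    candidates = [vid for c in probe for vid in lists[c]]
    return sorted(candidates, key=lambda vid: l2(query, vectors[vid]))[:k]

# Toy data; the centroids are assumed to come from a prior k-means run.
vectors = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (0.3, 0.1)]
centroids = [(0.0, 0.0), (5.0, 5.0)]
lists = build_ivf(vectors, centroids)
result = ivf_search((0.2, 0.1), vectors, centroids, lists, k=2, nprobe=1)
print(lists, result)
```

Only the vectors in the probed lists are compared with the query, which is where an IVF index saves work compared with a full scan. Raising `nprobe` trades speed for recall.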

Vector Retrieval Algorithm KBest

KBest is a high-performance nearest neighbor search algorithm developed on Navigating Spreading-out Graph (NSG). It is used for approximate nearest neighbor search over multidimensional vectors. Compared to the NSG algorithm, KBest further improves both the performance and the precision of the search. Before the search, KBest runs prefetch parameter optimization to obtain the optimal prefetch parameters for the current graph structure. During the search, KBest starts from the entry point pre-stored in the graph index, quickly approaches the query point, identifies its nearest neighbors, and accelerates distance calculation using SIMD instructions. When the search is complete, KBest performs full-precision or half-precision reranking to improve the ranking precision and returns the k nearest neighbors. Compared to the open source ANNS algorithm, KBest achieves a significant performance improvement. Figure 2 shows the search principle of the KBest algorithm. KBest is integrated into the Milvus vector database.

Figure 2 KBest algorithm search principle

Kunpeng provides five types of optimization: high-performance upper-layer algorithms, integration of multiple quantizers, SIMD instructions adapted to each platform for best performance, full-precision or half-precision reranking at the end of the search to improve the ranking precision, and prefetch parameter optimization before the search to obtain the optimal prefetch parameters for the graph structure.
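The search phase described for KBest (start at a stored entry point, move through the graph toward the query) can be sketched as a generic best-first graph search. This is an illustrative sketch, not KBest code: the `graph_search` function, the `pool_size` parameter, and the toy graph are assumptions. Because the sketch uses exact distances throughout, it has no separate reranking stage.

```python
import heapq

def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def graph_search(query, vectors, graph, entry, k=2, pool_size=4):
    """Best-first search: expand the closest unexpanded node until the pool stops improving."""
    visited = {entry}
    candidates = [(l2(query, vectors[entry]), entry)]  # min-heap of nodes to expand
    pool = list(candidates)                            # best nodes found so far
    while candidates:
        dist, node = heapq.heappop(candidates)
        if len(pool) >= pool_size and dist > max(pool)[0]:
            break  # the closest unexpanded node is worse than everything kept
        for nb in graph[node]:
            if nb not in visited:
                visited.add(nb)
                item = (l2(query, vectors[nb]), nb)
                heapq.heappush(candidates, item)
                pool.append(item)
        pool = sorted(pool)[:pool_size]
    return [node for _, node in sorted(pool)[:k]]

# Toy index: five points on a line, each linked to nearby points.
vectors = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
graph = {0: [1, 2], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
result = graph_search((3.2, 0.0), vectors, graph, entry=0, k=2, pool_size=3)
print(result)
```

A larger `pool_size` explores more of the graph and usually improves recall at the cost of more distance computations.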

Vector Retrieval Acceleration Component KVecTurbo

KVecTurbo is a vector retrieval acceleration component developed by Kunpeng that can be used together with the openGauss vector database. It quantizes and compresses high-dimensional vectors to quickly obtain the nearest neighbors of a query. In addition, KVecTurbo uses SIMD instructions to accelerate distance calculation for multidimensional vector nearest neighbor search.

Kunpeng provides three types of optimization: vectorized instructions with Kunpeng affinity, path compression, and low-bit quantization.
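As a rough illustration of what low-bit quantization means, the sketch below applies uniform 4-bit scalar quantization and packs two codes per byte. The source does not describe KVecTurbo's actual quantizer, so the scheme, the value range, and the function names here are all assumptions.

```python
def quantize(vec, lo, hi, bits=4):
    """Map each float in [lo, hi] to an integer code in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels
    return [min(levels, max(0, round((x - lo) / scale))) for x in vec]

def dequantize(codes, lo, hi, bits=4):
    """Map integer codes back to approximate float values."""
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels
    return [lo + c * scale for c in codes]

def pack4(codes):
    """Pack pairs of 4-bit codes into single bytes (pads odd lengths with 0)."""
    padded = codes + [0] * (len(codes) % 2)
    return bytes((padded[i] << 4) | padded[i + 1] for i in range(0, len(padded), 2))

vec = [-1.0, -0.5, 0.1, 0.5, 1.0]
codes = quantize(vec, lo=-1.0, hi=1.0, bits=4)
packed = pack4(codes)
approx = dequantize(codes, lo=-1.0, hi=1.0, bits=4)
print(codes, packed.hex(), approx)
```

Five 32-bit floats (20 bytes) shrink to 3 bytes here, and each reconstructed value differs from the original by at most half a quantization step.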

Kunpeng Retrieval Library KRL

KRL is an operator library optimized for the Kunpeng platform to accelerate vector retrieval. KRL can accelerate Faiss-supported algorithms such as HNSW, PQFS, IVFPQ, and IVFPQFS by replacing their operators.

Kunpeng provides the following optimizations: vectorized instruction acceleration, memory layout adjustment to improve cache hit rate, and low-precision quantization + high-precision reranking.
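The combination of low-precision quantization and high-precision reranking can be sketched as a two-stage search: integer codes select a shortlist cheaply, and the original floats decide the final order. This is a generic sketch under assumed names (`to_int8`, `two_stage_search`, `shortlist`, `scale`), not KRL's operators.

```python
def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def to_int8(vec, scale):
    """Symmetric quantization: round(x / scale), clamped to the int8 range."""
    return [max(-127, min(127, round(x / scale))) for x in vec]

def two_stage_search(query, vectors, k=1, shortlist=3, scale=0.5):
    codes = [to_int8(v, scale) for v in vectors]  # low-precision copies
    q_code = to_int8(query, scale)
    # Stage 1: cheap integer distances select a shortlist of candidates.
    coarse = sorted(range(len(vectors)), key=lambda i: l2(q_code, codes[i]))[:shortlist]
    # Stage 2: rerank only the shortlist with the original float vectors.
    return sorted(coarse, key=lambda i: l2(query, vectors[i]))[:k]

vectors = [(1.2, 1.2), (0.8, 0.8), (3.0, 3.0), (-2.0, -2.0)]
query = (0.9, 0.9)
with_rerank = two_stage_search(query, vectors, k=1, shortlist=3)
coarse_only = two_stage_search(query, vectors, k=1, shortlist=1)
print(with_rerank, coarse_only)
```

With this deliberately coarse `scale`, vectors 0 and 1 collapse to the same integer code, so a shortlist of one cannot tell them apart. A shortlist of three lets the full-precision pass pick the true nearest neighbor.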

KNewPfordelta Library

Kunpeng New PForDelta (KNewPfordelta) is an efficient decompression algorithm for inverted file (IVF) indexes. It accelerates the retrieval stage by leveraging vector instructions and other optimizations.

Kunpeng provides the following optimization: vectorized instructions and other techniques based on Kunpeng hardware.
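For readers unfamiliar with PForDelta-style coding, the sketch below shows a simplified variant in the spirit of NewPFD: sorted IDs are stored as deltas in fixed-width slots, and the high bits of any delta that does not fit are kept in a separate exception list. The layout and the function names are assumptions for illustration and do not reflect the KNewPfordelta on-disk format or its vectorized decoding.

```python
from itertools import accumulate

def pfor_encode(doc_ids, bits):
    """Delta-encode sorted ids into fixed-width slots plus an exception list."""
    deltas = [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]
    mask = (1 << bits) - 1
    packed, exceptions = 0, []
    for pos, d in enumerate(deltas):
        if d > mask:
            exceptions.append((pos, d >> bits))  # overflow (high) bits kept aside
        packed |= (d & mask) << (pos * bits)     # low bits always go in the slot
    return packed, exceptions, len(deltas)

def pfor_decode(packed, exceptions, count, bits):
    """Unpack the slots, patch the exceptions, then prefix-sum the deltas."""
    mask = (1 << bits) - 1
    deltas = [(packed >> (pos * bits)) & mask for pos in range(count)]
    for pos, high in exceptions:
        deltas[pos] |= high << bits
    return list(accumulate(deltas))

doc_ids = [3, 7, 8, 12, 301, 303, 306]
packed, exceptions, count = pfor_encode(doc_ids, bits=3)
decoded = pfor_decode(packed, exceptions, count, bits=3)
print(exceptions, decoded)
```

Most deltas fit in 3 bits, so only the single large gap needs an exception entry. The unpack loop in the decoder is regular and branch-free, which is the kind of loop that vector instructions can speed up.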

Faiss Library

The open source Faiss algorithm library has been deeply optimized using key technologies such as vectorization, dimension-interleaved lookup and accumulation, and vector filtering and compression. These enhancements significantly improve the similarity search and clustering efficiency across IVFFlat, IVFPQ, HNSW, PQFS, and IVFPQFS indexing algorithms.

Kunpeng provides the following optimizations: vectorized instruction acceleration, memory layout adjustment to improve cache hit rate, and low-precision quantization + high-precision reranking.
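The lookup-and-accumulate step used by PQ-based indexes such as IVFPQ can be sketched as follows. This is generic product quantization with asymmetric distance computation in plain Python, not the Faiss API and not the dimension-interleaved layout mentioned above; the codebooks and function names are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(vec, m):
    """Split a vector into m equal-length sub-vectors."""
    step = len(vec) // m
    return [vec[i * step:(i + 1) * step] for i in range(m)]

def pq_encode(vec, codebooks):
    """Replace each sub-vector with the index of its nearest codeword."""
    return [min(range(len(cb)), key=lambda j: sq_dist(sub, cb[j]))
            for sub, cb in zip(split(vec, len(codebooks)), codebooks)]

def build_table(query, codebooks):
    """table[s][j] = squared distance from query sub-vector s to codeword j."""
    return [[sq_dist(sub, word) for word in cb]
            for sub, cb in zip(split(query, len(codebooks)), codebooks)]

def adc_distance(code, table):
    """Approximate distance: one table lookup per subspace, then accumulate."""
    return sum(table[s][j] for s, j in enumerate(code))

# Two subspaces with two hand-picked codewords each (normally learned by k-means).
codebooks = [[(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (2.0, 2.0)]]
code_a = pq_encode((0.9, 1.1, 0.1, -0.1), codebooks)
code_b = pq_encode((0.1, 0.0, 1.9, 2.2), codebooks)
table = build_table((1.0, 1.0, 0.0, 0.0), codebooks)
print(code_a, adc_distance(code_a, table), code_b, adc_distance(code_b, table))
```

The table is built once per query. After that, each database vector costs only one lookup and one addition per subspace.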

hnswlib Library

The open source hnswlib algorithm library has been deeply optimized for the Kunpeng platform. It delivers FP16 support through vectorization and leverages optimization techniques such as prefetching and instruction rescheduling.

Enhancements tailored for the Kunpeng platform: vectorized instructions and other technologies based on Kunpeng hardware.
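To show what FP16 storage trades off, the sketch below packs a vector as IEEE 754 half-precision values using the Python standard library and compares it with FP32 storage. It only illustrates the data format; it is not hnswlib code, and the helper names are assumptions.

```python
import struct

def to_fp16_bytes(vec):
    """Store a vector as IEEE 754 half-precision values (2 bytes per dimension)."""
    return struct.pack(f"<{len(vec)}e", *vec)

def from_fp16_bytes(buf):
    """Read half-precision values back into Python floats."""
    return list(struct.unpack(f"<{len(buf) // 2}e", buf))

vec = [0.1, -1.5, 3.25, 100.0]
fp16 = to_fp16_bytes(vec)
fp32 = struct.pack(f"<{len(vec)}f", *vec)
restored = from_fp16_bytes(fp16)
print(len(fp16), len(fp32), restored)
```

FP16 halves the memory footprint of each vector, so more vectors fit in cache. Values that are not exactly representable, such as 0.1, pick up a small rounding error.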