Recall Scenario

Scenario Architecture

Figure 1 Context connection in the Kunpeng recall scenario

Vector Retrieval Algorithm KScaNN

KScaNN is an inverted index-based vector retrieval algorithm that optimizes the index layout, algorithm logic, and computing process to make full use of the chip's capabilities. It is developed on the open-source ScaNN algorithm and adds five types of Kunpeng optimization: query awareness, the PQ distance algorithm, the rearrangement algorithm, efficient compiler usage for built-in instructions and NEON instructions, and system optimization (including thread quantity, prefetch, and batch size). KScaNN is integrated into the Milvus vector database.
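
The PQ distance step mentioned above can be illustrated with a minimal sketch. It is not KScaNN code: the codebooks, vectors, and function names are made up for illustration, and a real implementation would train the codebooks and use NEON table lookups. The sketch only shows the general idea of product quantization, where each sub-vector is replaced by a centroid index and a per-query lookup table turns every distance into a few table reads.

```python
# Illustrative product-quantization (PQ) distance sketch; names and data are hypothetical.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode(vec, codebooks):
    """Replace each sub-vector with the index of its nearest centroid."""
    sub_dim = len(vec) // len(codebooks)
    code = []
    for m, centroids in enumerate(codebooks):
        sub = vec[m * sub_dim:(m + 1) * sub_dim]
        code.append(min(range(len(centroids)),
                        key=lambda j: sq_dist(sub, centroids[j])))
    return code

def build_lut(query, codebooks):
    """Per-query table: lut[m][j] is the distance from query sub-vector m to centroid j."""
    sub_dim = len(query) // len(codebooks)
    return [
        [sq_dist(query[m * sub_dim:(m + 1) * sub_dim], c) for c in centroids]
        for m, centroids in enumerate(codebooks)
    ]

def adc_distance(code, lut):
    """Approximate distance: one table lookup per sub-space, then a sum."""
    return sum(lut[m][j] for m, j in enumerate(code))

# Two sub-spaces of dimension 2 with four centroids each (toy values, not trained).
codebooks = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)],
]
database = [(0.1, 0.9, 1.0, 0.1), (0.9, 0.1, 0.0, 0.0), (1.0, 1.0, 1.0, 1.0)]
codes = [encode(v, codebooks) for v in database]

query = (0.0, 1.0, 1.0, 0.0)
lut = build_lut(query, codebooks)
approx = [adc_distance(c, lut) for c in codes]
best = min(range(len(codes)), key=lambda i: approx[i])
```

The lookup table is built once per query, so scanning the codes needs only table reads and additions. This is the part that SIMD table-lookup instructions typically accelerate.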

Vector Retrieval Algorithm KBest

KBest is a high-performance nearest neighbor search algorithm developed on Navigating Spreading-out Graph (NSG). It is used for approximate nearest neighbor search (ANNS) over multidimensional vectors. Compared to the NSG algorithm, KBest further optimizes search performance and precision. Before a search, KBest tunes the prefetch parameters to obtain the optimal values for the current graph structure. During the search, KBest starts from the entry point pre-stored in the graph index, quickly approaches the query point, identifies the query point's nearest neighbors, and accelerates distance calculation using SIMD instructions. When the search is complete, KBest performs full-precision or half-precision rearrangement to improve the ranking precision and returns the k nearest neighbors. Compared to the open-source ANNS algorithm, KBest achieves a significant performance improvement. Figure 2 shows the search principle of the KBest algorithm. KBest is integrated into the Milvus vector database.

Figure 2 KBest algorithm search principle

Kunpeng provides five types of optimization: high-performance upper-layer algorithms, integration of multiple quantizers, SIMD instructions adapted to each platform for best performance, full-precision or half-precision rearrangement at the end of the search to improve the ranking precision, and optimal prefetch parameters for the graph structure, obtained through prefetch parameter optimization before the search.
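
The search flow described for KBest (start at a stored entry point, walk the graph toward the query, then rearrange the candidates at higher precision) can be sketched as follows. This is a generic best-first graph search under stated assumptions, not the KBest implementation. The graph, the vectors, the parameter name ef, and the use of rounded coordinates as a stand-in for quantized vectors are all hypothetical, and prefetch tuning and SIMD are not shown.

```python
import heapq

def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def graph_search(query, vectors, graph, entry, ef):
    """Best-first search: expand the closest unexpanded candidate until no
    candidate can improve the current ef-sized result set."""
    d0 = sq_dist(query, vectors[entry])
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap of nodes still to expand
    results = [(-d0, entry)]     # max-heap (negated distances) of the best nodes so far
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(results) >= ef and d > -results[0][0]:
            break                # closest candidate is worse than the worst result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = sq_dist(query, vectors[nb])
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(results, (-dn, nb))
                if len(results) > ef:
                    heapq.heappop(results)   # drop the current worst result
    return [node for _, node in results]

# Full-precision vectors and a coarse copy (rounded coordinates stand in for quantization).
full = [(0.0, 0.0), (1.1, 0.2), (2.0, -0.1), (2.9, 0.3), (4.1, 0.0), (5.2, 0.1)]
coarse = [tuple(float(round(x)) for x in v) for v in full]
graph = {0: [1, 2], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

query = (4.4, 0.0)
shortlist = graph_search(query, coarse, graph, entry=0, ef=3)
# Rearrangement: re-score the shortlist with full-precision vectors and keep k = 2.
top_k = sorted(shortlist, key=lambda i: sq_dist(query, full[i]))[:2]
```

Traversal touches only the coarse vectors, and the more expensive full-precision distance is computed only for the shortlist.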

Vector Retrieval Acceleration Component KVecTurbo

KVecTurbo is a vector retrieval acceleration component developed by Kunpeng that can be used together with the openGauss vector database. It quantizes and compresses high-dimensional vectors to quickly obtain the nearest neighbors of a query. In addition, KVecTurbo uses SIMD instructions to accelerate distance calculation for multidimensional vector nearest neighbor search.

Kunpeng provides three types of optimization: vectorized instructions with Kunpeng affinity, path compression, and low-bit quantization.
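
As a rough illustration of low-bit quantization, the sketch below maps each component of a vector to a 4-bit level and packs two levels per byte. The 4-bit width, the per-vector min/max scaling, and the function names are assumptions chosen for the example. This document does not specify KVecTurbo's actual quantization scheme.

```python
def quantize_4bit(vec):
    """Map each component to one of 16 levels and pack two levels per byte."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 15 or 1.0            # fall back to 1.0 for constant vectors
    levels = [min(15, round((x - lo) / scale)) for x in vec]
    if len(levels) % 2:
        levels.append(0)                     # pad to an even number of levels
    packed = bytes((levels[i] << 4) | levels[i + 1]
                   for i in range(0, len(levels), 2))
    return packed, lo, scale

def dequantize_4bit(packed, lo, scale, dim):
    """Unpack the 4-bit levels and map them back to approximate values."""
    levels = []
    for byte in packed:
        levels.extend((byte >> 4, byte & 0x0F))
    return [lo + q * scale for q in levels[:dim]]

vec = [0.0, 1.2, 3.7, 7.4, 15.0]
packed, lo, scale = quantize_4bit(vec)
decoded = dequantize_4bit(packed, lo, scale, len(vec))
max_error = max(abs(a - b) for a, b in zip(vec, decoded))
```

Five FP32 components occupy 20 bytes, while the packed form needs 3 bytes plus the two scaling values. Every reconstructed component stays within half a quantization step of the original.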

Kunpeng Retrieval Library KRL

KRL is an operator library optimized for the Kunpeng platform to accelerate vector retrieval. By replacing operators, KRL can accelerate Faiss-supported algorithms such as HNSW, PQFS, IVFPQ, and IVFPQFS.

Kunpeng provides the following optimizations: vectorized instruction acceleration, memory layout adjustment to improve cache hit rate, and low-precision quantization combined with high-precision reranking.

KNewPfordelta Library

KNewPfordelta is a vectorized decompression algorithm that optimizes inverted index processing for superior search performance.

Enhancements tailored for the Kunpeng platform include vectorized instructions and other technologies based on Kunpeng hardware.
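
For readers unfamiliar with PForDelta-style compression of inverted lists, the following sketch shows the general scheme: document IDs are stored as gaps, the low b bits of each gap go into fixed-width slots, and the few oversized gaps are recorded as exceptions. It is a simplified scalar illustration with hypothetical function names and data layout, not the KNewPfordelta format or its vectorized decoder.

```python
def encode_block(doc_ids, b):
    """Delta-encode sorted IDs; keep the low b bits of each gap in a packed
    integer and record the overflow bits of oversized gaps as exceptions."""
    mask = (1 << b) - 1
    packed, exceptions, prev = 0, [], 0
    for i, doc_id in enumerate(doc_ids):
        gap = doc_id - prev
        prev = doc_id
        packed |= (gap & mask) << (i * b)
        if gap > mask:
            exceptions.append((i, gap >> b))   # (slot position, high bits)
    return packed, exceptions

def decode_block(packed, exceptions, b, count):
    """Rebuild the sorted IDs from the packed slots and the exception list."""
    mask = (1 << b) - 1
    # Step 1: unpack every slot with the same shift and mask (the data-parallel part).
    gaps = [(packed >> (i * b)) & mask for i in range(count)]
    # Step 2: patch the few slots whose gap did not fit in b bits.
    for pos, high in exceptions:
        gaps[pos] |= high << b
    # Step 3: a prefix sum turns the gaps back into document IDs.
    doc_ids, total = [], 0
    for gap in gaps:
        total += gap
        doc_ids.append(total)
    return doc_ids

postings = [3, 7, 8, 15, 300, 302, 310, 311]
packed, exceptions = encode_block(postings, b=4)
restored = decode_block(packed, exceptions, b=4, count=len(postings))
```

Because step 1 applies an identical operation to every slot, it is the natural target for vectorized instructions. The exception patching and prefix sum are handled afterward.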

Faiss Library

The open-source Faiss algorithm library has been deeply optimized using key technologies such as vectorization, dimension-interleaved lookup and accumulation, and vector filtering and compression. In addition, FP16 interface support has been added for the HNSW algorithm. These enhancements significantly improve similarity search and clustering efficiency across the IVFFlat, IVFPQ, HNSW, PQFS, and IVFPQFS indexing algorithms.

Kunpeng provides the following optimizations: vectorized instruction acceleration, memory layout adjustment to improve cache hit rate, and low-precision quantization combined with high-precision reranking.

hnswlib

The open-source hnswlib has been deeply optimized for the Kunpeng platform. It delivers efficient FP16 support through vectorization, and leverages optimization policies such as prefetching and instruction rescheduling.

Enhancements tailored for the Kunpeng platform include vectorized instructions and other technologies based on Kunpeng hardware.
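
The effect of FP16 storage can be seen with a small sketch that uses only the Python standard library. It does not use hnswlib or any Kunpeng interface, and the helper names are hypothetical. It shows that half-precision storage halves the memory of an FP32 vector at the cost of a small rounding error.

```python
import struct

def to_fp16_bytes(vec):
    """Serialize a vector as little-endian IEEE 754 half-precision values."""
    return struct.pack(f"<{len(vec)}e", *vec)

def from_fp16_bytes(buf):
    """Read half-precision values back into Python floats."""
    return list(struct.unpack(f"<{len(buf) // 2}e", buf))

vec = [0.5, -1.25, 3.0, 0.1]
buf16 = to_fp16_bytes(vec)
buf32 = struct.pack(f"<{len(vec)}f", *vec)   # FP32 serialization for comparison
restored = from_fp16_bytes(buf16)
rounding_error = max(abs(a - b) for a, b in zip(vec, restored))
```

Values such as 0.5 and 3.0 are exactly representable in FP16, while 0.1 is rounded slightly. A vectorized implementation would typically convert or compute on many such values per instruction.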

Embeddinglookup Library

The core Embedding Lookup module of the open-source Monolith large-scale real-time recommendation system has been deeply adapted and optimized.

Enhancements tailored for the Kunpeng platform include compilation option tuning, spinlock tuning, memory alignment tuning, and Arm SIMD vectorization.
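
As a reminder of what an embedding lookup does, the sketch below gathers the rows for a list of sparse feature IDs and sum-pools them into one dense vector. The table contents, the choice of sum pooling, and the treatment of unknown IDs are assumptions made for illustration. Monolith's actual module is not described in this document.

```python
def embedding_lookup(table, ids, dim):
    """Sum-pool the embedding rows for the given IDs; unknown IDs are skipped."""
    pooled = [0.0] * dim
    for feature_id in ids:
        row = table.get(feature_id)
        if row is None:
            continue                  # ID not present in the table
        for d in range(dim):
            pooled[d] += row[d]       # element-wise accumulation, the part SIMD can speed up
    return pooled

# Hypothetical table mapping sparse feature IDs to 2-dimensional embeddings.
table = {101: [1.0, 2.0], 205: [0.5, 0.5], 309: [-1.0, 4.0]}
pooled = embedding_lookup(table, [101, 309, 999], dim=2)
```

The inner accumulation loop applies the same addition across all dimensions, which is why memory alignment and SIMD vectorization are relevant to this kind of workload.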

RaBitQ Library

The open-source RaBitQ algorithm library has been extended to the Arm64 (AArch64) architecture, introducing multiple algorithmic optimizations.

Enhancements tailored for the Kunpeng platform include FP16 precision optimization, LUT acceleration, SOAR vector allocation, and ML-based adaptive nprobe.
