BoostSRA

BoostSRA aims to deliver high-performance application-layer acceleration for search, recommendation, and advertising (SRA) services. It features advanced retrieval algorithms for recall scenarios and optimized model inference frameworks to enhance ranking performance.

Getting Started

What's new
Lists the latest updates to the BoostSRA documentation.

Kunpeng Recall Algorithm Library

A recall algorithm library optimized for the Kunpeng platform. It is tuned at a low level for the instruction set architecture and memory access mechanism of the Kunpeng processor, improving the computing efficiency and throughput of recall algorithms. It is especially suitable for high-concurrency recall scenarios.

KBest
A proprietary, efficient graph-based search algorithm. Its search capability is benchmarked against Faiss HNSW.
KScaNN
An inverted index-based vector retrieval algorithm. Its index layout, algorithmic logic, and computing process are deeply optimized for the Kunpeng architecture to fully unlock the chip's potential.
KVecTurbo
A proprietary vector retrieval acceleration component. It quantizes and compresses high-dimensional vectors to quickly obtain the near neighbors of a query. In addition, KVecTurbo uses SIMD instructions to accelerate distance calculation for multidimensional Nearest Neighbor Search (NNS). It can work with the openGauss vector database.
KNewPfordelta
An integer compression algorithm based on the open-source PForDelta algorithm and optimized for Kunpeng. It is designed for efficient compression and fast decompression of inverted indexes.
Kunpeng hnswlib
The open-source Hierarchical Navigable Small World library (hnswlib), deeply optimized for the Kunpeng Arm platform and extended with FP16 support.
Kunpeng Faiss
Built on the open-source Faiss library. The IVFFlat, IVFPQ, HNSW, PQFS, and IVFPQFS indexes are optimized through vectorization, dimension-interleaved table lookups, and vector filtering compression to accelerate similarity search and clustering. In addition, FP16 support has been added for the HNSW index type.
Kunpeng RaBitQ
The open-source RaBitQ codebase is intrusively modified to support the AArch64 architecture, with performance optimizations and functional enhancements. These include FP16 precision optimization, NEON SIMD vectorization, assembly-level Lookup Table (LUT) acceleration, Spilling with Orthogonality-Amplified Residuals (SOAR) spilled vector assignment, and ML-based adaptive nprobe.
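
The PForDelta-style compression mentioned for KNewPfordelta above can be illustrated with a minimal sketch. It is not the KNewPfordelta implementation or its API. The function names `pfor_encode` and `pfor_decode` are hypothetical, and a Python integer stands in for the packed bit buffer. The sketch delta-encodes a sorted posting list, keeps the low `b` bits of each gap in a fixed-width slot, and stores the overflow bits of the few large gaps as exceptions.

```python
from typing import List, Tuple

# Hypothetical names for illustration only; not part of any BoostSRA API.
Encoded = Tuple[int, int, List[Tuple[int, int]]]


def pfor_encode(doc_ids: List[int], b: int) -> Encoded:
    """Encode a sorted list of non-negative doc IDs using b-bit slots.

    Returns (count, packed_bits, exceptions), where each exception is
    (position, high_bits) for a gap that does not fit in b bits.
    """
    if not doc_ids:
        return 0, 0, []
    # Delta step: store gaps between consecutive IDs instead of raw IDs.
    deltas = [doc_ids[0]] + [cur - prev for prev, cur in zip(doc_ids, doc_ids[1:])]
    mask = (1 << b) - 1
    packed = 0
    exceptions: List[Tuple[int, int]] = []
    for i, gap in enumerate(deltas):
        # Every gap contributes its low b bits to a fixed-width slot.
        packed |= (gap & mask) << (i * b)
        if gap > mask:
            # Patch step: keep the overflow bits outside the slot array.
            exceptions.append((i, gap >> b))
    return len(deltas), packed, exceptions


def pfor_decode(count: int, b: int, packed: int,
                exceptions: List[Tuple[int, int]]) -> List[int]:
    """Invert pfor_encode and return the original sorted doc IDs."""
    mask = (1 << b) - 1
    deltas = [(packed >> (i * b)) & mask for i in range(count)]
    for pos, high in exceptions:
        deltas[pos] |= high << b
    doc_ids: List[int] = []
    running = 0
    for gap in deltas:
        running += gap  # prefix sum restores absolute IDs
        doc_ids.append(running)
    return doc_ids


if __name__ == "__main__":
    postings = [3, 7, 8, 15, 300, 302]
    count, packed, exceptions = pfor_encode(postings, b=3)
    assert pfor_decode(count, 3, packed, exceptions) == postings
```

Because most gaps in a dense posting list are small, a narrow slot width covers the majority of entries and only the occasional large gap pays for an exception entry. That property is what makes this family of schemes attractive for inverted indexes.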

Kunpeng Retrieval Library

Kunpeng Retrieval Library (KRL)
An operator library optimized for the Kunpeng platform to accelerate vector retrieval. KRL can accelerate Faiss-supported algorithms such as HNSW, PQFS, IVFPQ, and IVFPQFS by replacing operators.

Kunpeng Inference Acceleration Kit

SRA_Inference
An inference acceleration kit optimized for the Kunpeng platform. It includes the Kunpeng TensorFlow operator library (KTFOP) and the Kunpeng ONNX Runtime operator library (KONNX).

TensorFlow Inference Optimization

TensorFlow inference optimization
A high-performance inference acceleration extension based on open-source TensorFlow. It focuses on efficient execution in SRA inference scenarios. Through in-depth enhancements to graph optimization, operators, and the runtime, it improves model inference throughput and reduces latency for AI applications running on Kunpeng CPUs.