Introduction

This document provides the installation guide, interface definitions, and sample code of SRA_Recall to help you quickly get started with it.

SRA_Recall Overview

SRA_Recall is a recall algorithm library provided by Huawei and optimized for the Kunpeng computing platform. It is tuned at a low level for the instruction set architecture and memory access mechanisms of the Kunpeng processor, which improves the computing efficiency and throughput of recall algorithms. It is especially suitable for high-concurrency recall scenarios.

Table 1 describes the composition of SRA_Recall.

Table 1 SRA_Recall composition

Algorithm: KBest
Description: Kunpeng Blazing-fast embedding similarity search thruster (KBest) is an efficient graph-based search algorithm developed by Huawei for approximate nearest neighbor search over multi-dimensional vectors. It improves search performance and precision through techniques such as quantization and NUMA scheduling.
Application scenario: Vector retrieval in fields such as network search, multi-modal search, recommendation systems, advertisement placement, and retrieval-augmented generation (RAG).

Algorithm: KScaNN
Description: Kunpeng Scalable Nearest Neighbors (KScaNN) is a vector retrieval algorithm based on inverted indexes. Its index layout, algorithm flow, and computation are deeply optimized for the Kunpeng architecture to make full use of the chip's capabilities.

Algorithm: KVecTurbo
Description: KVecTurbo is a vector retrieval acceleration component developed by Kunpeng that can be used together with the openGauss vector database. It quantizes and compresses high-dimensional vectors to quickly find the nearest neighbors of a query, and it uses SIMD instructions to accelerate distance calculation in multi-dimensional nearest neighbor search.

Algorithm: KNewPfordelta
Description: KNewPfordelta is an integer compression algorithm optimized by Kunpeng based on the open source PForDelta algorithm. It is designed for efficient compression and fast decompression of inverted indexes. KNewPfordelta combines block-based processing, exception handling, and SIMD acceleration to balance storage cost against query performance in inverted index compression. It is widely used in search engines, recommendation systems, and other scenarios that require fast processing of large-scale integer sequences, such as document ID lists, term frequencies, and term positions.

Algorithm: hnswlib
Description: hnswlib is an efficient retrieval algorithm library based on Hierarchical Navigable Small World (HNSW) graphs. Huawei has optimized the open source hnswlib library for the Arm architecture. The optimized version adds vectorized FP16 support and applies optimization strategies such as prefetching and instruction rescheduling.

Algorithm: Faiss
Description: The open source Faiss algorithm library has been deeply optimized using key technologies such as vectorization, dimension-interleaved lookup and accumulation, and vector filtering and compression. These enhancements improve similarity search and clustering efficiency for the IVFFlat, IVFPQ, HNSW, PQFS, and IVFPQFS indexing algorithms.
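
To make the block-based compression with exceptions mentioned for KNewPfordelta more concrete, the following is a minimal Python sketch of the general PForDelta idea. It is not the KNewPfordelta implementation and does not use SIMD. The function names, the 90% coverage threshold, and the choice to keep the low bits in the packed slot and the overflow bits in a side list are all assumptions made for illustration. The input is assumed to be one block of non-negative document IDs sorted in ascending order.

```python
def delta_encode(sorted_ids):
    """Turn ascending IDs into gaps; the first gap is measured from 0."""
    gaps, prev = [], 0
    for value in sorted_ids:
        gaps.append(value - prev)
        prev = value
    return gaps


def choose_bit_width(gaps, coverage_percent=90):
    """Smallest width b such that about coverage_percent of gaps fit in b bits."""
    if not gaps:
        return 1
    ordered = sorted(gaps)
    idx = (len(ordered) * coverage_percent + 99) // 100 - 1
    return max(1, ordered[idx].bit_length())


def encode_block(sorted_ids, coverage_percent=90):
    """Pack every gap into b bits; gaps that overflow become exceptions."""
    gaps = delta_encode(sorted_ids)
    b = choose_bit_width(gaps, coverage_percent)
    mask = (1 << b) - 1
    packed, exceptions = 0, []
    for i, gap in enumerate(gaps):
        packed |= (gap & mask) << (i * b)      # low b bits go into slot i
        if gap > mask:
            exceptions.append((i, gap >> b))   # overflow bits stored separately
    return {"count": len(gaps), "bits": b, "packed": packed,
            "exceptions": exceptions}


def decode_block(block):
    """Unpack the slots, patch the exceptions, then undo the delta coding."""
    b = block["bits"]
    mask = (1 << b) - 1
    gaps = [(block["packed"] >> (i * b)) & mask for i in range(block["count"])]
    for pos, high in block["exceptions"]:
        gaps[pos] |= high << b
    ids, total = [], 0
    for gap in gaps:
        total += gap
        ids.append(total)
    return ids


# Hypothetical posting list: one large gap (20 -> 1000) becomes an exception.
doc_ids = [3, 7, 8, 15, 16, 20, 1000, 1003, 1005, 1010]
block = encode_block(doc_ids)
assert decode_block(block) == doc_ids
```

In this sketch most gaps fit in a narrow fixed width, so only the rare large gap pays for extra storage. A production implementation would pack fixed-size blocks into machine words rather than a single Python integer.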

Elasticsearch Overview

Elasticsearch is a distributed search engine that features high scalability, reliability, and ease of management. It is built on Apache Lucene and supports full-text retrieval, structured retrieval, and analytics, and it can combine all three within a single query. Elasticsearch is widely used in log management, real-time data analysis, and full-text retrieval.

Elasticsearch has the following features:

  • Distributed architecture: It supports horizontal scaling and can handle massive volumes of data.
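  • Near real-time search: Data typically becomes searchable within about one second of being written, because new documents become visible at the next index refresh (1 second by default).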
  • High availability: Data replication and failover mechanisms keep the system resilient and accessible.
  • Flexible query language: A wide range of search and aggregation operations are supported.
  • RESTful APIs: Easy-to-use RESTful APIs are provided, supporting multiple programming languages.
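
As an illustration of combining full-text retrieval, structured retrieval, and analytics in a single query, the following Python sketch builds a request body for the Elasticsearch `_search` endpoint using the standard query DSL (`bool`, `match`, `range`, and a `terms` aggregation). The field names `message`, `@timestamp`, and `host.keyword` and the helper name `build_search_body` are hypothetical. The sketch only constructs and prints the JSON body and does not contact a cluster.

```python
import json


def build_search_body(text, min_timestamp, bucket_field, size=10):
    """Return a request body for the Elasticsearch `_search` endpoint."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Full-text retrieval: scored match on an analyzed text field
                # (the field name `message` is an assumption).
                "must": [{"match": {"message": text}}],
                # Structured retrieval: unscored range filter on a date field
                # (the field name `@timestamp` is an assumption).
                "filter": [{"range": {"@timestamp": {"gte": min_timestamp}}}],
            }
        },
        # Analytics: count the matching documents per value of a keyword field.
        "aggs": {"by_bucket": {"terms": {"field": bucket_field}}},
    }


# Hypothetical example: recent log lines about timeouts, grouped by host.
body = build_search_body("connection timeout", "now-1h", "host.keyword")
print(json.dumps(body, indent=2))
```

A body like this would typically be sent as a POST request to the `_search` endpoint of an index, either through an HTTP client or through one of the official language clients.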

Application Scenarios

SRA_Recall is suited for the following scenarios:

  • Search: network search and multi-modal search
  • Recommendation: recommendation systems
  • Advertising: advertisement placements