Kunpeng BoostKit for SRA-Kunpeng Community

Kunpeng BoostKit for Search, Recommendation, and Advertising (SRA) provides a full-stack acceleration solution for Internet services based on the Kunpeng platform. It covers the core search algorithms in recall scenarios, and the full-stack software and core AI operator library of TensorFlow for model inference in ranking scenarios.

Ecosystem adaptation

Recall scenarios: The open source ScaNN search algorithm is adapted to Kunpeng 920 series processors, and a porting guide is provided
Ranking inference scenarios: The open source TensorFlow, PyTorch, and oneDNN software is adapted to Kunpeng 920 series processors, and porting guides are provided

Scenario-based acceleration

KBest: The recall image search algorithm optimizes query perception and image re-ranking, and integrates hardware optimizations such as vectorized instructions, assembly, and prefetching
ScaNN: The recall vector retrieval algorithm is optimized through dynamic library inlining, low-bit quantization, retrieval operators, and vectorized instructions
KVecturbo: This acceleration component of the recall vector retrieval algorithm library quantizes and compresses high-dimensional vectors, and speeds up distance calculation using SIMD instructions
KTFOP: Guidance documents are provided for deploying TensorFlow inference models on Kunpeng 920 series processors
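
As a rough illustration of the quantize-then-compare idea mentioned for KVecturbo and ScaNN above, the following sketch applies symmetric 8-bit scalar quantization with one shared scale and computes an approximate squared L2 distance on the integer codes. It is not the actual KVecturbo or ScaNN scheme; the function names (fit_scale, quantize, approx_sq_l2) are hypothetical, and plain Python stands in for the integer inner loop that SIMD instructions would accelerate.

```python
from typing import List, Sequence


def fit_scale(vectors: Sequence[Sequence[float]]) -> float:
    """Pick one shared scale so the largest magnitude maps to 127."""
    max_abs = max((abs(x) for vec in vectors for x in vec), default=0.0)
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize(vec: Sequence[float], scale: float) -> List[int]:
    """Map each float to an integer code clamped to [-127, 127]."""
    return [max(-127, min(127, round(x / scale))) for x in vec]


def approx_sq_l2(a_codes: Sequence[int], b_codes: Sequence[int], scale: float) -> float:
    """Approximate squared L2 distance from integer codes.

    The sum runs on integers only; the shared scale is applied once at the end.
    """
    int_sum = sum((a - b) ** 2 for a, b in zip(a_codes, b_codes))
    return scale * scale * int_sum


vectors = [[1.0, -2.0, 0.5], [0.0, 2.0, -1.0]]
scale = fit_scale(vectors)
codes = [quantize(v, scale) for v in vectors]
approx = approx_sq_l2(codes[0], codes[1], scale)  # exact value is 19.25
```

Each component is stored in one byte instead of a 4-byte float, at the cost of a small distance error bounded by the quantization step.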

Basic acceleration

KAIL: The artificial intelligence library optimizes core operators such as Matmul and Conv by using Kunpeng vectorized instructions, prefetching, and other techniques. Integrated with oneDNN to provide complete capabilities, it improves performance by 10%
KML: The math library provides math APIs such as GEMM and GEMV, covering basic math libraries such as BLAS, LIBM, and FFT
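
For readers unfamiliar with the BLAS routines named above, the sketch below spells out what GEMV and GEMM compute, following the standard BLAS definitions (y = alpha*A*x + beta*y and C = alpha*A*B + beta*C). It is a plain-Python reference for the semantics only; it does not call KML, and the function names gemv and gemm are illustrative rather than the KML API.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def gemv(alpha: float, A: Matrix, x: Vector, beta: float, y: Vector) -> Vector:
    """Return alpha * (A @ x) + beta * y for a row-major matrix A."""
    return [
        alpha * sum(a * b for a, b in zip(row, x)) + beta * yi
        for row, yi in zip(A, y)
    ]


def gemm(alpha: float, A: Matrix, B: Matrix, beta: float, C: Matrix) -> Matrix:
    """Return alpha * (A @ B) + beta * C for row-major matrices."""
    inner, cols = len(B), len(B[0])
    return [
        [
            alpha * sum(A[i][k] * B[k][j] for k in range(inner)) + beta * C[i][j]
            for j in range(cols)
        ]
        for i in range(len(A))
    ]


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[1.0, 1.0], [1.0, 1.0]]
gemm_result = gemm(2.0, A, B, 0.5, C)
gemv_result = gemv(1.0, A, [1.0, 1.0], 0.0, [0.0, 0.0])
```

An optimized library computes the same results but blocks the loops for cache reuse and uses vector instructions for the inner products.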

Base software

OS: openEuler
Compiler: GCC with dynamic library inlining, automatic vectorization, and other optimizations

Application Scenarios

Search

Helps users find target information based on its relevance to the keywords they enter

Recommendation

Recommends personalized content or offerings based on users' historical behaviors and interest profiles

Advertising

As a business communication tool, advertising delivers search and recommendation ads to audiences through multiple channels and formats, enabling targeted advertising and traffic monetization

Acceleration Features

Kunpeng Recall Algorithm Library

SRA_Recall is a recall algorithm library provided by Huawei and optimized for the Kunpeng platform.

The Kunpeng Blazing-fast embedding similarity search thruster (KBest) is an efficient, Huawei-developed image search algorithm. For approximate nearest neighbor searches over multi-dimensional vectors, KBest uses methods such as quantization and vector instructions to optimize search performance and precision.

Benefits

Provides search capability benchmarked against Faiss HNSW.
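
One common way to compare an approximate nearest neighbor search against a baseline is recall@k: the fraction of the true k nearest neighbors that the approximate search returns. The sketch below computes it against an exact brute-force search. It is an illustration of the metric only, not the benchmark procedure used for KBest or Faiss; the function names and the sample data are hypothetical.

```python
from typing import List, Sequence


def exact_top_k(query: Sequence[float], dataset: Sequence[Sequence[float]], k: int) -> List[int]:
    """Brute-force search: indices of the k vectors closest to query (squared L2)."""
    scored = [
        (sum((q - d) ** 2 for q, d in zip(query, vec)), idx)
        for idx, vec in enumerate(dataset)
    ]
    scored.sort()
    return [idx for _, idx in scored[:k]]


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int]) -> float:
    """Fraction of the exact neighbors that also appear in the approximate result."""
    if not exact_ids:
        return 0.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


dataset = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
truth = exact_top_k([0.1, 0.0], dataset, k=2)
# Pretend an approximate index returned ids 0 and 2 for the same query.
recall = recall_at_k([0, 2], truth)
```

Recall is usually reported together with query throughput, since an index can trade one for the other.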

Key Technologies

Vectorized instructions and quantization technologies of Kunpeng hardware.

Application Scope

Network search, multi-modal search, recommendation system, and retrieval-augmented generation (RAG).

Open Source Enablement

Software	Version	Operating System	Source Package	Porting Guide
oneDNN	3.3.3	openEuler 22.03 LTS SP3	v3.3.3.tar.gz	oneDNN Porting Guide
PyTorch	2.1.2	openEuler 22.03 LTS SP3	v2.1.2.tar.gz	PyTorch Porting Guide
TensorFlow	1.15.5, 2.13.0	openEuler 22.03 LTS SP3	v1.15.5.zip, v2.13.0.zip	TensorFlow Porting Guide
TensorFlow Serving	2.15.0	openEuler 22.03 LTS SP3	2.15.0-rc0.tar.gz	TensorFlow Serving Inference Deployment Framework Porting Guide
TVM	0.9.0	openEuler 22.03 LTS SP3	apache-tvm-src-v0.9.0.tar.gz	TVM Porting Guide
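
After following a porting guide, it can be useful to confirm that the installed Python packages match the versions in the table above. The sketch below does this with the standard library's importlib.metadata. The distribution names "torch" and "tensorflow" are assumptions about how the packages are registered in the local environment (a source build may use a different name), and the helper names are hypothetical.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional, Tuple

# Versions taken from the table above; distribution names are assumptions.
EXPECTED: Dict[str, Tuple[str, ...]] = {
    "torch": ("2.1.2",),
    "tensorflow": ("1.15.5", "2.13.0"),
}


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Look up each distribution's installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def compare(found: Dict[str, Optional[str]],
            expected: Dict[str, Tuple[str, ...]] = EXPECTED) -> Dict[str, str]:
    """Report 'ok', 'not installed', or 'mismatch (<version>)' per package."""
    report: Dict[str, str] = {}
    for name, versions in expected.items():
        installed = found.get(name)
        if installed is None:
            report[name] = "not installed"
            continue
        # Locally built wheels may carry a local suffix such as "+cpu".
        base = installed.split("+")[0]
        report[name] = "ok" if base in versions else f"mismatch ({installed})"
    return report


if __name__ == "__main__":
    print(compare(installed_versions(EXPECTED)))
```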

Performance Evaluation

Component	Version	Operating System	Source Package	Test Guide
Inference Performance Benchmark Testing for Search and Inference Models	1.0.0	openEuler 22.03 LTS SP3	-	Test Guide

Support and Help

Kunpeng BoostKit for SRA Documentation

Find everything you need to know about the Kunpeng BoostKit for SRA