
Kunpeng BoostKit for SRA

Kunpeng BoostKit for Search, Recommendation, and Advertising (SRA) provides a full-stack acceleration solution for Internet services based on the Kunpeng platform. It covers the core search algorithms for recall scenarios, and the full TensorFlow software stack and core AI operator library for model inference in ranking scenarios.

Ecosystem adaptation

  • Recall scenarios: The open source ScaNN search algorithm is adapted to Kunpeng 920 series processors, and a porting guide is provided
  • Ranking inference scenarios: The open source TensorFlow, PyTorch, and oneDNN software is adapted to Kunpeng 920 series processors, and porting guides are provided


Scenario-based acceleration

  • KBest: The recall image search algorithm optimizes query perception and image re-ranking, and integrates with hardware optimizations such as vectorized instructions, assembly, and prefetch
  • ScaNN: The recall vector retrieval algorithm optimizes dynamic library inlining, low-bit quantization, retrieval operators, and vectorized instructions
  • KVecturbo: This acceleration component of the recall vector retrieval algorithm library quantizes and compresses high-dimensional vectors, and speeds up distance calculation using SIMD instructions
  • KTFOP: Guidance documents are provided for deploying TensorFlow inference models on Kunpeng 920 series processors
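
Low-bit quantization, mentioned in the ScaNN and KVecturbo items above, can be illustrated with a minimal sketch. This is not the KVecturbo or ScaNN implementation: it assumes simple uniform scalar quantization to signed 8-bit codes, and the helper names (quantize, dequantize, squared_l2) are hypothetical. It only shows why compressed vectors can still yield a usable distance estimate.

```python
def quantize(vec, bits=8):
    """Uniform scalar quantization: map floats to signed integer codes."""
    qmax = (1 << (bits - 1)) - 1              # 127 for 8-bit codes
    max_abs = max(abs(x) for x in vec)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [round(x / scale) for x in vec]   # one small integer per dimension
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical sample data, for illustration only.
query = [0.12, -0.58, 0.33, 0.91]
stored = [0.10, -0.60, 0.30, 0.95]

codes, scale = quantize(stored)
approx = dequantize(codes, scale)

exact_dist = squared_l2(query, stored)    # distance on the original floats
approx_dist = squared_l2(query, approx)   # distance on the compressed vector
print(codes, exact_dist, approx_dist)
```

In a production library the distance would typically be computed directly on the integer codes with SIMD instructions rather than after dequantization; the sketch keeps the two steps separate for readability.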


Basic acceleration

  • KAIL: The artificial intelligence library optimizes core operators such as Matmul and Conv by using Kunpeng vectorized instructions, prefetch, and more. Integrated with oneDNN to provide complete capabilities, it improves performance by 10%
  • KML: The math library provides math APIs such as GEMM and GEMV, covering basic math libraries such as BLAS, LIBM, and FFT
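
For readers unfamiliar with the BLAS terms above, the following sketch spells out the standard GEMM and GEMV semantics in plain Python. It is a reference definition only, not KML's API: the function names and the list-of-lists layout are assumptions made for illustration, and an optimized library computes the same result with vectorized kernels.

```python
def gemm(alpha, A, B, beta, C):
    """General matrix-matrix multiply: return alpha * (A @ B) + beta * C."""
    m, k, n = len(A), len(B), len(B[0])
    assert all(len(row) == k for row in A), "inner dimensions must match"
    assert len(C) == m and all(len(row) == n for row in C), "C must be m x n"
    return [
        [alpha * sum(A[i][p] * B[p][j] for p in range(k)) + beta * C[i][j]
         for j in range(n)]
        for i in range(m)
    ]


def gemv(alpha, A, x, beta, y):
    """General matrix-vector multiply: return alpha * (A @ x) + beta * y."""
    assert all(len(row) == len(x) for row in A) and len(A) == len(y)
    return [alpha * sum(a * xi for a, xi in zip(row, x)) + beta * yi
            for row, yi in zip(A, y)]


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[1.0, 1.0], [1.0, 1.0]]

gemm_result = gemm(2.0, A, B, 0.5, C)
gemv_result = gemv(1.0, A, [1.0, 1.0], 0.0, [0.0, 0.0])
print(gemm_result, gemv_result)
```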


Base software

  • OS: openEuler
  • Compiler: GCC with dynamic library inlining, automatic vectorization, and more


Application Scenarios
Search
Helps users find target information by returning results ranked by relevance to the entered keywords
Recommendation
Recommends personalized content or offerings based on users' historical behaviors and interest profiles
Advertising
Delivers search and recommendation ads to audiences through multiple channels and formats, enabling targeted advertising and traffic monetization

Acceleration Features

Kunpeng Recall Algorithm Library
SRA_Recall is a recall algorithm library provided by Huawei and optimized based on the Kunpeng platform.
The Kunpeng Blazing-fast embedding similarity search thruster (KBest) is an efficient, Huawei-developed image search algorithm. For approximate nearest neighbor (ANN) search over multi-dimensional vectors, KBest uses methods such as quantization and vector instructions to optimize search performance and precision.
Benefits
Provides search capability benchmarked against Faiss HNSW.
Key Technologies
Vectorized instructions and quantization technologies of Kunpeng hardware.
Application Scope
Network search, multi-modal search, recommendation system, and retrieval-augmented generation (RAG).
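
When an approximate nearest neighbor search is compared against a baseline, precision is commonly reported as recall@k: the fraction of the true k nearest neighbors that the approximate search returns. The sketch below shows that metric using an exact brute-force search as ground truth. It does not call KBest or Faiss; the function names, the sample vectors, and the approximate result list are all hypothetical.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_top_k(query, vectors, k):
    """Brute-force ground truth: indices of the k closest vectors."""
    order = sorted(range(len(vectors)),
                   key=lambda i: squared_l2(query, vectors[i]))
    return order[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of true nearest neighbors found by the approximate search."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Hypothetical data set and query.
vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
query = [0.1, 0.1]

truth = exact_top_k(query, vectors, k=3)
approx_ids = [0, 1, 3]   # stand-in for the output of an ANN index
recall = recall_at_k(approx_ids, truth)
print(truth, recall)
```

A benchmark would normally average this value over many queries and report it alongside queries per second, since the two trade off against each other.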

Open Source Enablement

Category
Software | Version | Operating System | Source Package | Porting Guide
oneDNN | 3.3.3 | openEuler 22.03 LTS SP3 | v3.3.3.tar.gz | oneDNN Porting Guide
PyTorch | 2.1.2 | openEuler 22.03 LTS SP3 | v2.1.2.tar.gz | PyTorch Porting Guide
TensorFlow | 1.15.5, 2.13.0 | openEuler 22.03 LTS SP3 | v1.15.5.zip, v2.13.0.zip | TensorFlow Porting Guide
TensorFlow Serving | 2.15.0 | openEuler 22.03 LTS SP3 | 2.15.0-rc0.tar.gz | TensorFlow Serving Inference Deployment Framework Porting Guide
TVM | 0.9.0 | openEuler 22.03 LTS SP3 | apache-tvm-src-v0.9.0.tar.gz | TVM Porting Guide

Performance Evaluation

Category
Component | Version | Operating System | Source Package | Test Guide
Inference Performance Benchmark Testing for Search and Inference Models | 1.0.0 | openEuler 22.03 LTS SP3 | - | Test Guide