Kunpeng BoostKit for SRA

Getting Started

  • What's new

    Lists the latest updates to the Kunpeng BoostKit for SRA documentation.

  • Technical white paper

    Describes the solution architecture, advantages, and key features of Kunpeng BoostKit for SRA.

  • List of Fixed Vulnerabilities

    Lists the vulnerabilities fixed in the open-source and third-party software included in the Kunpeng BoostKit software packages.

Acceleration

  • Kunpeng Recall Algorithm Library

    This library is optimized at the low level for the instruction set architecture and memory access mechanisms of the Kunpeng processor, improving the computing efficiency and throughput of recall algorithms. It is especially suitable for high-concurrency recall scenarios.

  • Kunpeng Inference Acceleration Kit

    The Kunpeng Inference Acceleration Kit includes the Kunpeng TensorFlow operator library (KTFOP) and the Kunpeng ONNX Runtime operator library (KONNX).

  • Kunpeng AI Library

    The Kunpeng Artificial Intelligence Library (KAIL) is a high-performance AI operator library optimized for the Kunpeng platform. It includes a deep neural network operator library (KDNN) and an extension operator library (KDNN_EXT).

  • Kunpeng Retrieval Library

    This library accelerates vector retrieval on the Kunpeng platform. It is optimized at the low level for the instruction set architecture and memory access mechanisms of the Kunpeng processor. By combining low-precision quantization with high-precision reranking, the library significantly improves the computing efficiency and throughput of recall algorithms without compromising accuracy. These optimizations make it suitable for high-concurrency recall scenarios.

  • TensorFlow Serving Thread Scheduling Optimization

    Kunpeng BoostKit provides a thread scheduling optimization solution that improves TensorFlow Serving inference performance.

  • TensorFlow Serving ANNC Feature

    An extended acceleration suite built on the open-source OpenXLA project and hosted in the ANNC repository maintained by the openEuler community. The suite includes optimizations tailored for the Kunpeng platform, such as TensorFlow graph fusion, Accelerated Linear Algebra (XLA) graph fusion, and operator optimization.
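
The Kunpeng Retrieval Library entry above mentions combining low-precision quantization with high-precision reranking. The general idea can be illustrated with a minimal, library-independent sketch: score all vectors cheaply using quantized integers, keep a shortlist, then rerank only the shortlist with the original full-precision vectors. This is not the library's API or implementation; the function names (`quantize`, `search`), the symmetric int8-style quantization, and the inner-product metric are all assumptions chosen for illustration.

```python
def quantize(vec, scale):
    """Map floats to integers in [-127, 127] (an assumed int8-style scheme)."""
    return [max(-127, min(127, round(x / scale))) for x in vec]


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def search(query, base, k=3, shortlist=10):
    """Return indices of the top-k base vectors by inner product.

    Stage 1 ranks every vector with low-precision integer scores.
    Stage 2 reranks only the shortlist with full-precision scores.
    """
    # Hypothetical choice: one global scale so the largest magnitude maps to 127.
    scale = max(abs(x) for v in base + [query] for x in v) / 127 or 1.0
    q_base = [quantize(v, scale) for v in base]  # would be precomputed in practice
    q_query = quantize(query, scale)

    # Stage 1: coarse ranking on quantized vectors.
    coarse = sorted(
        range(len(base)),
        key=lambda i: dot(q_query, q_base[i]),
        reverse=True,
    )[:shortlist]

    # Stage 2: exact reranking of the shortlist only.
    return sorted(coarse, key=lambda i: dot(query, base[i]), reverse=True)[:k]


if __name__ == "__main__":
    base = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.6], [-0.5, 0.3]]
    print(search([1.0, 0.5], base, k=2, shortlist=3))
```

The shortlist size trades speed for recall in this sketch: a larger shortlist means more full-precision work but a lower chance that quantization error drops a true top-k result before reranking.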

Open Source Enablement

  • oneDNN

    Guide for porting the oneDNN deep neural network library.

  • PyTorch

    Guide for porting the PyTorch open-source deep learning framework.

  • TensorFlow

    Guide for porting the TensorFlow deep learning framework.

  • TensorFlow Serving

    Guide for porting TensorFlow Serving, a high-performance system for serving machine learning models.

  • ScaNN

    Guide for porting ScaNN, an open-source vector similarity search library.

  • DLRM

    Guide for porting the DLRM deep learning recommendation model.

  • TVM

    Guide for porting TVM, an open-source deep learning compiler stack.

  • ONNX Runtime

    Guide for porting ONNX Runtime, a high-performance cross-platform engine for accelerating model inference in the ONNX format.

Performance Evaluation

Historical Versions