Ranking Inference Scenario

Scenario Architecture

Figure 1 Context interconnection in the Kunpeng ranking inference scenario

Kunpeng AI Library KDNN

Based on the microarchitecture features of the Kunpeng processor, KDNN improves the performance of core DNN operators through vectorization, hand-tuned assembly, and algorithmic optimization. It can also be integrated into the open-source oneDNN library as a plugin to provide a complete set of operator capabilities.
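To illustrate the kind of gain vectorization offers (this is a conceptual NumPy sketch, not KDNN's actual implementation), compare a scalar elementwise loop with its vectorized equivalent, which the runtime maps onto SIMD instructions:

```python
import numpy as np

def relu_scalar(x):
    # Naive form: one scalar comparison per element, no SIMD.
    out = np.empty_like(x)
    flat_in, flat_out = x.ravel(), out.ravel()
    for i in range(flat_in.size):
        flat_out[i] = flat_in[i] if flat_in[i] > 0 else 0.0
    return out

def relu_vectorized(x):
    # Vectorized form: processed in wide SIMD lanes under the hood.
    return np.maximum(x, 0.0)

x = np.random.randn(256, 256).astype(np.float32)
assert np.allclose(relu_scalar(x), relu_vectorized(x))
```

The vectorized form computes the same result while touching memory in contiguous chunks, which is the pattern operator libraries like KDNN exploit on the target microarchitecture.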

Kunpeng AI Library KDNN_EXT

KDNN_EXT is the extension library of KDNN. It optimizes operators such as softmax and random_choice and encapsulates them in a Python interface library for specific AI scenarios.
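For reference, the two operators named above behave as follows; this is a plain NumPy sketch of their semantics, not the KDNN_EXT implementation or its actual Python interface:

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax: subtract the row max before exp
    # so large logits do not overflow.
    shifted = logits - np.max(logits, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def random_choice(items, probs, size, rng=None):
    # Weighted sampling of candidates, e.g. picking items from a
    # ranked list according to their softmax probabilities.
    rng = np.random.default_rng() if rng is None else rng
    return rng.choice(items, size=size, p=probs)

scores = np.array([2.0, 1.0, 0.5, -1.0])
p = softmax(scores)
picks = random_choice(np.arange(4), p, size=10, rng=np.random.default_rng(0))
```

In a ranking pipeline these two steps sit on the hot path of every request, which is why an optimized native implementation of them pays off.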

Kunpeng Inference Acceleration Kit KTFOP

KTFOP is an efficient, Huawei-developed TensorFlow operator library. It uses SIMD instructions and multi-core scheduling to accelerate operator processing on the CPU and reduce CPU resource usage, thereby increasing the overall end-to-end throughput of online inference.
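The multi-core scheduling idea can be sketched with Python's standard thread pool (a conceptual stand-in only; KTFOP itself schedules native TensorFlow operators, not Python functions):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def inference_op(batch):
    # Stand-in for a compute-heavy operator in the inference graph.
    return np.tanh(batch @ batch.T)

def run_multicore(batches, workers=4):
    # Dispatch independent request batches across CPU cores. NumPy
    # releases the GIL inside its BLAS calls, so the threads can
    # actually run in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(inference_op, batches))

batches = [np.ones((64, 64), dtype=np.float32) for _ in range(8)]
results = run_multicore(batches)
```

Keeping all cores busy on independent batches is what raises end-to-end throughput rather than just single-operator latency.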

Kunpeng AI Compiler ANNC

TensorFlow leverages the Accelerated Neural Network Compiler (ANNC) to perform graph-level optimizations, enhancing inference performance in recommendation systems. ANNC provides optimization techniques including computational graph optimization and the generation and integration of high-performance fused operators.
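Operator fusion, one of the optimizations mentioned above, can be illustrated with a small NumPy sketch (a conceptual example under my own naming, not ANNC's generated code): three separate kernels each materialize an intermediate tensor, while the fused version reuses a single buffer, and a compiler would further merge the loops into one pass over memory.

```python
import numpy as np

def unfused(x, w, b):
    # Three separate kernels: each step allocates an intermediate tensor.
    t1 = x * w
    t2 = t1 + b
    return np.maximum(t2, 0.0)

def fused(x, w, b):
    # Fused-style kernel: one output buffer, no intermediate
    # allocations (a real compiler would also merge the loops).
    out = np.empty_like(x)
    np.multiply(x, w, out=out)
    np.add(out, b, out=out)
    np.maximum(out, 0.0, out=out)
    return out

x = np.random.randn(1024).astype(np.float32)
w = np.random.randn(1024).astype(np.float32)
b = np.random.randn(1024).astype(np.float32)
assert np.allclose(unfused(x, w, b), fused(x, w, b))
```

Avoiding intermediate tensors reduces memory traffic, which is typically the bottleneck for the elementwise operators common in recommendation models.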