Introduction
SRA_Inference is an inference acceleration kit provided by Huawei and optimized for the Kunpeng platform. This document provides the installation guide, interface definitions, and sample code for SRA_Inference to help you get started quickly.
SRA_Inference Overview
Table 1 describes the composition of SRA_Inference.
| Component | Description | Application Scenario |
|---|---|---|
| KTFOP | Kunpeng TensorFlow Operator (KTFOP) is an efficient, Huawei-developed TensorFlow operator library. It uses single instruction multiple data (SIMD) instructions and multi-core scheduling to accelerate operator processing on CPUs and reduce CPU computing resource usage, thereby increasing the overall end-to-end throughput of online inference. | Inference computing tasks on TensorFlow |
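The speedup that SIMD-style vectorized execution delivers over scalar, element-by-element processing can be illustrated with a generic sketch. This example uses NumPy rather than KTFOP itself and only demonstrates the principle of vectorized operator execution; it is not KTFOP code:

```python
import numpy as np

def add_loop(a, b):
    # Scalar path: one element per iteration, no SIMD utilization
    out = np.empty_like(a)
    for i in range(a.size):
        out[i] = a[i] + b[i]
    return out

def add_vectorized(a, b):
    # Vectorized path: NumPy dispatches to compiled kernels that
    # process multiple elements per instruction (SIMD)
    return a + b

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

# Both paths compute the same result; the vectorized path is far faster
assert np.allclose(add_loop(a, b), add_vectorized(a, b))
```

KTFOP applies the same idea inside TensorFlow operators, using NEON or SVE instructions on Kunpeng CPUs instead of a generic compiled kernel.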
SRA_Inference is available only for the following Kunpeng series processors:
- Kunpeng 920 7260 (128 cores), supporting NEON instructions (128-bit width)
- New Kunpeng 920 processor model, supporting NEON instructions (128-bit width) and Scalable Vector Extension (SVE) instructions (256-bit width)
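Whether a given host exposes NEON or SVE can be checked by parsing `/proc/cpuinfo` (on AArch64 Linux, NEON is reported as the `asimd` feature flag). The following sketch is a generic helper, not part of SRA_Inference, and the sample `Features` line is illustrative:

```python
def detect_simd_features(cpuinfo_text):
    """Return the SIMD extensions ('NEON', 'SVE') listed in /proc/cpuinfo text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        if line.lower().startswith("features"):
            tokens = line.split(":", 1)[1].split()
            # On AArch64, the NEON unit is advertised as "asimd"
            if "asimd" in tokens or "neon" in tokens:
                features.add("NEON")
            if "sve" in tokens:
                features.add("SVE")
    return features

# Illustrative sample line (not actual Kunpeng output):
sample = "Features : fp asimd evtstrm aes sve"
print(sorted(detect_simd_features(sample)))  # -> ['NEON', 'SVE']
```

On a real host, pass `open("/proc/cpuinfo").read()` instead of the sample string.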
Application Scenarios
SRA_Inference is suited for the following scenarios:
- Recommendation: online inference in recommendation systems
- Advertising: advertisement placement