
Introduction

SRA_Inference is an inference acceleration kit provided by Huawei and optimized for the Kunpeng platform. This document provides the installation guide, interface definitions, and sample code of SRA_Inference to help you get started quickly.

SRA_Inference Overview

Table 1 describes the composition of SRA_Inference.

Table 1 SRA_Inference composition

Component: KTFOP
Description: Kunpeng TensorFlow Operator (KTFOP) is an efficient TensorFlow operator library developed by Huawei. It uses single instruction, multiple data (SIMD) instructions and multi-core scheduling to accelerate operator processing on CPUs and reduce the usage of CPU computing resources, thereby increasing the overall end-to-end throughput of online inference.
Application Scenario: Inference computing tasks on TensorFlow
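The idea behind SIMD acceleration can be illustrated at a high level with NumPy, whose element-wise operations dispatch to vectorized kernels. This is a minimal sketch of the scalar-vs-vectorized pattern only, not KTFOP itself; the function names below are illustrative.

```python
import numpy as np

def add_scalar(a, b):
    """Element-wise add, one element per step (analogous to scalar CPU code)."""
    out = np.empty_like(a)
    for i in range(a.size):
        out[i] = a[i] + b[i]
    return out

def add_vectorized(a, b):
    """Element-wise add via NumPy, which uses SIMD kernels where available."""
    return np.add(a, b)

a = np.arange(8, dtype=np.float32)
b = np.ones(8, dtype=np.float32)
# Both paths compute the same result; the vectorized path processes
# multiple elements per instruction instead of one at a time.
assert np.array_equal(add_scalar(a, b), add_vectorized(a, b))
```

KTFOP applies the same principle inside compiled TensorFlow operators, using NEON or SVE instructions on Kunpeng CPUs rather than a Python-level library.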

SRA_Inference is available only on the following Kunpeng series processors:

  • Kunpeng 920 7260 (128 cores), supporting NEON instructions (128-bit width)
  • New Kunpeng 920 processor model, supporting NEON instructions (128-bit width) and Scalable Vector Extension (SVE) instructions (256-bit width)
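On an AArch64 Linux host (such as a Kunpeng 920 server), you can check which of these vector extensions the CPU advertises by inspecting /proc/cpuinfo: the "Features" line lists "asimd" for NEON and "sve" for the Scalable Vector Extension. This check is a general Linux technique, not an SRA_Inference command.

```shell
# Print the CPU feature line ("Features" on AArch64; x86 hosts use "flags").
grep -m1 -i -E '^(Features|flags)' /proc/cpuinfo

# Report whether the SVE extension is advertised by this CPU.
if grep -qw sve /proc/cpuinfo; then
  echo "SVE supported"
else
  echo "SVE not reported"
fi
```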

Application Scenarios

SRA_Inference is suitable for the following scenarios:

  • Recommendation: recommendation systems
  • Advertising: advertisement placements