API Reference

KDNN Operators

Operator Range

Table 1 Operators available in KDNN lists the operators available in KDNN.

Table 1 Operators available in KDNN

Operator | Description
Eltwise | Element-wise operator
Layer Normalization | Layer normalization operator
Inner Product | Matrix inner product operator
Softmax | Softmax normalization operator
Sum | Sum operator
Matmul | Matrix multiplication operator
Convolution | Convolution operator
Deconvolution | Deconvolution operator
Concat | Concatenation operator
Resampling | Data resampling operator
Shuffle | Data shuffling operator
Reorder | Data reordering operator
Pool | Pooling operator
Batch Normalization (bnormal) | Batch normalization operator
Local Response Normalization (lrn) | Local response normalization operator
Reduction | Reduction operator
PReLU | Activation operator (Leaky ReLU) with a trainable alpha parameter
Binary | Binary primitive computation tensor operator
RNN | Recurrent neural network operator
Group Normalization | Group normalization operator
SparseGemm | Sparse matrix multiplication operator

Operator Description

Eltwise

Function Description

Function

Performs operations of the same type on each element in a tensor, including abs, exp, and log.

Formula

Table 1 Operation types describes the operations supported by the Eltwise operator.

Table 1 Operation types

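Operation | Propagation Direction | Formula
abs | Forward | [formula image]
exp | Forward | [formula image]
log | Forward | [formula image]
sqrt | Forward | [formula image]
round | Forward | [formula image]
tanh | Forward | [formula image]
relu | Forward | [formula image]
relu | Backward | [formula image]
logistic | Forward | [formula image]
logistic | Backward | [formula image]
linear | Forward | [formula image]
linear | Backward | [formula image]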

Table 2 Formula parameters describes the meanings of the symbols in the preceding formulas.

Table 2 Formula parameters

Parameter | Description
s | Element in the src tensor
d | Element in the dst tensor
ds | Element in the diff_src tensor
dd | Element in the diff_dst tensor
α, β | Constant floating-point input parameters
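
The following minimal sketch shows how the symbols s, d, ds, dd, α, and β could fit together for a few of the listed operations. The formulas are assumptions based on the usual definitions of these functions (relu with slope α for non-positive inputs, linear as α·s + β); they are not taken from the KDNN sources, and all function names are hypothetical.

```python
import math


def _logistic(s):
    """Logistic (sigmoid) function of a single value."""
    return 1.0 / (1.0 + math.exp(-s))


# Hypothetical forward kernels: map one src element s to one dst element d.
FORWARD = {
    "abs": lambda s, alpha, beta: abs(s),
    "exp": lambda s, alpha, beta: math.exp(s),
    "log": lambda s, alpha, beta: math.log(s),
    "sqrt": lambda s, alpha, beta: math.sqrt(s),
    "tanh": lambda s, alpha, beta: math.tanh(s),
    # Assumed form: alpha is the slope applied to non-positive inputs.
    "relu": lambda s, alpha, beta: s if s > 0 else alpha * s,
    "logistic": lambda s, alpha, beta: _logistic(s),
    "linear": lambda s, alpha, beta: alpha * s + beta,
}

# Hypothetical backward kernels: map (dd, s) to one diff_src element ds.
BACKWARD = {
    "relu": lambda dd, s, alpha, beta: dd if s > 0 else alpha * dd,
    "logistic": lambda dd, s, alpha, beta: dd * _logistic(s) * (1.0 - _logistic(s)),
    "linear": lambda dd, s, alpha, beta: alpha * dd,
}


def eltwise_forward(op, src, alpha=0.0, beta=0.0):
    """Apply one forward operation to every element of src."""
    return [FORWARD[op](s, alpha, beta) for s in src]


def eltwise_backward(op, diff_dst, src, alpha=0.0, beta=0.0):
    """Compute diff_src from diff_dst and src, element by element."""
    return [BACKWARD[op](dd, s, alpha, beta) for dd, s in zip(diff_dst, src)]
```

For example, `eltwise_forward("linear", [1.0, 2.0], alpha=2.0, beta=1.0)` scales each element by α and adds β. The round operation is left out because its rounding mode is not stated here.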

Feature Scope

Propagation Directions and Data Types

Operation | Data Type | Supported Propagation Direction
abs | f32 | Forward: dnnl_forward_training, dnnl_forward_inference
exp | f32 | Forward: dnnl_forward_training, dnnl_forward_inference
log | f32 | Forward: dnnl_forward_training, dnnl_forward_inference
sqrt | f32 | Forward: dnnl_forward_training, dnnl_forward_inference
round | f32 | Forward: dnnl_forward_training, dnnl_forward_inference
tanh | f32 | Forward: dnnl_forward_training, dnnl_forward_inference
relu | f32/f16/bf16 | Forward: dnnl_forward_training, dnnl_forward_inference; Backward: dnnl_backward_data
logistic | f32/f16/bf16 | Forward: dnnl_forward_training, dnnl_forward_inference; Backward: dnnl_backward_data
linear | f32/f16/bf16 | Forward: dnnl_forward_training, dnnl_forward_inference; Backward: dnnl_backward_data

Dimensions and Data Layout

The Eltwise operator in KDNN supports 1D to 5D tensors with a sequential data layout.

Table 1 Mapping between each tensor dimension and parameter data layout

Tensor Dimension | Input Tensor (src) Data Layout | Output Tensor (dst) Data Layout
1D | dnnl_a | dnnl_a
2D | dnnl_ab | dnnl_ab
3D | dnnl_abc | dnnl_abc
4D | dnnl_abcd | dnnl_abcd
5D | dnnl_abcde | dnnl_abcde

Layer Normalization

Function Description

Function

Performs layer normalization.

Formula

The formula of the layer normalization operator in the case of three dimensions is as follows:

The mean and variance can be calculated at run time or provided by the user. To calculate the two values at run time, use the following formulas:

Table 1 Formula parameters describes the parameters in the formulas.

Table 1 Formula parameters

Parameter

Description

Scale

Shift

Mean

Variance

Constant, used to improve numerical stability
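
As a rough illustration of the quantities listed above, the sketch below normalizes each row of a 2D input over its last dimension. It assumes the common layer normalization definition (subtract the mean, divide by the square root of the variance plus a small constant, then apply scale and shift); the function name and the `eps` default are hypothetical, and the 3D case mentioned above would apply the same steps to each innermost row.

```python
import math


def layer_norm(src, scale=None, shift=None, eps=1e-5, mean=None, variance=None):
    """Normalize each row of src (a list of equally sized rows).

    mean/variance hold one value per row. When they are None they are
    computed at run time; otherwise the user-provided values are used.
    """
    dst = []
    for r, row in enumerate(src):
        c = len(row)
        mu = mean[r] if mean is not None else sum(row) / c
        if variance is not None:
            var = variance[r]
        else:
            var = sum((x - mu) ** 2 for x in row) / c
        inv_std = 1.0 / math.sqrt(var + eps)  # eps keeps the division stable
        out = []
        for i, x in enumerate(row):
            y = (x - mu) * inv_std
            if scale is not None:
                y *= scale[i]  # per-channel scale
            if shift is not None:
                y += shift[i]  # per-channel shift
            out.append(y)
        dst.append(out)
    return dst
```

Passing `mean` and `variance` corresponds to the user-provided statistics case; leaving them as `None` corresponds to computing them at run time.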

Feature Scope

Data Types

Table 1 Parameter data types

Input Data (src) Type | Output Data (dst) Type | Scale and Shift Data Type
f32 | f32 | f32
f16 | f16 | f16
bf16 | bf16 | bf16

Propagation Directions and Flags

flag

Propagation Direction

dnnl_normalization_flags_none (default normalization)

  • Forward:
    • dnnl_forward_training
    • dnnl_forward_inference
  • Backward:
    • dnnl_backward_data
    • dnnl_backward

dnnl_use_global_stats (global statistics)

  • Forward:
    • dnnl_forward_training
    • dnnl_forward_inference
  • Backward:
    • dnnl_backward_data
    • dnnl_backward

dnnl_use_scale (enabling the scale parameter)

Forward:

  • dnnl_forward_training
  • dnnl_forward_inference

dnnl_use_shift (enabling the shift parameter)

Forward:

  • dnnl_forward_training
  • dnnl_forward_inference

dnnl_use_global_stats | dnnl_use_scale | dnnl_use_shift

Forward:

  • dnnl_forward_training
  • dnnl_forward_inference

Data Layout

Table 2 Mapping between each tensor dimension and parameter data layout

Tensor Dimension | Input Tensor (src) Data Layout | Output Tensor (dst) Data Layout | Tensor Data Layout for Mean and Variance
2D | dnnl_ab | dnnl_ab | dnnl_a
3D | dnnl_abc | dnnl_abc | dnnl_ab
4D | dnnl_abcd | dnnl_abcd | dnnl_abc
5D | dnnl_abcde | dnnl_abcde | dnnl_abcd

Inner Product

Function Description

Function

Calculates the matrix inner product.

Formula

In a 2D case, the formula for calculating the matrix inner product is as follows:

High-dimensional tensors are flattened into 2D tensors for calculation.

Table 1 Formula parameters

Parameter | Description
n | Batch size
ic | Number of input channels
oc | Number of output channels
src | Input tensor
weights | Weight tensor
bias | Bias tensor
dst | Output result tensor
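
The parameters above can be tied together with a small sketch of the 2D case. It assumes the conventional definition dst(n, oc) = bias(oc) + Σ over ic of src(n, ic) · weights(oc, ic), with weights stored as oc rows of ic values; this layout and the function name are assumptions made for illustration.

```python
def inner_product(src, weights, bias=None):
    """2D inner product.

    src:     n rows of ic values
    weights: oc rows of ic values (assumed layout)
    bias:    oc values, or None when no bias is used
    """
    dst = []
    for row in src:
        out = []
        for oc, w_row in enumerate(weights):
            if len(w_row) != len(row):
                raise ValueError("src and weights disagree on ic")
            acc = bias[oc] if bias is not None else 0.0
            for x, w in zip(row, w_row):
                acc += x * w  # accumulate over the input channels
            out.append(acc)
        dst.append(out)
    return dst
```

A higher-dimensional src would first be flattened to n rows of ic values, as noted above, before the same loop is applied.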

Feature Scope

Propagation Directions

Propagation Direction | Specific Category
Forward | dnnl_forward_training, dnnl_forward_inference
Backward | dnnl_backward_data, dnnl_backward_weights

Data Types

Table 1 Parameter data types of the forward direction

Input Data (src) Type | Weight Data Type | Output Data (dst) Type | Bias Data Type
f32 | f32 | f32 | f32/none
f16 | f16 | f16 | f16/none
bf16 | bf16 | bf16 | bf16/none
f16 | f16 | f32 | f32/none
bf16 | bf16 | f32 | f32/none

Table 2 Parameter data types of the backward direction (dnnl_backward_data category)

Input Data (src) Type | Weight Data Type | Output Data (dst) Type
f32 | f32 | f32
f16 | f16 | f16
bf16 | bf16 | bf16
f32 | f16 | f16
f32 | bf16 | bf16

Table 3 Parameter data types of the backward direction (dnnl_backward_weights category)

Input Data (src) Type | Weight Data Type | Output Data (dst) Type | Bias Data Type
f32 | f32 | f32 | f32/none
f16 | f16 | f16 | f16/none
bf16 | bf16 | bf16 | bf16/none
f16 | f32 | f16 | f32/none
bf16 | f32 | bf16 | f32/none

Softmax

Function Description

Function

Performs the Softmax function operation along a data dimension.

Formula

Table 1 Formula parameters

Parameter | Description
src | Input tensor
dst | Output tensor
c | Dimension of the softmax operation
ou | The outermost dimension
in | The innermost dimension

A coefficient is used to generate numerically stable output results. The coefficient is calculated using the following formula, where ic is all intermediate dimensions from the outermost to the innermost dimension.
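
The role of the stabilizing coefficient can be illustrated with a short sketch. It assumes the common choice of subtracting the maximum value along the softmax dimension before exponentiating, which leaves the result unchanged but avoids overflow; the function name is hypothetical, and a full operator would apply this to every slice along the softmax dimension.

```python
import math


def softmax(values):
    """Numerically stable softmax over one slice along the softmax dimension."""
    nu = max(values)  # assumed stabilizing coefficient: the slice maximum
    exps = [math.exp(v - nu) for v in values]
    total = sum(exps)
    return [e / total for e in exps]
```

Without the subtraction, an input such as `[1000.0, 1000.0]` would overflow in `math.exp`; with it, the exponent never exceeds zero.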

Feature Scope

Propagation Directions

Propagation Direction | Specific Category
Forward | dnnl_forward_training, dnnl_forward_inference
Backward | dnnl_backward_data

Data Types

Table 1 Parameter data types

Input Data (src) Type | Output Data (dst) Type
f32 | f32
f16 | f16
bf16 | bf16

Dimensions and Data Layout

Table 2 Mapping between each tensor dimension and parameter data layout

Tensor Dimension | Input Tensor (src) Data Layout | Output Tensor (dst) Data Layout
1D | dnnl_a | dnnl_a
2D | dnnl_ab | dnnl_ab
3D | dnnl_abc | dnnl_abc
4D | dnnl_abcd | dnnl_abcd
5D | dnnl_abcde | dnnl_abcde

Sum

Function Description

Function

Calculates the sum of N tensors.

Formula

Table 1 Formula parameters

Parameter | Description
src | Input tensor
dst | Output tensor
scales | Scaling coefficient
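
A minimal sketch of the operation, assuming the usual definition in which each of the N input tensors is multiplied by its own scaling coefficient before the element-wise sum. Tensors are represented as flat lists, and the function name is hypothetical.

```python
def scaled_sum(srcs, scales):
    """Element-wise sum of N equally sized tensors, each weighted by its scale."""
    if len(srcs) != len(scales):
        raise ValueError("one scale is required per input tensor")
    length = len(srcs[0])
    if any(len(s) != length for s in srcs):
        raise ValueError("all input tensors must have the same size")
    return [
        sum(scale * src[j] for scale, src in zip(scales, srcs))
        for j in range(length)
    ]
```

For instance, `scaled_sum([[1.0, 2.0], [10.0, 20.0]], [1.0, 0.5])` adds the first tensor to half of the second.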

Feature Scope

Data Types

Table 1 Parameter data types

src Data Type | dst Data Type
f32 | f32
f16 | f16
bf16 | bf16

Data Layout

The KDNN Sum operator supports the following data layout:

  • The data dimension can be 1D to 5D.
  • Each of the N input tensors must have the same dimensions and data layout as the output tensor.

Table 2 Mapping between each tensor dimension and parameter data layout

Tensor Dimension | src Data Layout | dst Data Layout
1D | dnnl_a | dnnl_a
2D | dnnl_ab | dnnl_ab
3D | dnnl_abc | dnnl_abc
3D | dnnl_acb | dnnl_acb
4D | dnnl_abcd | dnnl_abcd
4D | dnnl_acdb | dnnl_acdb
5D | dnnl_abcde | dnnl_abcde
5D | dnnl_acdeb | dnnl_acdeb

Matmul

Function Description

Function

Performs matrix multiplication.

Formula

  • 2D Tensor

  • High-dimensional tensor

    Table 1 Formula parameters

    Parameter | Description
    src | Input tensor
    weights | Weight tensor
    bias | Bias tensor
    dst | Output result tensor
    m, n, k | Matrix dimensions: the input matrices are A(m, k) and B(k, n), and the output matrix is C(m, n).
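
The two cases above can be sketched as follows, assuming the standard definition C(m, n) = Σ over k of A(m, k) · B(k, n), an optional bias with one value per output column, and a leading batch dimension for the high-dimensional case. The bias shape and the function names are assumptions made for illustration.

```python
def matmul_2d(a, b, bias=None):
    """Multiply A (m rows of k values) by B (k rows of n values)."""
    k, n = len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of A and B must match")
    return [
        [
            sum(row[p] * b[p][j] for p in range(k))
            + (bias[j] if bias is not None else 0.0)
            for j in range(n)
        ]
        for row in a
    ]


def matmul_batched(a, b, bias=None):
    """Apply matmul_2d to each pair of matrices along a leading batch dimension."""
    if len(a) != len(b):
        raise ValueError("batch sizes must match")
    return [matmul_2d(x, y, bias) for x, y in zip(a, b)]
```

The batched helper treats every dimension before the last two as a batch index, which is one common reading of the high-dimensional case.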

Feature Scope

Data Types

Table 1 Parameter data types

src Data Type | Weight Data Type | dst Data Type | Bias Data Type
f32 | f32 | f32 | f32
f16 | f16 | f16 | f16
bf16 | bf16 | bf16 | bf16
f16 | f16 | f32 | f32/f16
bf16 | bf16 | f32 | bf16/f32

Data Layout

Table 2 Mapping between each tensor dimension and parameter data layout

Tensor Dimension | src Data Layout | Weight Data Layout | dst Data Layout
2D | dnnl_ab/dnnl_ba | dnnl_ab/dnnl_ba | dnnl_ab/dnnl_ba
3D | dnnl_abc/dnnl_acb | dnnl_abc/dnnl_acb | dnnl_abc/dnnl_acb
4D | dnnl_abcd/dnnl_abdc | dnnl_abcd/dnnl_abdc | dnnl_abcd/dnnl_abdc
5D | dnnl_abcde/dnnl_abced | dnnl_abcde/dnnl_abced | dnnl_abcde/dnnl_abced

Convolution

Function Description

Function

Performs convolution.

Formula

General 2D convolution calculation formula:

Table 1 Parameter description

Parameter | Depth | Height | Width | Comment
Padding: front, top, and left | PDL | PHL | PWL | padding_l indicates the left-side padding of the corresponding vector.
Padding: back, bottom, and right | PDR | PHR | PWR | padding_r indicates the right-side padding of the corresponding vector.
Stride | SD | SH | SW | Stride, which can be set to 1 for continuous convolution.
Dilation | DD | DH | DW | Dilation value, which can be set to 0 for non-dilated convolution.
src | - | - | - | Input tensor
weights | - | - | - | Weight tensor
bias | - | - | - | Bias tensor
dst | - | - | - | Output result tensor
ic | - | - | - | Input channel
oc | - | - | - | Output channel
oh | - | - | - | Output height
ow | - | - | - | Output width
kw | - | - | - | Width of the convolution kernel
kh | - | - | - | Height of the convolution kernel

Feature Scope

Propagation Directions and Data Types

Table 1 Parameter data types of the forward direction

Propagation Direction | src Data Type | Weight Data Type | dst Data Type | Bias Data Type
dnnl_forward_training, dnnl_forward_inference | f32 | f32 | f32 | f32
dnnl_forward_training, dnnl_forward_inference | f16 | f16 | f16 | f16
dnnl_forward_training, dnnl_forward_inference | bf16 | bf16 | bf16 | bf16
dnnl_forward_training, dnnl_forward_inference | f16 | f16 | f32 | f16
dnnl_forward_training, dnnl_forward_inference | bf16 | bf16 | f32 | bf16

Table 2 Parameter data types of the backward direction (dnnl_backward_data category)

Propagation Direction | src Data Type | Weight Data Type | dst Data Type | Bias Data Type
dnnl_backward_data | f32 | f32 | f32 | f32
dnnl_backward_data | f16 | f16 | f16 | f16
dnnl_backward_data | bf16 | bf16 | bf16 | bf16
dnnl_backward_data | f32 | f16 | f16 | f32
dnnl_backward_data | f32 | bf16 | bf16 | f32

Table 3 Parameter data types of the backward direction (dnnl_backward_weights category)

Propagation Direction | src Data Type | Weight Data Type | dst Data Type | Bias Data Type
dnnl_backward_weights | f32 | f32 | f32 | f32
dnnl_backward_weights | f16 | f16 | f16 | f16
dnnl_backward_weights | bf16 | bf16 | bf16 | bf16
dnnl_backward_weights | f16 | f32 | f16 | f16
dnnl_backward_weights | bf16 | f32 | bf16 | bf16

Data Layout

2D convolution is supported. The input and output tensors are 4D. The data layout of src, weights, and dst must meet the following requirements:

Tensor Dimension | src Data Layout | Weight Data Layout | dst Data Layout
4D | dnnl_abcd | dnnl_abcd | dnnl_abcd

Parameter Constraints

Variable Name | Variable Description | Constraint (dnnl_forward_training, dnnl_forward_inference) | Constraint (dnnl_backward_data) | Constraint (dnnl_backward_weights)
mb | batch | >=1 | >=1 | >=1
ic | input channel | >=1 | >=1 | >=1
ih | input height | >=1 | >=1 | >=1
iw | input width | >=1 | >=1 | >=1
oc | output channel | >=1 | >=1 | >=1
kh | kernel height | >=1 | >=1 | >=1
kw | kernel width | >=1 | >=1 | >=1
oh | output height | >=1 | >=1 | >=1
ow | output width | >=1 | >=1 | >=1
sh | height-wise stride | >=1 | >=1 | >=1
sw | width-wise stride | >=1 | >=1 | >=1
dh | height-wise dilation | >=0 | >=0 | >=0
dw | width-wise dilation | >=0 | >=0 | >=0
ph | height padding | >=0 | 0<=ph<=(kh-1) x (dh+1) | >=0
pw | width padding | >=0 | 0<=pw<=(kw-1) x (dw+1) | >=0
DKH | kernel height with dilation | DKH = 1 + (kh-1) x (dh+1) | DKH = 1 + (kh-1) x (dh+1) | DKH = 1 + (oh-1) x sh
DKW | kernel width with dilation | DKW = 1 + (kw-1) x (dw+1) | DKW = 1 + (kw-1) x (dw+1) | DKW = 1 + (ow-1) x sw

Remarks (dnnl_forward_training, dnnl_forward_inference, and dnnl_backward_data): oh and ow can be left blank and deduced automatically by benchdnn, or set by the user. The following requirements must be met:

[] represents rounding down to the nearest integer.
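
To show how the forward-direction variables relate, the sketch below computes an output size from the input size, kernel size, stride, dilation, and padding. It uses the dilated kernel size DKH = 1 + (kh-1) x (dh+1) given above and assumes the standard relation oh = [(ih - DKH + 2 x ph) / sh] + 1 with [] rounding down and the same padding on both sides; the function names are hypothetical.

```python
def dilated_kernel_size(k, dilation=0):
    """Effective kernel extent: 1 + (k - 1) * (dilation + 1)."""
    return 1 + (k - 1) * (dilation + 1)


def conv_output_size(i, k, stride=1, dilation=0, padding=0):
    """Output size along one spatial axis (assumes symmetric padding)."""
    if i < 1 or k < 1 or stride < 1 or dilation < 0 or padding < 0:
        raise ValueError("parameter outside the documented range")
    span = i - dilated_kernel_size(k, dilation) + 2 * padding
    if span < 0:
        raise ValueError("dilated kernel does not fit in the padded input")
    return span // stride + 1  # floor division implements []
```

The same function serves for both height (ih, kh, sh, dh, ph) and width (iw, kw, sw, dw, pw), since the two axes are independent.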

Deconvolution

Function Description

Function

Performs deconvolution. Both forward and backward directions are supported.

Feature Scope

Data Types

Table 1 Parameter data types of the forward direction

Propagation Direction | src Data Type | Weight Data Type | dst Data Type | Bias Data Type
dnnl_forward_training, dnnl_forward_inference | f32 | f32 | f32 | f32
dnnl_forward_training, dnnl_forward_inference | f16 | f16 | f16 | f16
dnnl_forward_training, dnnl_forward_inference | bf16 | bf16 | bf16 | bf16
dnnl_forward_training, dnnl_forward_inference | f16 | f16 | f32 | f16
dnnl_forward_training, dnnl_forward_inference | bf16 | bf16 | f32 | bf16

Table 2 Parameter data types of the backward direction (dnnl_backward_data category)

Propagation Direction | src Data Type | Weight Data Type | dst Data Type | Bias Data Type
dnnl_backward_data | f32 | f32 | f32 | f32
dnnl_backward_data | f16 | f16 | f16 | f16
dnnl_backward_data | bf16 | bf16 | bf16 | bf16
dnnl_backward_data | f32 | f16 | f16 | f32
dnnl_backward_data | f32 | bf16 | bf16 | f32

Table 3 Parameter data types of the backward direction (dnnl_backward_weights category)

Propagation Direction | src Data Type | Weight Data Type | dst Data Type | Bias Data Type
dnnl_backward_weights | f32 | f32 | f32 | f32
dnnl_backward_weights | f16 | f16 | f16 | f16
dnnl_backward_weights | bf16 | bf16 | bf16 | bf16
dnnl_backward_weights | f16 | f32 | f16 | f16
dnnl_backward_weights | bf16 | f32 | bf16 | bf16

Data Layout

2D transposed convolution is supported. The input and output data are 4D tensors. The data layout of src, weights, and dst must meet the following requirements:

Tensor Dimension | src Data Layout | Weight Data Layout | dst Data Layout
4D Tensor | abcd | abcd | abcd

Parameter Constraints

Variable Name | Variable Description | Constraint (FWD_B, FWD_D, FWD_I) | Constraint (BWD_D) | Constraint (BWD_W, BWD_WB)
mb | batch | >=1 | >=1 | >=1
ic | input channel | >=1 | >=1 | >=1
ih | input height | >=1 | >=1 | >=1
iw | input width | >=1 | >=1 | >=1
oc | output channel | >=1 | >=1 | >=1
kh | kernel height | >=1 | >=1 | >=1
kw | kernel width | >=1 | >=1 | >=1
oh | output height | >=1 | >=1 | >=1
ow | output width | >=1 | >=1 | >=1
sh | height-wise stride | >=1 | >=1 | >=1
sw | width-wise stride | >=1 | >=1 | >=1
dh | height-wise dilation | >=0 | >=0 | >=0
dw | width-wise dilation | >=0 | >=0 | >=0
ph | height padding | 0<=ph<=(kh-1) x (dh+1) | 0<=ph<=(kh-1) x (dh+1) | 0<=ph<=(kh-1) x (dh+1)
pw | width padding | 0<=pw<=(kw-1) x (dw+1) | 0<=pw<=(kw-1) x (dw+1) | 0<=pw<=(kw-1) x (dw+1)
DKH | kernel height with dilation | DKH = 1 + (kh-1) x (dh+1) | DKH = 1 + (kh-1) x (dh+1) | DKH = 1 + (ih-1) x (dh+1)
DKW | kernel width with dilation | DKW = 1 + (kw-1) x (dw+1) | DKW = 1 + (kw-1) x (dw+1) | DKW = 1 + (iw-1) x (dw+1)

Remarks (all propagation directions): [] represents rounding down to the nearest integer.
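
As a rough guide to how the forward-direction variables relate, the sketch below computes a transposed-convolution output size. It assumes the standard relation oh = (ih - 1) x sh + DKH - 2 x ph with symmetric padding, which is not stated in this section, and it enforces the padding bound 0<=ph<=(kh-1) x (dh+1) from the constraints above. The function names are hypothetical.

```python
def dilated_kernel_size(k, dilation=0):
    """Effective kernel extent: 1 + (k - 1) * (dilation + 1)."""
    return 1 + (k - 1) * (dilation + 1)


def deconv_output_size(i, k, stride=1, dilation=0, padding=0):
    """Output size along one spatial axis of a transposed convolution."""
    if i < 1 or k < 1 or stride < 1 or dilation < 0:
        raise ValueError("parameter outside the documented range")
    dk = dilated_kernel_size(k, dilation)
    if not 0 <= padding <= dk - 1:
        raise ValueError("padding must satisfy 0 <= p <= (k-1) x (d+1)")
    # Assumed relation: inverse of the usual convolution size formula.
    return (i - 1) * stride + dk - 2 * padding
```

With the padding bound in place, the result is always at least 1 for valid inputs when the stride is 1 or larger and the input size is greater than 1.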

Concat

Function Description

Function

Concatenates N tensors over the specified concat_dimension dimension (represented by C).

Formula

Where:

Table 1 Formula parameters

Parameter | Description
src | Input tensor
dst | Target tensor
ou | The outermost dimension
in | The innermost dimension
c | Dimension along which the tensors are concatenated

The Concat primitive does not distinguish between forward and backward propagation.
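
The roles of ou, c, and in can be illustrated with a small sketch. It assumes each input is a flat row-major buffer viewed as (ou, C_i, in), and that the i-th input is copied into dst at a channel offset equal to the total size of the preceding inputs along the concatenation dimension. The function and argument names are hypothetical.

```python
def concat(srcs, channels, outer, inner):
    """Concatenate flat row-major buffers along the middle dimension.

    srcs[i] has shape (outer, channels[i], inner); the result has shape
    (outer, sum(channels), inner).
    """
    total_c = sum(channels)
    dst = [None] * (outer * total_c * inner)
    offset = 0  # position of the current input along the concat dimension
    for src, c_i in zip(srcs, channels):
        if len(src) != outer * c_i * inner:
            raise ValueError("buffer size does not match its shape")
        for ou in range(outer):
            for c in range(c_i):
                for i in range(inner):
                    dst_idx = (ou * total_c + offset + c) * inner + i
                    dst[dst_idx] = src[(ou * c_i + c) * inner + i]
        offset += c_i
    return dst
```

Because the copy only moves elements, the same routine applies regardless of data type, which is consistent with the primitive not distinguishing propagation directions.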

Feature Scope

Data Types

Table 1 Supported data type combinations (input and output data types being the same)

src1 Data Type | src2 Data Type | ... | srcN Data Type | dst Data Type
f32 | f32 | f32 | f32 | f32
f16 | f16 | f16 | f16 | f16
bf16 | bf16 | bf16 | bf16 | bf16
s32 | s32 | s32 | s32 | s32
s8 | s8 | s8 | s8 | s8
u8 | u8 | u8 | u8 | u8

Data Layout

A maximum of 5 dimensions is supported. Input tensors must have the same number of dimensions, and every dimension other than the concatenation dimension must have the same size. The following data layout formats are supported, and the input and output tensors must use the same layout.

Tensor Dimension | src1/…/srcn/dst
1D Tensor | a
2D Tensor | ab, ba
3D Tensor | abc, acb, bac, bca, cab, cba
4D Tensor | abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab
5D Tensor | abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

Parameter Constraints

Field | Description | Value Range
--sdt | src data type | f32, f16, bf16, s32, s8, u8
--ddt | dst data type | f32, f16, bf16, s32, s8, u8
--stag | src data layout | a, ab, ba, abc, acb, bac, bca, cab, cba, abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab, abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab
--dtag | dst data layout | Same values as --stag
--axis | Concatenation direction | [0, dim_num - 1]
[problem dim] | src0 size:src1 size | N1xN11xN3xN4xN5:N1xN12xN3xN4xN5

All dimensions except the concatenation dimension must have the same length.

Concat requires the input and output tensors to use the same memory layout and the same data type.

Resampling

Function Description

Function

Performs resampling operations on the input tensor. This operator supports two interpolation algorithms: Nearest Neighbor and Linear.

Formula

  • The nearest neighbor interpolation algorithm is dst(n, c, oh, ow) = src(n, c, ih, iw), where:

    • ih = [(oh + 0.5)/Fh − 0.5]
    • iw = [(ow + 0.5)/Fw − 0.5]
  • The bilinear interpolation formula is dst(n, c, oh, ow) = src(n, c, ih0, iw0)*(1 - Wih)*(1 - Wiw) + src(n, c, ih1, iw0)*Wih*(1 - Wiw) + src(n, c, ih0, iw1)*(1 - Wih)*Wiw + src(n, c, ih1, iw1)*Wih*Wiw, where:

    • ih0 = ⌊(oh + 0.5)/Fh − 0.5⌋
    • ih1 = ⌈(oh + 0.5)/Fh − 0.5⌉
    • iw0 = ⌊(ow + 0.5)/Fw − 0.5⌋
    • iw1 = ⌈(ow + 0.5)/Fw − 0.5⌉
    • Wih = (oh + 0.5)/Fh − 0.5 − ih0
    • Wiw = (ow + 0.5)/Fw − 0.5 − iw0

    Table 1 Formula parameters

    Parameter | Description
    src | Input tensor
    dst | Target tensor
    n | Dimension 1 to be sampled
    c | Dimension 2 to be sampled
    ih | Input height
    iw | Input width
    oh | Output height
    ow | Output width
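
The index arithmetic above can be sketched along a single axis. The sketch assumes that the scale factor F is the output size divided by the input size, that [] in the nearest neighbor formula means rounding to the nearest integer, and that indices falling outside the input are clamped to its edges; none of these three points is stated in this section, and the function names are hypothetical.

```python
import math


def _source_coord(o, in_size, out_size):
    """Map output index o to a fractional input coordinate (o + 0.5)/F - 0.5."""
    factor = out_size / in_size  # assumed definition of F
    return (o + 0.5) / factor - 0.5


def _clamp(i, in_size):
    """Keep an index inside [0, in_size - 1]."""
    return max(0, min(in_size - 1, i))


def resample_nearest(src, out_size):
    """Nearest neighbor resampling of a 1D list."""
    n = len(src)
    dst = []
    for o in range(out_size):
        x = _source_coord(o, n, out_size)
        i = _clamp(math.floor(x + 0.5), n)  # [] read as round-to-nearest
        dst.append(src[i])
    return dst


def resample_linear(src, out_size):
    """Linear resampling of a 1D list using the floor/ceil neighbors."""
    n = len(src)
    dst = []
    for o in range(out_size):
        x = _source_coord(o, n, out_size)
        i0, i1 = math.floor(x), math.ceil(x)
        w = x - i0  # weight of the upper neighbor, as in Wih and Wiw
        dst.append(src[_clamp(i0, n)] * (1.0 - w) + src[_clamp(i1, n)] * w)
    return dst
```

The bilinear case applies the same one-axis weights to height and width in turn, which yields the four-term product form given above.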

Feature Scope

Data Types

FWD_D and BWD_D support arbitrary combination of the f32, f16, and bf16 data types.

Propagation Direction | src Data Type | dst Data Type | diff_dst Data Type | diff_src Data Type
FWD_D, BWD_D | f32 | f32 | f32 | f32
FWD_D, BWD_D | f16 | f16 | f16 | f16
FWD_D, BWD_D | bf16 | bf16 | bf16 | bf16

Data Layout

Three to five dimensions are supported. The following data layout formats are supported, and the input and output tensors must use the same layout.

Tensor Dimension | Tag (src/dst/diff_dst/diff_src)
3D Tensor | abc, acb
4D Tensor | abcd, acdb
5D Tensor | abcde, acdeb

Parameter Constraints

Field

Value

--dir

FWD_D [default], BWD_D

--sdt

f32 [default], f16, bf16, s32, s8, u8

--ddt

f32 [default], f16, bf16, s32, s8, u8

--alg

nearest [default], linear

--tag

axb [default], abx

Resampling requires that the memory layout of the input and output tensors be the same, but the dimensions can be different.

For details, see the following cases:

  • 5D: mb4_ic8_id4od8_ih4oh8_iw4ow8, 4×8×4×4×4 (input), 4×8×8×8×8 (output)
  • 4D: mb4_ic8_ih4oh8_iw4ow8, 4×8×4×4 (input), 4×8×8×8 (output)
  • 3D: mb4_ic8_iw4ow8, 4×8×4 (input), 4×8×8 (output)

Note that very large values of id, ih, iw, od, oh, or ow may reduce precision because of a limitation of the test system. This is not a functional defect; other platforms behave the same way. To avoid the issue, keep these parameters below 20000.

Shuffle

Function Description

Function

Shuffles tensor data along a shuffle axis (dimension).

Formula

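The formula is dst(ou, c, in) = src(ou, c', in), where c' and c relate through the equations c = u + v·(C/G) and c' = u·G + v. In the formula, 0 ≤ u < C/G and 0 ≤ v < G, where C is the size of the shuffle axis and G is the group size.

Table 1 Formula parameters

Parameter

Description

src

Input tensor

dst

Target tensor

c

Index along the shuffle axis (the c dimension)

ou

The outermost indexes (to the left of the shuffle axis)

in

The innermost indexes (to the right of the shuffle axis)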
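The shuffle can be sketched in NumPy under the assumption that the operator follows the usual channel-shuffle index relation: dst(ou, c, in) = src(ou, c', in) with c = u + v·(C/G) and c' = u·G + v, where C is the size of the shuffle axis and G is the group size. Both function names are hypothetical. The first follows the relation literally, and the second obtains the same result with a reshape and an axis swap.

```python
import numpy as np

def shuffle_reference(src: np.ndarray, axis: int, group: int) -> np.ndarray:
    """Shuffle that applies the assumed index relation one slice at a time."""
    size = src.shape[axis]
    if group < 1 or size % group != 0:
        raise ValueError("group must be >= 1 and divide the size of the axis")
    dst = np.empty_like(src)
    for u in range(size // group):
        for v in range(group):
            c = u + v * (size // group)
            c_prime = u * group + v
            dst_idx = [slice(None)] * src.ndim
            src_idx = [slice(None)] * src.ndim
            dst_idx[axis] = c
            src_idx[axis] = c_prime
            # Copies every outer/inner index combination for this (c, c') pair.
            dst[tuple(dst_idx)] = src[tuple(src_idx)]
    return dst

def shuffle_fast(src: np.ndarray, axis: int, group: int) -> np.ndarray:
    """Same result: split the axis into (C/G, G), swap the two parts, merge."""
    shape = src.shape
    size = shape[axis]
    split = src.reshape(shape[:axis] + (size // group, group) + shape[axis + 1:])
    return np.swapaxes(split, axis, axis + 1).reshape(shape)

x = np.arange(2 * 6 * 3).reshape(2, 6, 3)
shuffled = shuffle_fast(x, axis=1, group=2)
```

For a 1D tensor [0, 1, 2, 3, 4, 5] with a group size of 2, both functions return [0, 2, 4, 1, 3, 5].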

Feature Scope

Data Types

Supported data types are as follows. The src and dst data types must be the same.

Propagation Direction

src Data Type

dst Data Type

FWD_D, BWD_D

f32

f32

FWD_D, BWD_D

f16

f16

FWD_D, BWD_D

bf16

bf16

FWD_D, BWD_D

s32

s32

FWD_D, BWD_D

s8

s8

FWD_D, BWD_D

u8

u8

Data Layout

One to five dimensions are supported. The following data layout formats are supported, and the input and output tensors must use the same layout.

Tensor Dimension

src Data Layout

dst Data Layout

1D Tensor

a

a

2D Tensor

ab, ba

ab, ba

3D Tensor

abc, acb, bac, bca, cab, cba

abc, acb, bac, bca, cab, cba

4D Tensor

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

5D Tensor

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

Parameter Constraints

Field

Description

Value Range

--dir

Propagation direction

FWD_D (default value)

BWD_D

--dt

src or dst data type

f32 (default value)

f16

bf16

s32

s8

u8

--tag

src or dst data layout

a

ab

ba

abc

acb

bac

bca

cab

cba

abcd

abdc

acbd

acdb

adbc

adcb

bacd

bcda

cdab

cdba

dcab

abcde

abced

abdec

acbde

acdeb

adecb

bacde

bcdea

cdeab

cdeba

decab

--axis

Index of the shuffle axis

[0, number of tensor dimensions − 1]

--group

Group size

Integer greater than or equal to 1 that exactly divides the size of the shuffle axis

[shuffle_desc]

src or dst shape

N1xN2xN3…xN5

Reorder

Function Description

Function

Reorders tensors into arbitrary memory layout formats and data types.

Formula

The formula is as follows:

Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type

dst Data Type

f32

f32

f16

f16

bf16

bf16

s32

s32

s8

s8

u8

u8

Data Layout

One to five dimensions are supported. The following data layout formats are supported.

Tensor Dimension

src Data Layout

dst Data Layout

1D Tensor

a

a

2D Tensor

ab, ba

ab, ba

3D Tensor

abc, acb, bac, bca, cab, cba

abc, acb, bac, bca, cab, cba

4D Tensor

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

5D Tensor

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab
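The layout tags can be read as permutations of the logical dimensions. The following NumPy sketch assumes the common convention in which the letters a, b, c, ... name the logical dimensions in order and the tag lists them from the outermost to the innermost position in memory, so that for a logical NCHW tensor the tag abcd means NCHW in memory and acdb means NHWC. The helper names are hypothetical and do not belong to the KDNN API.

```python
import numpy as np

def reorder(src: np.ndarray, dst_tag: str, dst_dtype=None) -> np.ndarray:
    """Return the data of `src` (logical order a, b, c, ...) laid out as `dst_tag`."""
    letters = "abcde"[:src.ndim]
    if sorted(dst_tag) != list(letters):
        raise ValueError(f"tag {dst_tag!r} is not a permutation of {letters!r}")
    perm = [letters.index(ch) for ch in dst_tag]
    # Physical axis i holds logical axis perm[i]; make the buffer contiguous.
    physical = np.ascontiguousarray(src.transpose(perm))
    return physical if dst_dtype is None else physical.astype(dst_dtype)

def logical_view(physical: np.ndarray, tag: str) -> np.ndarray:
    """Undo the permutation to recover the logical (a, b, c, ...) view."""
    letters = "abcde"[:physical.ndim]
    perm = [letters.index(ch) for ch in tag]
    return physical.transpose(np.argsort(perm))

x = np.arange(2 * 3 * 4 * 5, dtype=np.float32).reshape(2, 3, 4, 5)
nhwc = reorder(x, "acdb")                 # buffer shape (2, 4, 5, 3)
half = reorder(x, "abcd", np.float16)     # same layout, different data type
```

A reorder changes only where each element is stored (and optionally its data type); the logical tensor recovered with `logical_view` is unchanged.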

Pool

Function Description

Function

Implements maximum and average pooling, which reduce the spatial size of a tensor while preserving key features.

Feature Scope

Data Types

Propagation Direction

src Data Type

dst Data Type

alg Type

forward/backward

f32

f32

max, avg_p, avg_np

forward/backward

f16

f16

max, avg_p, avg_np

forward/backward

bf16

bf16

max, avg_p, avg_np

forward

s8

s8

max, avg_p, avg_np

forward

u8

u8

max, avg_p, avg_np

forward

s32

s32

max, avg_p, avg_np

Data Layout

One to five dimensions are supported. The following data layout formats are supported.

Tensor Dimension

src Data Layout

dst Data Layout

1D Tensor

a

a

2D Tensor

ab, ba

ab, ba

3D Tensor

abc, acb, bac, bca, cab, cba

abc, acb, bac, bca, cab, cba

4D Tensor

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

5D Tensor

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab
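The three alg types can be compared with a small 1D NumPy sketch. It assumes that avg_p averages over the full kernel size with padding counted as zeros, and that avg_np averages only over the elements that lie inside the tensor. The function `pool1d` and its `kernel`, `stride`, and `pad` parameters are hypothetical and only illustrate the idea.

```python
import numpy as np

def pool1d(src: np.ndarray, kernel: int, stride: int, pad: int, alg: str) -> np.ndarray:
    """1D pooling over the last axis; `alg` is 'max', 'avg_p', or 'avg_np'."""
    if pad >= kernel:
        raise ValueError("this sketch requires pad < kernel")
    width = src.shape[-1]
    out_width = (width + 2 * pad - kernel) // stride + 1
    dst = np.empty(src.shape[:-1] + (out_width,), dtype=np.float32)
    for o in range(out_width):
        start = o * stride - pad
        lo, hi = max(start, 0), min(start + kernel, width)  # window part inside src
        window = src[..., lo:hi]
        if alg == "max":
            dst[..., o] = window.max(axis=-1)
        elif alg == "avg_p":
            dst[..., o] = window.sum(axis=-1) / kernel       # padding included
        elif alg == "avg_np":
            dst[..., o] = window.sum(axis=-1) / (hi - lo)    # padding excluded
        else:
            raise ValueError(f"unknown alg: {alg}")
    return dst

x = np.array([[1.0, 2.0, 3.0, 4.0]], dtype=np.float32)
res_max = pool1d(x, kernel=2, stride=2, pad=1, alg="max")
res_avg_p = pool1d(x, kernel=2, stride=2, pad=1, alg="avg_p")
res_avg_np = pool1d(x, kernel=2, stride=2, pad=1, alg="avg_np")
```

The two average variants differ only at the borders, where the window overlaps the padding.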

Batch Normalization (bnormal)

Function Description

Function

Performs batch normalization on tensors.

Formula

The formula is .

For details, see parameter description in Layer Normalization.
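As a rough illustration of what batch normalization computes, the following NumPy sketch assumes the conventional definition: each channel is normalized with the mean and (biased) variance taken over the batch and spatial dimensions, then scaled and shifted. It is not taken from the KDNN implementation. The function name, the (N, C, H, W) shape, and the default `eps` are assumptions.

```python
import numpy as np

def batch_norm_forward(src: np.ndarray, gamma: np.ndarray, beta: np.ndarray,
                       eps: float = 1e-5) -> np.ndarray:
    """Normalize an (N, C, H, W) tensor per channel using batch statistics."""
    mean = src.mean(axis=(0, 2, 3), keepdims=True)
    var = src.var(axis=(0, 2, 3), keepdims=True)  # biased variance
    normalized = (src - mean) / np.sqrt(var + eps)
    # gamma and beta hold one scale and one shift value per channel.
    return gamma.reshape(1, -1, 1, 1) * normalized + beta.reshape(1, -1, 1, 1)

rng = np.random.default_rng(0)
x = rng.normal(3.0, 2.0, size=(4, 3, 5, 5)).astype(np.float32)
y = batch_norm_forward(x, gamma=np.ones(3), beta=np.zeros(3))
```

With a scale of 1 and a shift of 0, every channel of the output has a mean close to 0 and a variance close to 1.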

Feature Scope

Data Types

Propagation Direction

src/dst Data Type

forward/backward

f32, bf16, f16

Data Layout

Three to five dimensions are supported. The following data layout formats are supported. The input and output tensors must use the same layout.

Tensor Dimension

src Data Layout

dst Data Layout

3D Tensor

abc, acb, bac, bca, cab, cba

abc, acb, bac, bca, cab, cba

4D Tensor

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

5D Tensor

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

Local Response Normalization (lrn)

Function Description

Function

Performs local response normalization.

Formula

The cross-channel formula is as follows:

The single-channel formula is as follows:

Table 1 Formula parameters

Parameter

Description

dst

Target tensor

src

Input tensor

k

Local constant

a

Response constant

Constant, used to improve numerical stability

Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type

dst Data Type

f32

f32

f16

f16

bf16

bf16

Data Layout

Three to five dimensions are supported. The following data layout formats are supported.

Tensor Dimension

src Data Layout

dst Data Layout

3D Tensor

abc, acb, bac, bca, cab, cba

abc, acb, bac, bca, cab, cba

4D Tensor

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

5D Tensor

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

Reduction

Function Description

Function

Applies a specified reduction operation along one or more dimensions of a tensor.

Formula

The formula is , where reduce_op includes the operations listed in Table 1 reduce_op algorithm operations.

Table 1 reduce_op algorithm operations

reduce_op

Function

max

Obtains the maximum value of a tensor along the reduction dimension.

min

Obtains the minimum value of a tensor along the reduction dimension.

sum

Obtains the sum of elements in a tensor along the reduction dimension.

mul

Obtains the product of elements in a tensor along the reduction dimension.

mean

Obtains the average value of elements in a tensor along the reduction dimension.

Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type

dst Data Type

f32

f32

f16

f16

bf16

bf16

Data Layout

One to five dimensions are supported. The following data layout formats are supported.

Tensor Dimension

src Data Layout

dst Data Layout

1D Tensor

a

a

2D Tensor

ab

ab

3D Tensor

abc

abc

4D Tensor

abcd

abcd

5D Tensor

abcde

abcde

Parameter Constraints

Each reduced dimension must have size 1 in the dst tensor. Examples:

  • A 5D src tensor (5×6×7×8×9) paired with a 5D dst tensor (1×1×1×1×1) indicates reduction across all dimensions (dimensions 1 through 5).
  • A 5D src tensor (5×6×7×8×9) paired with a 5D dst tensor (5×6×7×8×1) indicates reduction only along the innermost dimension (dimension 5).
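The rule that the dst shape selects the reduced dimensions can be sketched in NumPy. The helper `reduce_to` is hypothetical. It infers the reduction axes by comparing the src and dst shapes and maps the reduce_op names from the table to NumPy functions (mul is mapped to `np.prod`).

```python
import numpy as np

REDUCE_OPS = {
    "max": np.max,
    "min": np.min,
    "sum": np.sum,
    "mul": np.prod,
    "mean": np.mean,
}

def reduce_to(src: np.ndarray, dst_shape: tuple, reduce_op: str) -> np.ndarray:
    """Reduce `src` along every dimension whose size in `dst_shape` is 1."""
    if len(dst_shape) != src.ndim:
        raise ValueError("src and dst must have the same number of dimensions")
    axes = []
    for d, (s, t) in enumerate(zip(src.shape, dst_shape)):
        if t == s:
            continue  # dimension is kept as is
        if t != 1:
            raise ValueError(f"dimension {d} of dst must be {s} or 1")
        axes.append(d)
    if not axes:
        return src.copy()
    # keepdims=True leaves the reduced dimensions in place with size 1.
    return REDUCE_OPS[reduce_op](src, axis=tuple(axes), keepdims=True)

src = np.ones((5, 6, 7, 8, 9), dtype=np.float32)
total = reduce_to(src, (1, 1, 1, 1, 1), "sum")     # reduces all five dimensions
inner = reduce_to(src, (5, 6, 7, 8, 1), "mean")    # reduces only the last one
```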

PReLU

Function Description

Function

Performs the parametric ReLU (PReLU) operation, a variant of the Rectified Linear Unit (ReLU) activation function with a trainable slope for negative inputs.

Formula

The forward formula is as follows:

The backward formula is as follows:

Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type

dst Data Type

f32

f32

f16

f16

bf16

bf16

s32

s32

s8

s8

u8

u8

Data Layout

One to five dimensions are supported. The following data layout formats are supported.

Tensor Dimension

src Data Layout

dst Data Layout

1D Tensor

a

a

2D Tensor

ab, ba

ab, ba

3D Tensor

abc, acb, bac, bca, cab, cba

abc, acb, bac, bca, cab, cba

4D Tensor

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

5D Tensor

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

Binary

Function Description

Function

Returns the result of an element-wise operation between tensors src0 and src1, with support for reordering to arbitrary layouts and conversion to arbitrary data types.

Formula

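The formula is dst(x) = op(src0(x), src1(x)), where x is an element position and op is one of the operations listed in the following table.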

Table 1 Operator operation

Operation

Function

add

Addition

minus

Subtraction

multiply

Multiplication

div

Division

gt

Greater than

ge

Greater than or equal to

lt

Less than

le

Less than or equal to

ne

Not equal to

The binary operator does not distinguish between forward and backward propagation.
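The operations in the table map directly onto NumPy element-wise functions. The sketch below is illustrative only: the `binary` helper is hypothetical, it requires src0 and src1 to have the same shape, and it assumes that comparison results are stored as 1 and 0 in the dst data type.

```python
import numpy as np

BINARY_OPS = {
    "add": np.add,
    "minus": np.subtract,
    "multiply": np.multiply,
    "div": np.divide,
    "gt": np.greater,
    "ge": np.greater_equal,
    "lt": np.less,
    "le": np.less_equal,
    "ne": np.not_equal,
}

def binary(src0: np.ndarray, src1: np.ndarray, op: str, dst_dtype=np.float32) -> np.ndarray:
    """Apply `op` element by element and convert the result to `dst_dtype`."""
    if src0.shape != src1.shape:
        raise ValueError("this sketch requires src0 and src1 to have the same shape")
    result = BINARY_OPS[op](src0, src1)
    # Comparisons return booleans; the cast turns them into 1 and 0 (assumption).
    return result.astype(dst_dtype)

a = np.array([1.0, 2.0, 3.0], dtype=np.float32)
b = np.array([3.0, 2.0, 1.0], dtype=np.float32)
added = binary(a, b, "add")
greater = binary(a, b, "gt")
```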

Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type

dst Data Type

f32

f32

f16

f16

bf16

bf16

s32

s32

s8

s8

u8

u8

Data Layout

One to five dimensions are supported. The following data layout formats are supported, and the input and output tensors must use the same layout.

Tensor Dimension

src0/src1 Data Layout

dst Data Layout

1D Tensor

a

a

2D Tensor

ab, ba

ab, ba

3D Tensor

abc, acb, bac, bca, cab, cba

abc, acb, bac, bca, cab, cba

4D Tensor

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab

5D Tensor

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

RNN

Function Description

Function

Processes sequential or time-series data to train machine learning models that can generate sequential predictions or derive conclusions from sequence-based inputs.

Formula

The RNN formula is as follows:

Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type

dst Data Type

f32

f32

bf16

bf16

Data Layout

The data layout is fixed.

Input Data

Dimension

Layout

src_layer

3

{time_step, batch, slc}

src_iter

4

{layer_num, dir, batch, sic}

weight_layer

5

{layer_num, dir, slc, gates, dhc}

weight_iter

5

{layer_num, dir, sic, gates, dic}

dst_layer

3

{time_step, batch, dhc}

dst_iter

4

{layer_num, dir, batch, dic}

Group Normalization

Function Description

Function

Performs group normalization by channel.

Formula

The formula is as follows:

where the shape of the input src is defined by (N, C, H, W), and G indicates the number of groups.

  • γ, β: scaling and shift coefficients
  • μ, σ²: mean and variance
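A minimal NumPy sketch of group normalization for an (N, C, H, W) input with G groups follows. It assumes the conventional definition, in which the mean and variance are computed per sample over each group of C/G channels and the scaling and shift coefficients are applied per channel. The function name and the default `eps` are assumptions, not part of the KDNN API.

```python
import numpy as np

def group_norm_forward(src: np.ndarray, groups: int, gamma: np.ndarray,
                       beta: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """Normalize each group of C/G channels of every sample, then scale and shift."""
    n, c, h, w = src.shape
    if c % groups != 0:
        raise ValueError("C must be divisible by the number of groups")
    grouped = src.reshape(n, groups, c // groups, h, w)
    mean = grouped.mean(axis=(2, 3, 4), keepdims=True)  # one mean per (sample, group)
    var = grouped.var(axis=(2, 3, 4), keepdims=True)    # one variance per (sample, group)
    normalized = ((grouped - mean) / np.sqrt(var + eps)).reshape(n, c, h, w)
    return gamma.reshape(1, c, 1, 1) * normalized + beta.reshape(1, c, 1, 1)

rng = np.random.default_rng(0)
x = rng.normal(1.0, 3.0, size=(2, 4, 3, 3)).astype(np.float32)
y = group_norm_forward(x, groups=2, gamma=np.ones(4), beta=np.zeros(4))
```

Unlike batch normalization, the statistics here do not depend on the other samples in the batch.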
Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type

dst Data Type

f32

f32

f16

f16

bf16

bf16

s8

s8

u8

u8

The data types of mean, variance, scale, and shift are independent of src and dst, and remain f32.

Data Layout

Two to five dimensions are supported. The following data layout formats are supported. The input and output tensors must use the same layout.

Tensor Dimension

src Data Layout

dst Data Layout

2D Tensor

ab

ab

3D Tensor

abc, acb

abc, acb

4D Tensor

abcd, acdb

abcd, acdb

5D Tensor

abcde, acdeb

abcde, acdeb

SparseGemm

Function Description

Function

Computes the product of a sparse matrix and a dense matrix. The operator is built on the compressed sparse row (CSR) storage format. It skips zero blocks during loading and computation to improve computational efficiency and memory bandwidth utilization. The core computing kernel uses SIMD and is optimized for the Kunpeng platform.

Formula

The SparseGemm operator computes the following matrix multiplication:

C = α·A·B + β·C

where:

  • A: sparse matrix
  • B: dense matrix
  • C: output matrix
  • α and β: optional scaling coefficients
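The idea of a CSR-based product can be sketched in plain NumPy, assuming the operator computes C = α·A·B + β·C. This is not the KDNN kernel: the helper names are hypothetical, the loop skips individual zero elements rather than zero blocks, and no SIMD optimization is involved.

```python
import numpy as np

def dense_to_csr(a: np.ndarray):
    """Convert a dense 2D matrix to CSR arrays (data, indices, indptr)."""
    data, indices, indptr = [], [], [0]
    for row in a:
        cols = np.nonzero(row)[0]
        data.extend(row[cols])       # nonzero values of this row
        indices.extend(cols)         # their column indexes
        indptr.append(len(indices))  # end offset of this row
    return (np.array(data, dtype=a.dtype),
            np.array(indices, dtype=np.int64),
            np.array(indptr, dtype=np.int64))

def csr_gemm(data, indices, indptr, b, c, alpha=1.0, beta=0.0):
    """Compute alpha * A @ B + beta * C, visiting only the stored nonzeros of A."""
    out = beta * c
    for i in range(len(indptr) - 1):
        for k in range(indptr[i], indptr[i + 1]):
            out[i, :] += alpha * data[k] * b[indices[k], :]
    return out

rng = np.random.default_rng(0)
a = rng.random((4, 5)).astype(np.float32)
a[a < 0.7] = 0.0                     # make A sparse
b = rng.random((5, 3)).astype(np.float32)
c = rng.random((4, 3)).astype(np.float32)
result = csr_gemm(*dense_to_csr(a), b, c, alpha=2.0, beta=0.5)
```

Only the nonzero entries of A contribute work, which is why the cost scales with the number of stored elements rather than with the full size of A.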
Feature Scope

Data Types

Table 1 Supported parameter data types

A Data Type

B Data Type

C Data Type

f32

f32

f32

Data Layout

Tensor Dimension

A Data Layout

B Data Layout

C Data Layout

2D

Layout::AB

Layout::AB

Layout::AB

KDNN_EXT Operators

Operator Description

KDNN_EXT is an extension module of KDNN. It has the following features:

  • Easy-to-use interfaces: The Cython framework is used to provide Python interfaces, making it more suitable for user scenarios.
  • High performance: The underlying implementation is written in C, providing high-performance interfaces.

The following operators are available:

  • random_choice
  • softmax

Operator Definition

softmax

Softmax is a common activation function used in multi-classification problems. It converts a set of arbitrary real numbers into a probability distribution whose output values range from 0 to 1, and the sum of all output values is 1.

The main features are as follows:

  • Normalized output: The softmax function normalizes the input so that the output is a valid probability distribution. For any real-valued input, the outputs sum to 1. It is commonly used at the output layer of multi-classification problems.
  • Non-linear: The softmax function is non-linear. It applies a non-linear transformation to the input, which increases the representation capability of the model and helps it fit complex data patterns.
  • Translation invariance: The softmax function is invariant to translation of the input. That is, adding the same constant to (or subtracting it from) every element of the input vector does not change the softmax output.
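These properties can be checked with a NumPy reference implementation. The sketch below is not the KDNN_EXT implementation: `softmax_reference` is a hypothetical name, and normalizing each row of a 2D input along its last axis is an assumption. Subtracting the row maximum before exponentiation relies on translation invariance and avoids overflow for large inputs.

```python
import numpy as np

def softmax_reference(arr: np.ndarray) -> np.ndarray:
    """Softmax over the last axis of a 1D or 2D array (assumed axis)."""
    x = np.atleast_2d(arr).astype(np.float32)
    shifted = x - x.max(axis=-1, keepdims=True)  # does not change the result
    e = np.exp(shifted)
    out = e / e.sum(axis=-1, keepdims=True)
    return out.reshape(arr.shape)

x = np.array([1.0, 2.0, 3.0], dtype=np.float32)
p = softmax_reference(x)
p_shifted = softmax_reference(x + 100.0)  # same distribution as p
```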

In a neural network, the softmax function is usually used at the output layer to convert the original output of the neural network into a vector representing class probabilities. During training, the difference between the softmax output and the actual label can be used as a loss function. Through backward propagation, network parameters are updated to minimize the loss and improve model performance.

Interface Definition

def softmax(arr: np.ndarray)->np.ndarray

Receives a 1D or 2D NumPy array and returns the result of softmax calculation.

Input Parameters

Parameter

Type

Description

arr

ndarray

The elements are real numbers in FP32, and the dimension can be 1D or 2D.

Return Value

Type

Description

ndarray

The shape is the same as the input.

Example

>>> import numpy as np
>>> from libkdnn_ext import softmax
>>> x = np.random.rand(1, 5).astype(np.float32)
>>> softmax(x)
array([[0.19810137, 0.21171768, 0.16419397, 0.24222486, 0.1837621 ]], dtype=float32)

random_choice

random_choice is an algorithm used to randomly select elements from a set by probability. In computer science, random selection is a common operation used in scenarios such as random sampling, random arrangement, and Monte Carlo simulation.

The core of the random selection algorithm is to randomly select an element from a given set. For an input whose sum is 1, it randomly selects an element by probability and returns the index of the element.
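Selecting an index by probability can be sketched with inverse transform sampling in NumPy. This is an illustration of the idea, not the KDNN_EXT algorithm: `random_choice_reference` is a hypothetical name, and for seed=-1 the sketch lets NumPy draw entropy from the operating system instead of using a timestamp.

```python
import numpy as np

def random_choice_reference(probs: np.ndarray, seed: int) -> list:
    """Draw one index i with probability probs[i]; returns a one-element list."""
    p = np.asarray(probs, dtype=np.float64).ravel()
    rng = np.random.default_rng(None if seed == -1 else seed)
    cdf = np.cumsum(p)
    # Scale by the total so that small rounding errors in the sum are tolerated.
    r = rng.random() * cdf[-1]
    idx = int(np.searchsorted(cdf, r, side="right"))
    return [min(idx, p.size - 1)]

weights = np.random.default_rng(1).random((1, 1000)).astype(np.float32)
weights = weights / weights.sum(axis=1)
first = random_choice_reference(weights, 2)
second = random_choice_reference(weights, 2)  # same seed, same index
```

With a fixed seed the selection is reproducible, and an element whose probability is 0 is never selected.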

Interface Definition

def random_choice(arr: np.ndarray, seed: int)->List[int]

Receives a NumPy array of probabilities and a random seed, and returns the index selected by random_choice.

Input Parameters

Parameter

Type

Description

arr

ndarray

The data type is FP32, and the dimension can be 1D or 2D. When the dimension is 2D, the shape must be (1, N) or (N, 1).

seed

int

Random seed. If seed=-1, the random seed is generated by the system based on the current timestamp. If seed!=-1, the random seed is the input value.

Return Value

Type

Description

List[int]

The list has length 1 and contains the index of the selected element. If an exception occurs, [-1] is returned.

Example

>>> import numpy as np
>>> from libkdnn_ext import random_choice
>>> a = np.random.rand(1, 70336).astype(np.float32)
>>> a = np.abs(a)
>>> t = a.sum(axis=1)
>>> a = a / t
>>> random_choice(a, -1)
array([17630], dtype=int32)
>>> random_choice(a, 2)
array([49333], dtype=int32)

Obtaining Version Information

Obtains the KDNN_EXT product version information.

Interface Definition

def get_version() -> Dict[bytes, bytes]

Return Value

Type

Description

Dict[bytes, bytes]

The product version information is returned. If an exception occurs, {} is returned.

Example

>>> from libkdnn_ext import get_version
>>> get_version()
{'productName': b'Kunpeng Boostkit', 'productVersion': b'24.0.0', 'componentName': b'BoostKit-kail', 'componentVersion': b'1.0.0', 'componentAppendInfo': b'gcc', 'softwareName': b'boostKit-kail-dnn-ext', 'softwareVersion': b'1.0.0'}

The version number and compile time depend on your environment. The preceding output is for reference only.