API Reference

KDNN Operators

Operator Range

Table 1 Operators available in KDNN lists the operators available in KDNN.

Table 1 Operators available in KDNN

Operator	Description
Eltwise	Element-wise operator
Layer Normalization	Layer normalization operator
Inner Product	Matrix inner product operator
Softmax	Softmax normalization operator
Sum	Sum operator
Matmul	Matrix multiplication operator
Convolution	Convolution operator
Deconvolution	Deconvolution operator
Concat	Concatenation operator
Resampling	Data resampling operator
Shuffle	Data shuffling operator
Reorder	Data reordering operator
Pool	Pooling operator
Batch Normalization (bnormal)	Batch normalization operator
Local Response Normalization (lrn)	Local response normalization operator
Reduction	Reduction operator
PReLU	Activation operator (Leaky ReLU) with a trainable alpha parameter
Binary	Element-wise binary operator on two tensors
RNN	Recurrent neural network operator
Group Normalization	Group normalization operator
SparseGemm	Sparse matrix multiplication operator

Operator Description

Eltwise

Function Description

Function

Performs operations of the same type on each element in a tensor, including abs, exp, and log.

Formula

Table 1 Operation types describes the operations supported by the Eltwise operator.

Table 1 Operation types

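Operation	Propagation Direction	Formula
abs	Forward	d = |s|
exp	Forward	d = e^s
log	Forward	d = ln(s)
sqrt	Forward	d = √s
round	Forward	d = round(s)
tanh	Forward	d = tanh(s)
relu	Forward	d = s if s > 0; d = α·s otherwise
relu	Backward	ds = dd if s > 0; ds = α·dd otherwise
logistic	Forward	d = 1 / (1 + e^(−s))
logistic	Backward	ds = dd · e^(−s) / (1 + e^(−s))²
linear	Forward	d = α·s + β
linear	Backward	ds = α·dd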

Table 2 Formula parameters describes the meanings of the symbols in the preceding formulas.

Table 2 Formula parameters

Parameter	Description
s	Element in the src tensor
d	Element in the dst tensor
ds	Element in the diff_src tensor
dd	Element in the diff_dst tensor
α, β	Input parameters α and β of the constant floating-point type
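
The operations listed above can be sketched for a single element as follows. This is a minimal Python sketch, not KDNN code: the function names `eltwise_forward` and `eltwise_backward` are hypothetical, and the formulas are the conventional definitions of these operations (for example, relu scaling non-positive inputs by α), which are assumed here rather than taken from the KDNN implementation.

```python
import math


def eltwise_forward(op, s, alpha=0.0, beta=0.0):
    """Apply one forward Eltwise operation to a single src element s."""
    if op == "abs":
        return abs(s)
    if op == "exp":
        return math.exp(s)
    if op == "log":
        return math.log(s)
    if op == "sqrt":
        return math.sqrt(s)
    if op == "round":
        return float(round(s))  # Python rounds halves to the nearest even value
    if op == "tanh":
        return math.tanh(s)
    if op == "relu":
        return s if s > 0 else alpha * s
    if op == "logistic":
        return 1.0 / (1.0 + math.exp(-s))
    if op == "linear":
        return alpha * s + beta
    raise ValueError(f"unsupported forward operation: {op}")


def eltwise_backward(op, s, dd, alpha=0.0):
    """Return the diff_src element ds for a src element s and diff_dst element dd."""
    if op == "relu":
        return dd if s > 0 else alpha * dd
    if op == "logistic":
        d = 1.0 / (1.0 + math.exp(-s))
        return dd * d * (1.0 - d)  # derivative of the logistic function
    if op == "linear":
        return alpha * dd
    raise ValueError(f"unsupported backward operation: {op}")
```

For example, `eltwise_forward("linear", 3.0, alpha=2.0, beta=1.0)` evaluates α·s + β for s = 3.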

Feature Scope

Propagation Directions and Data Types

Operation	Data Type	Supported Propagation Direction
abs	f32	Forward: dnnl_forward_training dnnl_forward_inference
exp	f32	Forward: dnnl_forward_training dnnl_forward_inference
log	f32	Forward: dnnl_forward_training dnnl_forward_inference
sqrt	f32	Forward: dnnl_forward_training dnnl_forward_inference
round	f32	Forward: dnnl_forward_training dnnl_forward_inference
tanh	f32	Forward: dnnl_forward_training dnnl_forward_inference
relu	f32/f16/bf16	Forward: dnnl_forward_training dnnl_forward_inference Backward: dnnl_backward_data
logistic	f32/f16/bf16	Forward: dnnl_forward_training dnnl_forward_inference Backward: dnnl_backward_data
linear	f32/f16/bf16	Forward: dnnl_forward_training dnnl_forward_inference Backward: dnnl_backward_data

Dimensions and Data Layout

The Eltwise operator in KDNN supports 1D to 5D tensors in sequential data layouts.

Table 1 Mapping between each tensor dimension and parameter data layout

Tensor Dimension	Input Tensor (src) Data Layout	Output Tensor (dst) Data Layout
1D	dnnl_a	dnnl_a
2D	dnnl_ab	dnnl_ab
3D	dnnl_abc	dnnl_abc
4D	dnnl_abcd	dnnl_abcd
5D	dnnl_abcde	dnnl_abcde

Layer Normalization

Function Description

Function

Performs layer normalization.

Formula

The formula of the layer normalization operator in the case of three dimensions is as follows:

dst(t, n, c) = γ(c) · (src(t, n, c) − μ(t, n)) / √(σ²(t, n) + ε) + β(c)

The mean and variance can be calculated at run time or provided by the user. To calculate the two values at run time, use the following formulas:

μ(t, n) = (1/C) · Σ_c src(t, n, c)
σ²(t, n) = (1/C) · Σ_c (src(t, n, c) − μ(t, n))²

Table 1 Formula parameters describes the parameters in the formulas.

Table 1 Formula parameters

Parameter	Description
γ	Scale
β	Shift
μ	Mean
σ²	Variance
ε	Constant, used to improve numerical stability
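
As an illustration of the run-time calculation, the following Python sketch normalizes each row of a 2D input over its last dimension. It is a sketch under stated assumptions, not the KDNN implementation: the function name `layer_norm`, the default `eps` value, and the use of the biased (divide by C) variance are assumptions.

```python
import math


def layer_norm(src, gamma=None, beta=None, eps=1e-5):
    """Normalize each row of src (a list of rows) over its last dimension.

    gamma (scale) and beta (shift) are optional per-channel lists.
    """
    dst = []
    for row in src:
        c = len(row)
        mean = sum(row) / c
        var = sum((x - mean) ** 2 for x in row) / c  # biased variance (assumption)
        inv_std = 1.0 / math.sqrt(var + eps)
        dst.append([
            (gamma[i] if gamma else 1.0) * (x - mean) * inv_std
            + (beta[i] if beta else 0.0)
            for i, x in enumerate(row)
        ])
    return dst
```

Passing precomputed statistics instead of computing `mean` and `var` in the loop would correspond to the user-provided case.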

Feature Scope

Data Types

Table 1 Parameter data types

Input Data (src) Type	Output Data (dst) Type	Scale and Shift Data Type
f32	f32	f32
f16	f16	f16
bf16	bf16	bf16

Propagation Directions and Flags

Flag	Propagation Direction
dnnl_normalization_flags_none (default normalization)	Forward: dnnl_forward_training dnnl_forward_inference Backward: dnnl_backward_data dnnl_backward
dnnl_use_global_stats (global statistics)	Forward: dnnl_forward_training dnnl_forward_inference Backward: dnnl_backward_data dnnl_backward
dnnl_use_scale (enabling the scale parameter)	Forward: dnnl_forward_training dnnl_forward_inference
dnnl_use_shift (enabling the shift parameter)	Forward: dnnl_forward_training dnnl_forward_inference
dnnl_use_global_stats \| dnnl_use_scale \| dnnl_use_shift	Forward: dnnl_forward_training dnnl_forward_inference

Data Layout

Table 2 Mapping between each tensor dimension and parameter data layout

Tensor Dimension	Input Tensor (src) Data Layout	Output Tensor (dst) Data Layout	Tensor Data Layout for Mean and Variance
2D	dnnl_ab	dnnl_ab	dnnl_a
3D	dnnl_abc	dnnl_abc	dnnl_ab
4D	dnnl_abcd	dnnl_abcd	dnnl_abc
5D	dnnl_abcde	dnnl_abcde	dnnl_abcd

Inner Product

Function Description

Function

Calculates the matrix inner product.

Formula

In a 2D case, the formula for calculating the matrix inner product is as follows:

dst(n, oc) = bias(oc) + Σ_ic src(n, ic) · weights(oc, ic)

High-dimensional tensors are flattened into 2D tensors for calculation.

Table 1 Formula parameters

Parameter	Description
n	Batch number
ic	Number of input channels
oc	Number of output channels
src	Input tensor
weights	Weight tensor
bias	Bias tensor
dst	Output result tensor
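
The 2D case can be sketched in plain Python as follows. The function name `inner_product` is hypothetical, and the weight layout (one row of `ic` values per output channel) is an assumption made for this illustration.

```python
def inner_product(src, weights, bias=None):
    """Compute the 2D inner product.

    src:     n x ic   (list of rows)
    weights: oc x ic  (assumed layout: one row per output channel)
    bias:    list of oc values, or None
    Returns an n x oc result.
    """
    dst = []
    for src_row in src:
        dst_row = []
        for o, w_row in enumerate(weights):
            acc = bias[o] if bias is not None else 0.0
            acc += sum(s * w for s, w in zip(src_row, w_row))
            dst_row.append(acc)
        dst.append(dst_row)
    return dst
```

A higher-dimensional input would first be flattened so that all non-batch dimensions form the `ic` axis.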

Feature Scope

Propagation Directions

Propagation Direction	Specific Category
Forward	dnnl_forward_training dnnl_forward_inference
Backward	dnnl_backward_data dnnl_backward_weights

Data Types

Table 1 Parameter data types of the forward direction

Input Data (src) Type	Weight Data Type	Output Data (dst) Type	Bias Data Type
f32	f32	f32	f32/none
f16	f16	f16	f16/none
bf16	bf16	bf16	bf16/none
f16	f16	f32	f32/none
bf16	bf16	f32	f32/none

Table 2 Parameter data types of the backward direction (dnnl_backward_data category)

Input Data (src) Type	Weight Data Type	Output Data (dst) Type
f32	f32	f32
f16	f16	f16
bf16	bf16	bf16
f32	f16	f16
f32	bf16	bf16

Table 3 Parameter data types of the backward direction (dnnl_backward_weights category)

Input Data (src) Type	Weight Data Type	Output Data (dst) Type	Bias Data Type
f32	f32	f32	f32/none
f16	f16	f16	f16/none
bf16	bf16	bf16	bf16/none
f16	f32	f16	f32/none
bf16	f32	bf16	f32/none

Softmax

Function Description

Function

Applies the softmax function along one dimension of the data.

Formula

dst(ou, c, in) = e^(src(ou, c, in) − ν(ou, in)) / Σ_ic e^(src(ou, ic, in) − ν(ou, in))

Table 1 Formula parameters

Parameter	Description
src	Input tensor
dst	Output tensor
c	Dimension along which the softmax operation is performed
ou	The outermost dimension
in	The innermost dimension
ν	Coefficient used to generate numerically stable output results. The coefficient is calculated using the following formula, where `ic` runs over the softmax dimension between the outermost and the innermost dimensions: ν(ou, in) = max_ic src(ou, ic, in)
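
The role of the stabilizing coefficient can be seen in a small Python sketch that applies softmax to one slice along the softmax dimension. The function name `softmax` is illustrative, and taking the coefficient as the maximum of the slice is the common choice assumed here.

```python
import math


def softmax(values):
    """Numerically stable softmax over one slice along the softmax dimension."""
    nu = max(values)  # stabilizing coefficient: largest value in the slice
    exps = [math.exp(v - nu) for v in values]  # every exponent is <= 0
    total = sum(exps)
    return [e / total for e in exps]
```

Because the maximum is subtracted before exponentiation, an input such as `[1000.0, 1000.0]` does not overflow.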

Feature Scope

Propagation Directions

Propagation Direction	Specific Category
Forward	dnnl_forward_training dnnl_forward_inference
Backward	dnnl_backward_data

Data Types

Table 1 Parameter data types

Input Data (src) Type	Output Data (dst) Type
f32	f32
f16	f16
bf16	bf16

Dimensions and Data Layout

Table 2 Mapping between each tensor dimension and parameter data layout

Tensor Dimension	Input Tensor (src) Data Layout	Output Tensor (dst) Data Layout
1D	dnnl_a	dnnl_a
2D	dnnl_ab	dnnl_ab
3D	dnnl_abc	dnnl_abc
4D	dnnl_abcd	dnnl_abcd
5D	dnnl_abcde	dnnl_abcde

Sum

Function Description

Function

Calculates the sum of N tensors.

Formula

dst(x) = Σ_(i=1..N) scales(i) · src_i(x)

Table 1 Formula parameters

Parameter	Description
src	Input tensor
dst	Output tensor
scales	Scaling coefficient
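
A scaled sum of N tensors can be sketched on flattened data as follows. The function name `scaled_sum` is hypothetical, and applying one scaling coefficient per input tensor is an assumption of this sketch.

```python
def scaled_sum(srcs, scales):
    """Element-wise sum of N equally sized flat tensors, each multiplied by its scale."""
    if len(srcs) != len(scales):
        raise ValueError("one scale is required per input tensor")
    length = len(srcs[0])
    if any(len(src) != length for src in srcs):
        raise ValueError("all input tensors must have the same size")
    return [
        sum(scale * src[i] for src, scale in zip(srcs, scales))
        for i in range(length)
    ]
```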

Feature Scope

Data Types

Table 1 Parameter data types

src Data Type	dst Data Type
f32	f32
f16	f16
bf16	bf16

Data Layout

The KDNN Sum operator supports the following data layout:

The data dimension can be 1D to 5D.
Each of the N input tensors must have the same dimensions and data layout as the output tensor.

Table 2 Mapping between each tensor dimension and parameter data layout

Tensor Dimension	src Data Layout	dst Data Layout
1D	dnnl_a	dnnl_a
2D	dnnl_ab	dnnl_ab
3D	dnnl_abc	dnnl_abc
3D	dnnl_acb	dnnl_acb
4D	dnnl_abcd	dnnl_abcd
4D	dnnl_acdb	dnnl_acdb
5D	dnnl_abcde	dnnl_abcde
5D	dnnl_acdeb	dnnl_acdeb

Matmul

Function Description

Function

Performs matrix multiplication.

Formula

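2D tensor:

dst(m, n) = Σ_k src(m, k) · weights(k, n) + bias(m, n)

High-dimensional tensor:

dst(bs_0, bs_1, …, m, n) = Σ_k src(bs_0, bs_1, …, m, k) · weights(bs_0, bs_1, …, k, n) + bias(bs_0, bs_1, …, m, n)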

Table 1 Formula parameters

Parameter	Description
src	Input tensor
weights	Weight tensor
bias	Bias tensor
dst	Output result tensor
m, n, k	Matrix dimensions: the input matrices are A(m, k) and B(k, n), and the output matrix is C(m, n).
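
The 2D case, C(m, n) = A(m, k) × B(k, n), can be sketched in plain Python as follows. The function name `matmul` is illustrative, and broadcasting a bias vector of length n over all rows is an assumption of this sketch, not a statement about the bias shapes KDNN accepts.

```python
def matmul(a, b, bias=None):
    """Multiply A (m x k) by B (k x n); optionally add a length-n bias to each row."""
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of A and B do not match")
    c = []
    for i in range(m):
        row = []
        for j in range(n):
            acc = bias[j] if bias is not None else 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            row.append(acc)
        c.append(row)
    return c
```

For a high-dimensional tensor, the same product would be repeated for each index of the leading batch dimensions.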

Feature Scope

Data Types

Table 1 Parameter data types

src Data Type	Weight Data Type	dst Data Type	Bias Data Type
f32	f32	f32	f32
f16	f16	f16	f16
bf16	bf16	bf16	bf16
f16	f16	f32	f32/f16
bf16	bf16	f32	bf16/f32

Data Layout

Table 2 Mapping between each tensor dimension and parameter data layout

Tensor Dimension	src Data Layout	Weight Data Layout	dst Data Layout
2D	dnnl_ab/dnnl_ba	dnnl_ab/dnnl_ba	dnnl_ab/dnnl_ba
3D	dnnl_abc/dnnl_acb	dnnl_abc/dnnl_acb	dnnl_abc/dnnl_acb
4D	dnnl_abcd/dnnl_abdc	dnnl_abcd/dnnl_abdc	dnnl_abcd/dnnl_abdc
5D	dnnl_abcde/dnnl_abced	dnnl_abcde/dnnl_abced	dnnl_abcde/dnnl_abced

Convolution

Function Description

Function

Performs convolution.

Formula

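General 2D convolution calculation formula:

dst(n, oc, oh, ow) = bias(oc) + Σ_ic Σ_kh Σ_kw src(n, ic, oh·SH + kh·(DH + 1) − PH_L, ow·SW + kw·(DW + 1) − PW_L) · weights(oc, ic, kh, kw)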

Table 1 Parameter description

Parameter	Depth	Height	Width	Comment
Padding: front, top, and left	PD_L	PH_L	PW_L	`padding_l` indicates the left-side padding of the corresponding vector.
Padding: back, bottom, and right	PD_R	PH_R	PW_R	`padding_r` indicates the right-side padding of the corresponding vector.
Stride	SD	SH	SW	Stride, which can be set to `1` for continuous convolution.
Dilation	DD	DH	DW	Dilation value, which can be set to `0` for non-dilated convolution.
src	-	-	-	Input tensor
weights	-	-	-	Weight tensor
bias	-	-	-	Bias tensor
dst	-	-	-	Output result tensor
ic	-	-	-	Input channel
oc	-	-	-	Output channel
oh	-	-	-	Output height
ow	-	-	-	Output width
kw	-	-	-	Width of the convolution kernel
kh	-	-	-	Height of the convolution kernel
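
To show how stride, dilation, and padding enter the calculation, the following is a naive reference sketch in plain Python. It is not the KDNN implementation: the function name `conv2d`, the symmetric padding, and the convention that a dilation of 0 means no dilation are assumptions made for this illustration.

```python
def conv2d(src, weights, bias=None, stride=(1, 1), pad=(0, 0), dilation=(0, 0)):
    """Naive 2D convolution.

    src:     [n][ic][ih][iw]
    weights: [oc][ic][kh][kw]
    Returns  [n][oc][oh][ow]. Positions that fall into the padding count as zero.
    """
    sh, sw = stride
    ph, pw = pad
    dh, dw = dilation
    n, ic, ih, iw = len(src), len(src[0]), len(src[0][0]), len(src[0][0][0])
    oc, kh, kw = len(weights), len(weights[0][0]), len(weights[0][0][0])
    # Output size from the dilated kernel size 1 + (k - 1) * (d + 1).
    oh = (ih - (1 + (kh - 1) * (dh + 1)) + 2 * ph) // sh + 1
    ow = (iw - (1 + (kw - 1) * (dw + 1)) + 2 * pw) // sw + 1
    dst = [[[[0.0] * ow for _ in range(oh)] for _ in range(oc)] for _ in range(n)]
    for b in range(n):
        for o in range(oc):
            for y in range(oh):
                for x in range(ow):
                    acc = bias[o] if bias is not None else 0.0
                    for c in range(ic):
                        for i in range(kh):
                            for j in range(kw):
                                sy = y * sh + i * (dh + 1) - ph
                                sx = x * sw + j * (dw + 1) - pw
                                if 0 <= sy < ih and 0 <= sx < iw:
                                    acc += src[b][c][sy][sx] * weights[o][c][i][j]
                    dst[b][o][y][x] = acc
    return dst
```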

Feature Scope

Propagation Directions and Data Types

Table 1 Parameter data types of the forward direction

Propagation Direction	src Data Type	Weight Data Type	dst Data Type	Bias Data Type
dnnl_forward_training dnnl_forward_inference	f32	f32	f32	f32
dnnl_forward_training dnnl_forward_inference	f16	f16	f16	f16
dnnl_forward_training dnnl_forward_inference	bf16	bf16	bf16	bf16
dnnl_forward_training dnnl_forward_inference	f16	f16	f32	f16
dnnl_forward_training dnnl_forward_inference	bf16	bf16	f32	bf16

Table 2 Parameter data types of the backward direction (dnnl_backward_data category)

Propagation Direction	src Data Type	Weight Data Type	dst Data Type	Bias Data Type
dnnl_backward_data	f32	f32	f32	f32
dnnl_backward_data	f16	f16	f16	f16
dnnl_backward_data	bf16	bf16	bf16	bf16
dnnl_backward_data	f32	f16	f16	f32
dnnl_backward_data	f32	bf16	bf16	f32

Table 3 Parameter data types of the backward direction (dnnl_backward_weights category)

Propagation Direction	src Data Type	Weight Data Type	dst Data Type	Bias Data Type
dnnl_backward_weights	f32	f32	f32	f32
dnnl_backward_weights	f16	f16	f16	f16
dnnl_backward_weights	bf16	bf16	bf16	bf16
dnnl_backward_weights	f16	f32	f16	f16
dnnl_backward_weights	bf16	f32	bf16	bf16

Data Layout

2D convolution is supported. The input and output tensor dimension is 4D. The layout of src, weights, and dst data needs to meet the following requirements:

Tensor Dimension	src Data Layout	Weight Data Layout	dst Data Layout
4D	dnnl_abcd	dnnl_abcd	dnnl_abcd

Parameter Constraints

Propagation Direction	Variable Name	Variable Description	Constraint
dnnl_forward_training dnnl_forward_inference	mb	batch	>=1
dnnl_forward_training dnnl_forward_inference	ic	input channel	>=1
dnnl_forward_training dnnl_forward_inference	ih	input height	>=1
dnnl_forward_training dnnl_forward_inference	iw	input width	>=1
dnnl_forward_training dnnl_forward_inference	oc	output channel	>=1
dnnl_forward_training dnnl_forward_inference	kh	kernel height	>=1
dnnl_forward_training dnnl_forward_inference	kw	kernel width	>=1
dnnl_forward_training dnnl_forward_inference	oh	output height	>=1
dnnl_forward_training dnnl_forward_inference	ow	output width	>=1
dnnl_forward_training dnnl_forward_inference	sh	height-wise stride	>=1
dnnl_forward_training dnnl_forward_inference	sw	width-wise stride	>=1
dnnl_forward_training dnnl_forward_inference	dh	height-wise dilation	>=0
dnnl_forward_training dnnl_forward_inference	dw	width-wise dilation	>=0
dnnl_forward_training dnnl_forward_inference	ph	height padding	>=0
dnnl_forward_training dnnl_forward_inference	pw	width padding	>=0
dnnl_forward_training dnnl_forward_inference	DKH	kernel height with dilation	DKH = 1 + (kh-1) x (dh+1)
dnnl_forward_training dnnl_forward_inference	DKW	kernel width with dilation	DKW = 1 + (kw-1) x (dw+1)
dnnl_backward_data	mb	batch	>=1
dnnl_backward_data	ic	input channel	>=1
dnnl_backward_data	ih	input height	>=1
dnnl_backward_data	iw	input width	>=1
dnnl_backward_data	oc	output channel	>=1
dnnl_backward_data	kh	kernel height	>=1
dnnl_backward_data	kw	kernel width	>=1
dnnl_backward_data	oh	output height	>=1
dnnl_backward_data	ow	output width	>=1
dnnl_backward_data	sh	height-wise stride	>=1
dnnl_backward_data	sw	width-wise stride	>=1
dnnl_backward_data	dh	height-wise dilation	>=0
dnnl_backward_data	dw	width-wise dilation	>=0
dnnl_backward_data	ph	height padding	0<=ph<=(kh-1) x (dh+1)
dnnl_backward_data	pw	width padding	0<=pw<=(kw-1) x (dw+1)
dnnl_backward_data	DKH	kernel height with dilation	DKH = 1 + (kh-1) x (dh+1)
dnnl_backward_data	DKW	kernel width with dilation	DKW = 1 + (kw-1) x (dw+1)
dnnl_backward_weights	mb	batch	>=1
dnnl_backward_weights	ic	input channel	>=1
dnnl_backward_weights	ih	input height	>=1
dnnl_backward_weights	iw	input width	>=1
dnnl_backward_weights	oc	output channel	>=1
dnnl_backward_weights	kh	kernel height	>=1
dnnl_backward_weights	kw	kernel width	>=1
dnnl_backward_weights	oh	output height	>=1
dnnl_backward_weights	ow	output width	>=1
dnnl_backward_weights	sh	height-wise stride	>=1
dnnl_backward_weights	sw	width-wise stride	>=1
dnnl_backward_weights	dh	height-wise dilation	>=0
dnnl_backward_weights	dw	width-wise dilation	>=0
dnnl_backward_weights	ph	height padding	>=0
dnnl_backward_weights	pw	width padding	>=0
dnnl_backward_weights	DKH	kernel height with dilation	DKH = 1 + (oh-1) x sh
dnnl_backward_weights	DKW	kernel width with dilation	DKW = 1 + (ow-1) x sw

Remarks for the dnnl_forward_training, dnnl_forward_inference, and dnnl_backward_data rows: `oh` and `ow` can be left blank and automatically deduced by benchdnn, or set by the user. The following requirements must be met, where [] represents rounding down to the nearest integer: oh = [(ih - DKH + 2 x ph) / sh] + 1 and ow = [(iw - DKW + 2 x pw) / sw] + 1.
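
The forward-direction constraints can be checked with a short Python sketch. The helper names `dilated_kernel` and `conv_output_size` are hypothetical, and the output-size relation used here is the conventional one for symmetric padding, stated as an assumption rather than as the exact rule benchdnn applies.

```python
def dilated_kernel(k, d):
    """Kernel extent with dilation: DK = 1 + (k - 1) x (d + 1); d = 0 means no dilation."""
    if k < 1 or d < 0:
        raise ValueError("kernel size must be >= 1 and dilation must be >= 0")
    return 1 + (k - 1) * (d + 1)


def conv_output_size(i, k, s=1, d=0, p=0):
    """Output extent along one axis, rounding the division down."""
    if i < 1 or s < 1 or p < 0:
        raise ValueError("input size and stride must be >= 1, padding must be >= 0")
    o = (i - dilated_kernel(k, d) + 2 * p) // s + 1
    if o < 1:
        raise ValueError("these parameters produce an empty output")
    return o
```

For example, a 224-wide input with a 7-wide kernel, stride 2, and padding 3 gives an output width of 112.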

Deconvolution

Function Description

Function

Performs deconvolution. Both forward and backward directions are supported.

Feature Scope

Data Types

Table 1 Parameter data types of the forward direction

Propagation Direction	src Data Type	Weight Data Type	dst Data Type	Bias Data Type
dnnl_forward_training dnnl_forward_inference	f32	f32	f32	f32
dnnl_forward_training dnnl_forward_inference	f16	f16	f16	f16
dnnl_forward_training dnnl_forward_inference	bf16	bf16	bf16	bf16
dnnl_forward_training dnnl_forward_inference	f16	f16	f32	f16
dnnl_forward_training dnnl_forward_inference	bf16	bf16	f32	bf16

Table 2 Parameter data types of the backward direction (dnnl_backward_data category)

Propagation Direction	src Data Type	Weight Data Type	dst Data Type	Bias Data Type
dnnl_backward_data	f32	f32	f32	f32
dnnl_backward_data	f16	f16	f16	f16
dnnl_backward_data	bf16	bf16	bf16	bf16
dnnl_backward_data	f32	f16	f16	f32
dnnl_backward_data	f32	bf16	bf16	f32

Table 3 Parameter data types of the backward direction (dnnl_backward_weights category)

Propagation Direction	src Data Type	Weight Data Type	dst Data Type	Bias Data Type
dnnl_backward_weights	f32	f32	f32	f32
dnnl_backward_weights	f16	f16	f16	f16
dnnl_backward_weights	bf16	bf16	bf16	bf16
dnnl_backward_weights	f16	f32	f16	f16
dnnl_backward_weights	bf16	f32	bf16	bf16

Data Layout

2D transposed convolution is supported. The input and output data are 4D tensors. The data layout of src, weights, and dst must meet the following requirements:

Tensor Dimension	src Data Layout	Weight Data Layout	dst Data Layout
4D Tensor	abcd	abcd	abcd

Parameter Constraints

Propagation Direction	Variable Name	Variable Description	Constraint
FWD_B, FWD_D, FWD_I	mb	batch	>=1
FWD_B, FWD_D, FWD_I	ic	input channel	>=1
FWD_B, FWD_D, FWD_I	ih	input height	>=1
FWD_B, FWD_D, FWD_I	iw	input width	>=1
FWD_B, FWD_D, FWD_I	oc	output channel	>=1
FWD_B, FWD_D, FWD_I	kh	kernel height	>=1
FWD_B, FWD_D, FWD_I	kw	kernel width	>=1
FWD_B, FWD_D, FWD_I	oh	output height	>=1
FWD_B, FWD_D, FWD_I	ow	output width	>=1
FWD_B, FWD_D, FWD_I	sh	height-wise stride	>=1
FWD_B, FWD_D, FWD_I	sw	width-wise stride	>=1
FWD_B, FWD_D, FWD_I	dh	height-wise dilation	>=0
FWD_B, FWD_D, FWD_I	dw	width-wise dilation	>=0
FWD_B, FWD_D, FWD_I	ph	height padding	0<=ph<=(kh-1) x (dh+1)
FWD_B, FWD_D, FWD_I	pw	width padding	0<=pw<=(kw-1) x (dw+1)
FWD_B, FWD_D, FWD_I	DKH	kernel height with dilation	DKH = 1 + (kh-1) x (dh+1)
FWD_B, FWD_D, FWD_I	DKW	kernel width with dilation	DKW = 1 + (kw-1) x (dw+1)
BWD_D	mb	batch	>=1
BWD_D	ic	input channel	>=1
BWD_D	ih	input height	>=1
BWD_D	iw	input width	>=1
BWD_D	oc	output channel	>=1
BWD_D	kh	kernel height	>=1
BWD_D	kw	kernel width	>=1
BWD_D	oh	output height	>=1
BWD_D	ow	output width	>=1
BWD_D	sh	height-wise stride	>=1
BWD_D	sw	width-wise stride	>=1
BWD_D	dh	height-wise dilation	>=0
BWD_D	dw	width-wise dilation	>=0
BWD_D	ph	height padding	0<=ph<=(kh-1) x (dh+1)
BWD_D	pw	width padding	0<=pw<=(kw-1) x (dw+1)
BWD_D	DKH	kernel height with dilation	DKH = 1 + (kh-1) x (dh+1)
BWD_D	DKW	kernel width with dilation	DKW = 1 + (kw-1) x (dw+1)
BWD_W, BWD_WB	mb	batch	>=1
BWD_W, BWD_WB	ic	input channel	>=1
BWD_W, BWD_WB	ih	input height	>=1
BWD_W, BWD_WB	iw	input width	>=1
BWD_W, BWD_WB	oc	output channel	>=1
BWD_W, BWD_WB	kh	kernel height	>=1
BWD_W, BWD_WB	kw	kernel width	>=1
BWD_W, BWD_WB	oh	output height	>=1
BWD_W, BWD_WB	ow	output width	>=1
BWD_W, BWD_WB	sh	height-wise stride	>=1
BWD_W, BWD_WB	sw	width-wise stride	>=1
BWD_W, BWD_WB	dh	height-wise dilation	>=0
BWD_W, BWD_WB	dw	width-wise dilation	>=0
BWD_W, BWD_WB	ph	height padding	0<=ph<=(kh-1) x (dh+1)
BWD_W, BWD_WB	pw	width padding	0<=pw<=(kw-1) x (dw+1)
BWD_W, BWD_WB	DKH	kernel height with dilation	DKH = 1 + (ih-1) x (dh+1)
BWD_W, BWD_WB	DKW	kernel width with dilation	DKW = 1 + (iw-1) x (dw+1)

Remarks for all rows: [] represents rounding down to the nearest integer.

Concat

Function Description

Function

Concatenates N tensors along the dimension specified by concat_dimension (denoted by C).

Formula

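dst(ou, c, in) = src_i(ou, c', in)

Wherein: c = C_1 + C_2 + … + C_(i−1) + c', and C_j is the size of the jth input tensor along the concatenation dimension.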

Table 1 Formula parameters

Parameter	Description
src	Input tensor
dst	Target tensor
ou	The outermost dimension
in	The innermost dimension
c	Dimension along which the tensors are concatenated

The Concat primitive does not distinguish between forward and backward propagation.
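
The relationship between an index on the concatenated axis of dst and the corresponding source tensor can be sketched as follows. The function name `concat_source` and the zero-based indexing are assumptions made for this illustration.

```python
def concat_source(c, sizes):
    """Map index c on the concatenated axis of dst to (source tensor index, local index).

    sizes holds the length of each source tensor along the concatenation axis,
    in concatenation order.
    """
    offset = 0
    for i, size in enumerate(sizes):
        if c < offset + size:
            return i, c - offset
        offset += size
    raise IndexError("index is outside the concatenated axis")
```

With two sources of lengths 2 and 3 along the axis, dst index 2 maps to the first element of the second source.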

Feature Scope

Data Types

Table 1 Supported data type combinations (input and output data types being the same)

src1 Data Type	src2 Data Type	...	srcN Data Type	dst Data Type
f32	f32	f32	f32	f32
f16	f16	f16	f16	f16
bf16	bf16	bf16	bf16	bf16
s32	s32	s32	s32	s32
s8	s8	s8	s8	s8
u8	u8	u8	u8	u8

Data Layout

A maximum of five dimensions is supported. Input tensors must have the same number of dimensions, and the size of every dimension except the concatenation dimension must be identical. The following data layout formats are supported, and the input and output tensors must use the same layout.

Tensor Dimension	src₁/…/src_n/dst
1D Tensor	a
2D Tensor	ab, ba
3D Tensor	abc, acb, bac, bca, cab, cba
4D Tensor	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab
5D Tensor	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

Parameter Constraints

Field	Description	Value Range
--sdt	src data type	f32 f16 bf16 s32 s8 u8
--ddt	dst data type	f32 f16 bf16 s32 s8 u8
--stag	src data layout	a ab ba abc acb bac bca cab cba abcd abdc acbd acdb adbc adcb bacd bcda cdab cdba dcab abcde abced abdec acbde acdeb adecb bacde bcdea cdeab cdeba decab
--dtag	dst data layout	a ab ba abc acb bac bca cab cba abcd abdc acbd acdb adbc adcb bacd bcda cdab cdba dcab abcde abced abdec acbde acdeb adecb bacde bcdea cdeab cdeba decab
--axis	Concatenation direction	[0, dim_num - 1]
[problem dim]	src0 shape: src1 shape	N1xN11xN3xN4xN5:N1xN12xN3xN4xN5. All dimensions except the concatenation dimension must have the same length.

Concat requires that the memory layout of input and output tensors be the same and the corresponding data types be the same.

Resampling

Function Description

Function

Performs resampling operations on the input tensor. This operator supports two interpolation algorithms: nearest neighbor and linear.

Formula

The nearest neighbor interpolation algorithm is dst(n, c, oh, ow) = src(n, c, ih, iw), where:
- ih = [(oh + 0.5) / F_h − 0.5]
- iw = [(ow + 0.5) / F_w − 0.5]

The mathematical formula for bilinear sampling is dst(n, c, oh, ow) = src(n, c, ih₀, iw₀) · (1 − W_ih) · (1 − W_iw) + src(n, c, ih₁, iw₀) · W_ih · (1 − W_iw) + src(n, c, ih₀, iw₁) · (1 − W_ih) · W_iw + src(n, c, ih₁, iw₁) · W_ih · W_iw, where:
- ih₀ = ⌊(oh + 0.5) / F_h − 0.5⌋
- ih₁ = ⌈(oh + 0.5) / F_h − 0.5⌉
- iw₀ = ⌊(ow + 0.5) / F_w − 0.5⌋
- iw₁ = ⌈(ow + 0.5) / F_w − 0.5⌉
- W_ih = (oh + 0.5) / F_h − 0.5 − ih₀
- W_iw = (ow + 0.5) / F_w − 0.5 − iw₀

Table 1 Formula parameters

Parameter	Description
src	Input tensor
dst	Target tensor
n	Dimension 1 to be sampled
c	Dimension 2 to be sampled
ih	Input height
iw	Input width
oh	Output height
ow	Output width
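
The index calculations can be illustrated along a single axis with the following Python sketch. Several points are assumptions rather than documented KDNN behavior: the scale factor F is taken as output size divided by input size, [] in the nearest neighbor formula is read as rounding to the nearest integer, and source indexes are clamped to the valid range at the borders. The function names are hypothetical.

```python
import math


def nearest_resample_1d(src, out_size):
    """Nearest neighbor resampling along one axis."""
    in_size = len(src)
    f = out_size / in_size  # scale factor F (assumption: out / in)
    dst = []
    for o in range(out_size):
        x = (o + 0.5) / f - 0.5
        i = math.floor(x + 0.5)  # round to the nearest index
        dst.append(src[min(max(i, 0), in_size - 1)])
    return dst


def linear_resample_1d(src, out_size):
    """Linear resampling along one axis, blending the two neighboring samples."""
    in_size = len(src)
    f = out_size / in_size
    dst = []
    for o in range(out_size):
        x = (o + 0.5) / f - 0.5
        lo = math.floor(x)
        w = x - lo  # weight of the upper neighbor
        i0 = min(max(lo, 0), in_size - 1)            # clamped lower neighbor
        i1 = min(max(math.ceil(x), 0), in_size - 1)  # clamped upper neighbor
        dst.append(src[i0] * (1 - w) + src[i1] * w)
    return dst
```

The bilinear case applies the same blend along both the height and width axes.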

Feature Scope

Data Types

FWD_D and BWD_D support arbitrary combination of the f32, f16, and bf16 data types.

Propagation Direction	src Data Type	dst Data Type	diff_dst Data Type	diff_src Data Type
FWD_D, BWD_D	f32	f32	f32	f32
FWD_D, BWD_D	f16	f16	f16	f16
FWD_D, BWD_D	bf16	bf16	bf16	bf16

Data Layout

Three to five dimensions are supported. The following data layout formats are supported, and the input and output tensors must use the same layout.

Tensor Dimension	Tag (src/dst/diff_dst/diff_src)
3D Tensor	abc, acb
4D Tensor	abcd, acdb
5D Tensor	abcde, acdeb

Parameter Constraints

Field	Value
--dir	FWD_D [default], BWD_D
--sdt	f32 [default], f16, bf16, s32, s8, u8
--ddt	f32 [default], f16, bf16, s32, s8, u8
--alg	nearest [default], linear
--tag	axb [default], abx

Resampling requires that the memory layout of the input and output tensors be the same, but the dimensions can be different.

For details, see the following cases:

5D: mb4_ic8_id4od8_ih4oh8_iw4ow8, 4x8x4x4x4 (input), 4x8x8x8x8 (output)
4D: mb4_ic8_ih4oh8_iw4ow8, 4x8x4x4 (input), 4x8x8x8 (output)
3D: mb4_ic8_iw4ow8, 4x8x4 (input), 4x8x8 (output)

Note that if the value of id, ih, iw, od, oh, or ow is too large, precision may be affected due to a limitation of the test system. This is not a functional problem; other platforms show the same behavior. To avoid it, set these parameters to values less than 20000.

Shuffle

Function Description

Function

Shuffles tensor data along a shuffle axis (dimension).

Formula

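The formula is dst(ou, c, in) = src(ou, c', in), where c' and c relate through the equations c = u + v · (C / G) and c' = u · G + v. In the formula, 0 ≤ u < C / G and 0 ≤ v < G, where C is the size of the shuffle axis and G is the group size.

Table 1 Formula parameters

Parameter	Description
src	Input tensor
dst	Target tensor
c	Index along the shuffle axis
ou	The outermost indexes
in	The innermost indexes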
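
The effect of the shuffle on a single axis can be sketched as follows. The sketch assumes the usual group-shuffle index mapping, in which an axis of size C is viewed as a (C / G) x G array and transposed, so that dst index u + v · (C / G) takes its value from src index u · G + v. The function name `shuffle_axis` is hypothetical.

```python
def shuffle_axis(src, group_size):
    """Shuffle a 1D list (one slice along the shuffle axis) with the given group size."""
    c_size = len(src)
    if group_size < 1 or c_size % group_size != 0:
        raise ValueError("group size must be >= 1 and divide the axis size")
    rows = c_size // group_size  # C / G
    dst = [None] * c_size
    for u in range(rows):
        for v in range(group_size):
            dst[u + v * rows] = src[u * group_size + v]
    return dst
```

For an axis of size 6 and a group size of 2, the elements at even source positions come first, followed by those at odd positions.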

Feature Scope

Data Types

Supported data types are as follows. The src and dst data types must be the same.

Propagation Direction	src Data Type	dst Data Type
FWD_D, BWD_D	f32	f32
FWD_D, BWD_D	f16	f16
FWD_D, BWD_D	bf16	bf16
FWD_D, BWD_D	s32	s32
FWD_D, BWD_D	s8	s8
FWD_D, BWD_D	u8	u8

Data Layout

One to five dimensions are supported. The following data layout formats are supported, and the input and output tensors must use the same layout.

Tensor Dimension	src Data Layout	dst Data Layout
1D Tensor	a	a
2D Tensor	ab, ba	ab, ba
3D Tensor	abc, acb, bac, bca, cab, cba	abc, acb, bac, bca, cab, cba
4D Tensor	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab
5D Tensor	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

Parameter Constraints

Field	Description	Value Range
--dir	Propagation direction	FWD_D (default value), BWD_D
--dt	src or dst data type	f32 (default value), f16, bf16, s32, s8, u8
--tag	src or dst data layout	a, ab, ba, abc, acb, bac, bca, cab, cba, abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab, abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab
--axis	Index of the shuffle axis	[0, number of tensor dimensions – 1]
--group	Group size	Integer greater than or equal to 1 that exactly divides the size of the shuffle axis
[shuffle_desc]	src or dst tensor dimensions	N1xN2xN3…xN5

Reorder

Function Description

Function

Reorders tensors into arbitrary memory layout formats and data types.

Formula

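The formula is as follows:

$$dst(\overline{x}) = src(\overline{x})$$

where $\overline{x}$ is the logical index of an element. Only the physical memory layout and the data type change.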

Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type	dst Data Type
f32	f32
f16	f16
bf16	bf16
s32	s32
s8	s8
u8	u8

Data Layout

One to five dimensions are supported. The following data layout formats are supported.

Tensor Dimension	src Data Layout	dst Data Layout
1D Tensor	a	a
2D Tensor	ab, ba	ab, ba
3D Tensor	abc, acb, bac, bca, cab, cba	abc, acb, bac, bca, cab, cba
4D Tensor	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab
5D Tensor	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab
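
To make the layout tags concrete, the following Python sketch reorders a flat buffer between two dense layouts. It assumes that each letter names a logical dimension ('a' is dimension 0) and that the letters are listed from the outermost to the innermost position in memory. The helper names strides_for_tag and reorder are hypothetical and do not belong to KDNN.

```python
from itertools import product

def strides_for_tag(tag, dims):
    """Return the stride of each logical dimension for a dense layout tag."""
    strides = [0] * len(dims)
    step = 1
    # Walk from the innermost letter outwards, accumulating the element step.
    for letter in reversed(tag):
        axis = ord(letter) - ord("a")
        strides[axis] = step
        step *= dims[axis]
    return strides

def reorder(src_buf, dims, src_tag, dst_tag):
    """Copy a flat buffer from src_tag layout to dst_tag layout."""
    src_strides = strides_for_tag(src_tag, dims)
    dst_strides = strides_for_tag(dst_tag, dims)
    dst_buf = [None] * len(src_buf)
    # Every logical index keeps its value; only the memory offset changes.
    for index in product(*(range(d) for d in dims)):
        src_off = sum(i * s for i, s in zip(index, src_strides))
        dst_off = sum(i * s for i, s in zip(index, dst_strides))
        dst_buf[dst_off] = src_buf[src_off]
    return dst_buf

print(reorder([0, 1, 2, 3, 4, 5], (2, 3), "ab", "ba"))  # [0, 3, 1, 4, 2, 5]
```

Under the same assumption, the tag acdb for dimensions (2, 3, 4, 5) yields the strides [60, 1, 15, 3], that is, dimension b is the innermost one.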

Pool

Function Description

Function

Implements pooling operations (maximum and average) that reduce the spatial size of a tensor while preserving key features.

Feature Scope

Data Types

Propagation Direction	src Data Type	dst Data Type	alg Type
forward/backward	f32	f32	max, avg_p, avg_np
forward/backward	f16	f16	max, avg_p, avg_np
forward/backward	bf16	bf16	max, avg_p, avg_np
forward	s8	s8	max, avg_p, avg_np
forward	u8	u8	max, avg_p, avg_np
forward	s32	s32	max, avg_p, avg_np
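
The three alg types can be compared with a small 1D sketch in Python. It assumes that avg_p includes padding elements in the divisor and that avg_np excludes them. The function name pool1d and its parameters are hypothetical and serve only as an illustration.

```python
def pool1d(src, kernel, stride, pad, alg):
    """1D pooling over a list; alg is 'max', 'avg_p' or 'avg_np'."""
    out_len = (len(src) + 2 * pad - kernel) // stride + 1
    dst = []
    for o in range(out_len):
        start = o * stride - pad
        # Keep only the window positions that fall inside the input.
        window = [src[i] for i in range(start, start + kernel) if 0 <= i < len(src)]
        if alg == "max":
            dst.append(max(window))
        elif alg == "avg_p":   # assumed: padding counts toward the divisor
            dst.append(sum(window) / kernel)
        elif alg == "avg_np":  # assumed: padding is excluded from the divisor
            dst.append(sum(window) / len(window))
        else:
            raise ValueError(f"unknown alg: {alg}")
    return dst

x = [1.0, 2.0, 3.0, 4.0]
print(pool1d(x, kernel=3, stride=2, pad=1, alg="max"))     # [2.0, 4.0]
print(pool1d(x, kernel=3, stride=2, pad=1, alg="avg_p"))   # [1.0, 3.0]
print(pool1d(x, kernel=3, stride=2, pad=1, alg="avg_np"))  # [1.5, 3.0]
```

The two average variants differ only for windows that overlap the padding, as the first output element shows.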

Data Layout

One to five dimensions are supported. The following data layout formats are supported.

Tensor Dimension	src Data Layout	dst Data Layout
1D Tensor	a	a
2D Tensor	ab, ba	ab, ba
3D Tensor	abc, acb, bac, bca, cab, cba	abc, acb, bac, bca, cab, cba
4D Tensor	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab
5D Tensor	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

Batch Normalization (bnormal)

Function Description

Function

Performs batch normalization on tensors.

Formula

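The formula is $dst(n, c, h, w) = \gamma(c) \cdot \frac{src(n, c, h, w) - \mu(c)}{\sqrt{\sigma^2(c) + \varepsilon}} + \beta(c)$.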

For details, see parameter description in Layer Normalization.

Feature Scope

Data Types

Propagation Direction	src/dst Data Type
forward/backward	f32, bf16, f16

Data Layout

Three to five dimensions are supported. The following data layout formats are supported. The input and output tensors must use the same layout.

Tensor Dimension	src Data Layout	dst Data Layout
3D Tensor	abc, acb, bac, bca, cab, cba	abc, acb, bac, bca, cab, cba
4D Tensor	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab
5D Tensor	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

Local Response Normalization (lrn)

Function Description

Function

Performs local response normalization.

Formula

The cross-channel formula is as follows:

The single-channel formula is as follows:

Table 1 Formula parameters

Parameter	Description
dst	Target tensor
src	Input tensor
k	Local constant
a	Response constant
-β	Constant, used to improve numerical stability
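
As a rough illustration of cross-channel normalization, the following Python sketch processes the channel values at one spatial position. It assumes the commonly used form dst = src · (k + (α/n)·Σ src²)^(−β), where the sum runs over a window of n neighboring channels. The window size n (local_size), the α/n scaling, and the default values are assumptions and may differ from the KDNN implementation.

```python
def lrn_across_channels(src, local_size, k=1.0, alpha=1e-4, beta=0.75):
    """Normalize a list of channel values taken at one spatial position."""
    half = (local_size - 1) // 2
    dst = []
    for c, value in enumerate(src):
        # Window of neighboring channels, clipped at the tensor borders.
        lo = max(0, c - half)
        hi = min(len(src), c + (local_size + 1) // 2)
        sq_sum = sum(x * x for x in src[lo:hi])
        dst.append(value * (k + alpha / local_size * sq_sum) ** (-beta))
    return dst

print(lrn_across_channels([1.0, 2.0, 3.0], local_size=3, k=1.0, alpha=3.0, beta=1.0))
```

With these inputs the scaling factors are 1/6, 1/15, and 1/14, because the windows cover the squared sums 5, 14, and 13.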

Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type	dst Data Type
f32	f32
f16	f16
bf16	bf16

Data Layout

Three to five dimensions are supported. The following data layout formats are supported.

Tensor Dimension	src Data Layout	dst Data Layout
3D Tensor	abc, acb, bac, bca, cab, cba	abc, acb, bac, bca, cab, cba
4D Tensor	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab
5D Tensor	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

Reduction

Function Description

Function

Applies the specified reduction operation to the elements of a tensor along one or more dimensions.

Formula

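The formula is $dst(f) = \underset{r}{reduce\_op}\; src(f, r)$, where f indexes the dimensions that are kept, r indexes the dimensions that are reduced, and reduce_op includes the operations listed in Table 1 reduce_op algorithm operations.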

Table 1 reduce_op algorithm operations

reduce_op	Function
max	Obtains the maximum value of a tensor along the reduction dimension.
min	Obtains the minimum value of a tensor along the reduction dimension.
sum	Obtains the sum of elements in a tensor along the reduction dimension.
mul	Obtains the product of elements in a tensor along the reduction dimension.
mean	Obtains the average value of elements in a tensor along the reduction dimension.

Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type	dst Data Type
f32	f32
f16	f16
bf16	bf16

Data Layout

One to five dimensions are supported. The following data layout formats are supported.

Tensor Dimension	src Data Layout	dst Data Layout
1D Tensor	a	a
2D Tensor	ab	ab
3D Tensor	abc	abc
4D Tensor	abcd	abcd
5D Tensor	abcde	abcde

Parameter Constraints

Each dimension being reduced must have size 1 in the dst tensor. Examples:

A 5D src tensor (5×6×7×8×9) paired with a 5D dst tensor (1×1×1×1×1) indicates reduction across all dimensions (dimensions 1 through 5).
A 5D src tensor (5×6×7×8×9) paired with a 5D dst tensor (5×6×7×8×1) indicates reduction only along the innermost dimension (dimension 5).
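
The rule can be sketched in Python as follows. The helper reduced_axes derives the reduced dimensions (0-based) from the two shapes, and reduce_2d applies a reduce_op to a small 2D nested list. Both names are hypothetical, and the code is an illustration rather than the KDNN implementation.

```python
from functools import reduce
import operator

def reduced_axes(src_dims, dst_dims):
    """Axes whose size is 1 in dst but not in src are the reduced ones."""
    return [i for i, (s, d) in enumerate(zip(src_dims, dst_dims)) if d == 1 and s != 1]

def reduce_2d(src, dst_dims, op):
    """Reduce a 2D nested list to shape dst_dims with max, min, sum, mul or mean."""
    rows, cols = len(src), len(src[0])
    fold = {"max": max, "min": min, "sum": operator.add,
            "mul": operator.mul, "mean": operator.add}[op]
    dst = []
    for r in range(dst_dims[0]):
        dst_row = []
        for c in range(dst_dims[1]):
            # A dst dimension of size 1 collapses the whole src dimension.
            row_idx = range(rows) if dst_dims[0] == 1 and rows != 1 else [r]
            col_idx = range(cols) if dst_dims[1] == 1 and cols != 1 else [c]
            values = [src[i][j] for i in row_idx for j in col_idx]
            result = reduce(fold, values)
            dst_row.append(result / len(values) if op == "mean" else result)
        dst.append(dst_row)
    return dst

print(reduced_axes((5, 6, 7, 8, 9), (5, 6, 7, 8, 1)))  # [4]
print(reduce_2d([[1, 2, 3], [4, 5, 6]], (2, 1), "sum"))  # [[6], [15]]
```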

PReLU

Function Description

Function

Performs the parametric ReLU (PReLU) operation, a variant of the rectified linear unit (ReLU) activation function with a trainable slope for negative inputs.

Formula

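The forward formula is as follows:

$$dst = \begin{cases} src, & src > 0 \\ \alpha \cdot src, & src \le 0 \end{cases}$$

The backward formula is as follows:

$$diff\_src = \begin{cases} diff\_dst, & src > 0 \\ \alpha \cdot diff\_dst, & src \le 0 \end{cases} \qquad diff\_\alpha = \sum \min(src, 0) \cdot diff\_dst$$

where α is the trainable slope parameter, and the sum runs over the elements that share the same α.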
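
As an illustration only, the following Python sketch applies PReLU to a flat list. It assumes the usual definition (dst = src for src > 0 and α·src otherwise) and a single scalar α shared by all elements. The function names are hypothetical.

```python
def prelu_forward(src, alpha):
    """Forward pass: positive inputs pass through, others are scaled by alpha."""
    return [s if s > 0 else alpha * s for s in src]

def prelu_backward(src, diff_dst, alpha):
    """Backward pass: gradients with respect to the input and to alpha."""
    diff_src = [dd if s > 0 else alpha * dd for s, dd in zip(src, diff_dst)]
    # Only non-positive inputs contribute to the gradient of alpha.
    diff_alpha = sum(min(s, 0.0) * dd for s, dd in zip(src, diff_dst))
    return diff_src, diff_alpha

print(prelu_forward([-2.0, 0.5], alpha=0.25))               # [-0.5, 0.5]
print(prelu_backward([-2.0, 0.5], [1.0, 1.0], alpha=0.25))  # ([0.25, 1.0], -2.0)
```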

Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type	dst Data Type
f32	f32
f16	f16
bf16	bf16
s32	s32
s8	s8
u8	u8

Data Layout

One to five dimensions are supported. The following data layout formats are supported.

Tensor Dimension	src Data Layout	dst Data Layout
1D Tensor	a	a
2D Tensor	ab, ba	ab, ba
3D Tensor	abc, acb, bac, bca, cab, cba	abc, acb, bac, bca, cab, cba
4D Tensor	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab
5D Tensor	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

Binary

Function Description

Function

Returns the element-wise operation results between tensors source0 and source1, with support for reordering to arbitrary layouts and conversion to arbitrary data types.

Formula

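The formula is $dst(\overline{x}) = src_0(\overline{x}) \mathbin{op} src_1(\overline{x})$, where op is one of the operations listed in Table 1 Binary operations.

Table 1 Binary operations

op	Function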
add	Addition
minus	Subtraction
multiply	Multiplication
div	Division
gt	Greater than
ge	Greater than or equal to
lt	Less than
le	Less than or equal to
ne	Not equal to

The binary operator does not distinguish between forward and backward propagation.
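
The operations in the table can be sketched in Python as follows. The sketch works on flat lists of equal length and assumes that comparison operations produce 1.0 for true and 0.0 for false. The names BINARY_OPS and binary are hypothetical.

```python
import operator

# Map the operation names from the table to Python operators.
BINARY_OPS = {
    "add": operator.add,
    "minus": operator.sub,
    "multiply": operator.mul,
    "div": operator.truediv,
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
    "ne": operator.ne,
}

def binary(op, src0, src1):
    """Apply op element by element; results are returned as floats."""
    fn = BINARY_OPS[op]
    return [float(fn(a, b)) for a, b in zip(src0, src1)]

print(binary("add", [1, 2], [3, 4]))  # [4.0, 6.0]
print(binary("gt", [1, 5], [3, 4]))   # [0.0, 1.0]
```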

Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type	dst Data Type
f32	f32
f16	f16
bf16	bf16
s32	s32
s8	s8
u8	u8

Data Layout

One to five dimensions are supported. The following data layout formats are supported, and the input and output tensors must use the same layout.

Tensor Dimension	src0/src1 Data Layout	dst Data Layout
1D Tensor	a	a
2D Tensor	ab, ba	ab, ba
3D Tensor	abc, acb, bac, bca, cab, cba	abc, acb, bac, bca, cab, cba
4D Tensor	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab	abcd, abdc, acbd, acdb, adbc, adcb, bacd, bcda, cdab, cdba, dcab
5D Tensor	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab	abcde, abced, abdec, acbde, acdeb, adecb, bacde, bcdea, cdeab, cdeba, decab

RNN

Function Description

Function

Processes sequential or time-series data to train machine learning models that can generate sequential predictions or derive conclusions from sequence-based inputs.

Formula

The RNN formula is as follows:

Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type	dst Data Type
f32	f32
bf16	bf16

Data Layout

The data layout is fixed.

Input Data	Dimension	Layout
src_layer	3	{time_step, batch, slc}
src_iter	4	{layer_num, dir, batch, sic}
weight_layer	5	{layer_num, dir, slc, gates, dhc}
weight_iter	5	{layer_num, dir, sic, gates, dic}
dst_layer	3	{time_step, batch, dhc}
dst_iter	4	{layer_num, dir, batch, dic}

Group Normalization

Function Description

Function

Performs group normalization by channel.

Formula

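The formula is as follows:

$$dst(n, g \cdot C_G + c_g, h, w) = \gamma(g \cdot C_G + c_g) \cdot \frac{src(n, g \cdot C_G + c_g, h, w) - \mu(n, g)}{\sqrt{\sigma^2(n, g) + \varepsilon}} + \beta(g \cdot C_G + c_g)$$

where the shape of the input src is defined by (N, C, H, W), G indicates the number of groups, $C_G = C / G$ is the number of channels per group, $c_g \in [0, C_G)$ is the channel index within group g, and $\varepsilon$ is a small constant that improves numerical stability.

$\gamma(c), \beta(c)$: scaling and shift coefficients
$\mu(n, g), \sigma^2(n, g)$: mean and variance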
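
As a minimal sketch of the computation, the following Python function normalizes a small tensor stored as nested lists, where src[n][c] holds the flattened spatial values of one channel. It assumes that mean and variance are computed per sample and per group, and that C is divisible by G. The function name group_norm and the eps default are hypothetical.

```python
import math

def group_norm(src, groups, gamma, beta, eps=1e-5):
    """src[n][c] is a list of spatial values; returns a tensor of the same shape."""
    channels = len(src[0])
    per_group = channels // groups  # channels per group, C / G
    dst = []
    for sample in src:
        out = [None] * channels
        for g in range(groups):
            chans = range(g * per_group, (g + 1) * per_group)
            # Statistics are shared by all channels of the group.
            values = [x for c in chans for x in sample[c]]
            mean = sum(values) / len(values)
            var = sum((x - mean) ** 2 for x in values) / len(values)
            inv_std = 1.0 / math.sqrt(var + eps)
            for c in chans:
                out[c] = [gamma[c] * (x - mean) * inv_std + beta[c] for x in sample[c]]
        dst.append(out)
    return dst

src = [[[1.0, 3.0], [5.0, 7.0], [2.0, 2.0], [4.0, 4.0]]]  # N=1, C=4, 2 spatial values
print(group_norm(src, groups=2, gamma=[1.0] * 4, beta=[0.0] * 4, eps=0.0)[0][2:])
```

With eps set to 0, the second group (values 2, 2, 4, 4) has mean 3 and variance 1, so its two channels map to -1 and 1.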

Feature Scope

Data Types

Table 1 Supported parameter data types

src Data Type	dst Data Type
f32	f32
f16	f16
bf16	bf16
s8	s8
u8	u8

The data types of mean, variance, scale, and shift are independent of src and dst, and remain f32.

Data Layout

Two to five dimensions are supported. The following data layout formats are supported. The input and output tensors must use the same layout.

Tensor Dimension	src Data Layout	dst Data Layout
2D Tensor	ab	ab
3D Tensor	abc, acb	abc, acb
4D Tensor	abcd, acdb	abcd, acdb
5D Tensor	abcde, acdeb	abcde, acdeb

SparseGemm

Function Description

Function

Computes the product of a sparse matrix and a dense matrix. The operator is built on the compressed sparse row (CSR) storage format and skips zero blocks during loading and computation, which makes efficient use of compute resources and memory bandwidth. The core compute kernel uses SIMD and is optimized for the Kunpeng platform.

Formula

The SparseGemm operator computes the following matrix multiplication:

$$C = \alpha \cdot A \cdot B + \beta \cdot C$$

where,

A: sparse matrix
B: dense matrix
C: output matrix
α and β: optional scaling coefficients
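
The following Python sketch shows how a CSR representation lets the computation skip zero entries. It assumes the common GEMM form α·A·B + β·C and element-wise CSR arrays (values, col_idx, row_ptr). The function name csr_gemm is hypothetical, and the sketch does not reflect the blocked, SIMD-optimized KDNN kernel.

```python
def csr_gemm(values, col_idx, row_ptr, b, c, alpha=1.0, beta=0.0):
    """Return alpha * A @ B + beta * C with A in CSR form and B, C as nested lists."""
    n_cols = len(b[0])
    out = []
    for row in range(len(row_ptr) - 1):
        acc = [0.0] * n_cols
        # Only the stored (non-zero) entries of this row are visited.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            a_val, b_row = values[k], b[col_idx[k]]
            for j in range(n_cols):
                acc[j] += a_val * b_row[j]
        out.append([alpha * acc[j] + beta * c[row][j] for j in range(n_cols)])
    return out

# A = [[1, 0, 2], [0, 0, 3]] stored in CSR form.
values, col_idx, row_ptr = [1.0, 2.0, 3.0], [0, 2, 2], [0, 2, 3]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
print(csr_gemm(values, col_idx, row_ptr, b, c, alpha=2.0, beta=1.0))  # [[23.0, 29.0], [31.0, 37.0]]
```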

Feature Scope

Data Types

Table 1 Supported parameter data types

A Data Type	B Data Type	C Data Type
f32	f32	f32

Data Layout

Tensor Dimension	A Data Layout	B Data Layout	C Data Layout
2D	Layout::AB	Layout::AB	Layout::AB

KDNN_EXT Operators

Operator Description

KDNN_EXT is an extension module of KDNN. It has the following features:

Easy-to-use interfaces: Python interfaces are provided through Cython, which suits typical user scenarios.
High performance: The underlying implementation is written in C.

The following operators are available:

random_choice
softmax

Operator Definition

softmax

Softmax is a common activation function used in multi-class classification problems. It converts a set of arbitrary real numbers into a probability distribution in which each output value ranges from 0 to 1 and all output values sum to 1.

The main features are as follows:

Normalized output: The softmax function normalizes the input so that the output is a valid probability distribution. For any real-valued input, the outputs sum to 1. It is commonly used at the output layer in multi-class classification problems.
Non-linear: The softmax function is non-linear. It applies a non-linear transformation to the input, which increases the representation capability of the model and helps it fit complex data patterns.
Translation invariance: The softmax function is invariant to translation of the input. That is, adding the same constant to (or subtracting it from) every element of the input vector does not change the softmax output.
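
The normalization and translation-invariance properties can be checked with a small pure-Python reference. The function softmax_reference is hypothetical and is not the libkdnn_ext implementation. It subtracts the maximum before exponentiating, a common technique that relies on translation invariance to avoid overflow.

```python
import math

def softmax_reference(values):
    """Reference softmax for a 1D list of real numbers."""
    shift = max(values)  # subtracting a constant leaves the result unchanged
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

p = softmax_reference([1.0, 2.0, 3.0])
q = softmax_reference([101.0, 102.0, 103.0])  # same input shifted by 100
print(abs(sum(p) - 1.0) < 1e-12)                       # True
print(all(abs(a - b) < 1e-12 for a, b in zip(p, q)))   # True
```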

In a neural network, the softmax function is usually used at the output layer to convert the original output of the neural network into a vector representing class probabilities. During training, the difference between the softmax output and the actual label can be used as a loss function. Through backward propagation, network parameters are updated to minimize the loss and improve model performance.

Interface Definition

def softmax(arr: np.ndarray) -> np.ndarray

Receives a 1D or 2D NumPy array and returns the result of softmax calculation.

Input Parameters

Parameter	Type	Description
arr	ndarray	The elements are real numbers in FP32, and the dimension can be 1D or 2D.

Return Value

Type	Description
ndarray	The shape is the same as the input.

Example

>>> import numpy as np
>>> from libkdnn_ext import softmax
>>> x = np.random.rand(1, 5).astype(np.float32)
>>> softmax(x)
array([[0.19810137, 0.21171768, 0.16419397, 0.24222486, 0.1837621 ]], dtype=float32)

random_choice

random_choice is an algorithm used to randomly select elements from a set by probability. In computer science, random selection is a common operation used in scenarios such as random sampling, random arrangement, and Monte Carlo simulation.

The core of the random selection algorithm is to randomly select an element from a given set. Given an input array of probabilities that sum to 1, it selects one element at random according to those probabilities and returns the index of that element.
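
One common way to implement such a selection is inverse-CDF sampling, sketched below in pure Python. The function random_choice_reference is hypothetical and is not the libkdnn_ext implementation. It only mirrors the documented seed rule (a seed of -1 derives the seed from the current time) and returns a one-element list.

```python
import bisect
import itertools
import random
import time

def random_choice_reference(probs, seed):
    """Pick one index according to probs using inverse-CDF sampling."""
    # A seed of -1 means "derive a seed from the clock"; otherwise use the given seed.
    rng = random.Random(time.time_ns() if seed == -1 else seed)
    cumulative = list(itertools.accumulate(probs))
    draw = rng.random() * cumulative[-1]
    # First index whose cumulative probability exceeds the draw.
    index = bisect.bisect_right(cumulative, draw)
    return [min(index, len(probs) - 1)]

print(random_choice_reference([0.0, 1.0, 0.0], seed=7))  # [1]
```

Because the seed fully determines the draw, two calls with the same seed other than -1 return the same index.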

Interface Definition

def random_choice(arr: np.ndarray, seed: int) -> List[int]

Receives a NumPy array and a random seed, and returns the index of the randomly selected element.

Input Parameters

Parameter	Type	Description
arr	ndarray	The data type is FP32, and the dimension can be 1D or 2D. When the dimension is 2D, the shape must be (1, N) or (N, 1).
seed	int	Random seed. If `seed=-1`, the random seed is generated by the system based on the current timestamp. If `seed!=-1`, the random seed is the input value.

Return Value

Type	Description
List[int]	A list of length 1 that contains the index of the selected element. If an exception occurs, [-1] is returned.

Example

>>> import numpy as np
>>> from libkdnn_ext import random_choice
>>> a = np.random.rand(1, 70336).astype(np.float32)
>>> a = np.abs(a)
>>> t = a.sum(axis=1)
>>> a = a / t
>>> random_choice(a, -1)
array([17630], dtype=int32)
>>> random_choice(a, 2)
array([49333], dtype=int32)

Obtaining Version Information

Obtains the KDNN_EXT product version information.

Interface Definition

def get_version() -> Dict[bytes, bytes]

Return Value

Type	Description
Dict[bytes, bytes]	The product version information is returned. If an exception occurs, `{}` is returned.

Example

>>> from libkdnn_ext import get_version
>>> get_version()
{'productName': b'Kunpeng Boostkit', 'productVersion': b'24.0.0', 'componentName': b'BoostKit-kail', 'componentVersion': b'1.0.0', 'componentAppendInfo': b'gcc', 'softwareName': b'boostKit-kail-dnn-ext', 'softwareVersion': b'1.0.0'}

The version number and compile time depend on your running environment. The preceding output is for reference only.