Introduction

This document describes how to deploy and use the Kubernetes non-uniform memory access (NUMA) affinity scheduling plugin on servers running the openEuler OS.

To use the NUMA affinity feature in Kubernetes, the compute node's kubelet must set the CPU manager policy to static and the topology manager policy to restricted (or single-numa-node). Under this configuration, a pod (the basic unit of container management in Kubernetes) must belong to the Guaranteed QoS class for CPU, that is, its CPU request and limit must be equal (request = limit) and set to an integer value. During deployment, Kubernetes allocates and binds that integer number of logical cores to the pod. While the pod runs, this group of CPU cores is exclusively occupied by the pod and cannot be shared with other pods.
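For reference, a minimal sketch of this configuration might look as follows. The kubelet fields (`cpuManagerPolicy`, `topologyManagerPolicy`) are standard KubeletConfiguration fields; the pod name and image are placeholders. The pod qualifies for the Guaranteed QoS class because its CPU request equals its limit and is an integer:

```yaml
# Illustrative KubeletConfiguration fragment for the compute node;
# the file location and delivery mechanism vary by deployment.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
topologyManagerPolicy: single-numa-node
---
# Pod with an integer CPU request equal to its limit, so it falls into
# the Guaranteed QoS class and can be granted exclusive cores.
apiVersion: v1
kind: Pod
metadata:
  name: numa-demo            # placeholder name
spec:
  containers:
  - name: app
    image: example.com/app:latest   # placeholder image
    resources:
      requests:
        cpu: "2"
        memory: 1Gi
      limits:
        cpu: "2"
        memory: 1Gi
```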

However, most cloud service providers need to deploy a large number of containers on each server, often more containers than the server has physical cores. In this scenario, the static CPU policy cannot be used, and CPU affinity cannot be guaranteed while containers run. To address this issue in the Kubernetes overcommitment scenario, the NUMA adaptation feature implements a Kubernetes NUMA affinity scheduling plugin based on the Node Resource Interface (NRI) of containerd. The plugin is open source and released on Gitee. During pod deployment, the plugin automatically adjusts the pod's CPU scheduling range based on the compute node's CPU load, thereby preserving NUMA affinity.
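The core decision the paragraph describes (confining a pod's CPUs to a suitable NUMA node based on current load) can be sketched roughly as follows. This is an illustrative model only, not the plugin's actual code: the topology, load figures, and the function name `pick_numa_cpuset` are all hypothetical.

```python
# Illustrative sketch (not the plugin's real implementation): choose the
# least-loaded NUMA node that can host the pod's CPU request, and return
# its CPU list as the pod's allowed cpuset.

def pick_numa_cpuset(numa_topology, numa_load, requested_cpus):
    """numa_topology: NUMA node id -> list of logical CPU ids.
    numa_load: NUMA node id -> current load fraction (0.0-1.0).
    requested_cpus: number of CPUs the pod asks for; in an overcommit
    scenario this is a scheduling hint, not an exclusive reservation."""
    # Consider nodes from least to most loaded.
    candidates = sorted(numa_topology, key=lambda n: numa_load[n])
    for node in candidates:
        if len(numa_topology[node]) >= requested_cpus:
            # Confine the pod to this single node's CPUs (NUMA affinity).
            return node, numa_topology[node]
    # Fallback: no single node is large enough, so span the two
    # least-loaded nodes instead.
    first, second = candidates[0], candidates[1]
    return None, numa_topology[first] + numa_topology[second]

# Hypothetical two-node machine: node 1 is less loaded than node 0.
topology = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
load = {0: 0.8, 1: 0.3}
node, cpus = pick_numa_cpuset(topology, load, 2)
# → node 1 is selected, and the pod is confined to CPUs [4, 5, 6, 7]
```

A real plugin would obtain the topology from the host (e.g. sysfs), track per-node load continuously, and apply the chosen cpuset through the NRI container-adjustment mechanism rather than returning it directly.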