Feature Description

Introduction

NUMA awareness is an optimization policy utilizing the non-uniform memory access (NUMA) architecture to improve the performance of a multiprocessor system.

In the NUMA architecture, processors and memory are divided into multiple nodes, for example, node 0 and node 1 in the following figure. Each node contains local memory and the corresponding processors. For a processor (for example, a CPU in node 0), accessing memory in the same node (local memory) is faster than accessing memory in another node (remote memory). The core principle of NUMA awareness is to detect the NUMA topology so that cross-node memory access is reduced, which lowers latency and improves performance.
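
As a rough illustration of topology detection, the following Python sketch reads the per-node CPU lists that Linux exposes under /sys/devices/system/node. The function names and the `root` parameter are hypothetical helpers introduced here for illustration; they are not part of any openEuler or QEMU tool.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a Linux cpulist string such as "0-3,8-11" into a sorted list of CPU IDs."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            # An empty list (for example, a memory-only node) yields no CPUs.
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_numa_topology(root="/sys/devices/system/node"):
    """Return {node_id: [cpu_ids]} for every nodeN directory found under root."""
    topology = {}
    for path in glob.glob(os.path.join(root, "node[0-9]*")):
        match = re.fullmatch(r"node(\d+)", os.path.basename(path))
        if not match:
            continue
        with open(os.path.join(path, "cpulist")) as f:
            topology[int(match.group(1))] = parse_cpulist(f.read())
    return topology


if __name__ == "__main__":
    for node, cpus in sorted(read_numa_topology().items()):
        print(f"node {node}: CPUs {cpus}")
```

On a host without NUMA information in sysfs, `read_numa_topology()` simply returns an empty dictionary.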

Figure 1 NUMA architecture

After Guest NUMA is configured for a VM, the VM can identify the NUMA node to which each vCPU belongs and optimize its memory usage based on the Guest NUMA topology. For example, the host CPUs and memory used by VM 7 in the following figure are distributed across two NUMA nodes. Without NUMA awareness, the VM may perform a large number of cross-node memory accesses, which degrades performance. When Guest NUMA is configured, the NUMA topology of the host is passed through to the VM, reducing cross-node memory access.

Version Requirements

  • Versions: openEuler 20.03 LTS SP1 or later, and QEMU 2.6.0 or later
  • License: none

Constraints

The NUMA awareness feature lets a VM detect the NUMA architecture of the host. Its impact on performance depends on application characteristics. If the service software cannot identify the NUMA topology or optimize for it, cross-NUMA memory access occurs and performance deteriorates.

Application Scenarios

This feature applies to the 1:1 core binding scenario, in which each vCPU is bound to one physical CPU. The optimal NUMA topology presented to the VM is derived from the topology of the physical CPUs to which the vCPUs are bound.
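
The relationship between 1:1 binding and the resulting guest topology can be sketched as follows. This is a minimal illustration, not the feature's actual implementation: the function name, the `pinning` dictionary, and the sample host layout are all assumptions made for the example.

```python
def guest_topology_from_pinning(pinning, host_topology):
    """Group vCPUs by the host NUMA node of the physical CPU each one is bound to.

    pinning:       {vcpu_id: physical_cpu_id}, expected to be a 1:1 mapping.
    host_topology: {host_node_id: [physical_cpu_ids]}.
    Returns        {host_node_id: [vcpu_ids]}.
    """
    pcpus = list(pinning.values())
    if len(set(pcpus)) != len(pcpus):
        raise ValueError("pinning is not 1:1: a physical CPU is bound to more than one vCPU")

    # Invert the host topology so each physical CPU maps to its node.
    cpu_to_node = {cpu: node for node, cpus in host_topology.items() for cpu in cpus}

    cells = {}
    for vcpu, pcpu in sorted(pinning.items()):
        if pcpu not in cpu_to_node:
            raise ValueError(f"physical CPU {pcpu} is not in the host topology")
        cells.setdefault(cpu_to_node[pcpu], []).append(vcpu)
    return cells


# Hypothetical host with two nodes of four CPUs each, and a 4-vCPU VM.
host = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
pinning = {0: 2, 1: 3, 2: 4, 3: 5}
print(guest_topology_from_pinning(pinning, host))  # {0: [0, 1], 1: [2, 3]}
```

In this example, vCPUs 0 and 1 land in one guest cell and vCPUs 2 and 3 in another, mirroring the two host nodes their physical CPUs belong to.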

Principles

Configuring the NUMA awareness feature sets up memory block binding and vCPU binding so that VMs can detect the NUMA architecture of the host machine and optimize performance.
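
The two kinds of binding can be illustrated with a short Python sketch that generates configuration fragments. It assumes the VM is managed through libvirt domain XML (the text above names only QEMU, so this is an assumption) and uses the libvirt elements `cputune`/`vcpupin`, `numatune`/`memnode`, and `cpu`/`numa`/`cell`. The function names, the `strict` memory mode, and the per-cell memory size are illustrative choices, not values taken from this document.

```python
import xml.etree.ElementTree as ET


def cpus_attr(ids):
    """Format a list of CPU IDs as a comma-separated attribute value."""
    return ",".join(str(i) for i in sorted(ids))


def build_numa_fragments(cells, pinning, mem_kib_per_cell):
    """Build libvirt-style XML fragments for vCPU binding and memory binding.

    cells:   {host_node_id: [vcpu_ids]}, one guest cell per host node.
    pinning: {vcpu_id: physical_cpu_id}.
    Returns the <cputune>, <numatune>, and <cpu> fragments as strings.
    """
    # vCPU binding: one <vcpupin> per vCPU.
    cputune = ET.Element("cputune")
    for vcpu, pcpu in sorted(pinning.items()):
        ET.SubElement(cputune, "vcpupin", vcpu=str(vcpu), cpuset=str(pcpu))

    # Memory binding: each guest cell's memory is tied to one host node.
    numatune = ET.Element("numatune")
    cpu = ET.Element("cpu")
    numa = ET.SubElement(cpu, "numa")
    for cell_id, (host_node, vcpus) in enumerate(sorted(cells.items())):
        ET.SubElement(numatune, "memnode", cellid=str(cell_id),
                      mode="strict", nodeset=str(host_node))
        ET.SubElement(numa, "cell", id=str(cell_id), cpus=cpus_attr(vcpus),
                      memory=str(mem_kib_per_cell), unit="KiB")

    return [ET.tostring(e, encoding="unicode") for e in (cputune, numatune, cpu)]


if __name__ == "__main__":
    # Hypothetical 4-vCPU VM spread over host nodes 0 and 1, 2 GiB per cell.
    for fragment in build_numa_fragments({0: [0, 1], 1: [2, 3]},
                                         {0: 2, 1: 3, 2: 4, 3: 5},
                                         2 * 1024 * 1024):
        print(fragment)
```

The fragments are meant to be merged into a complete domain definition by hand; a real `cpu` element typically carries additional attributes that this sketch omits.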