NUMA Structure
In an SMP system, all processors share one centralized memory, and contention for that memory limits how often each processor can access it. As a result, processors may frequently be starved of data. The non-uniform memory access (NUMA) architecture was developed to address this problem.
In the NUMA architecture, multiple cores are grouped into a node, and each node can be regarded as a symmetric multiprocessor. Nodes within a CPU communicate with each other through the on-chip network, and different CPUs communicate with each other through high-bandwidth, low-latency Hydra interfaces.
The entire memory space is physically distributed across the nodes, and the set of all dual in-line memory modules (DIMMs) forms the global memory of the system. The memory access time of a core therefore depends on where the memory is located relative to that core: access to local memory (memory attached to the core's own node) is faster than access to memory on a remote node.
The Linux kernel has supported the NUMA architecture since version 2.5. Current operating systems also provide tools and interfaces for configuring and optimizing memory placement so that accesses go to the nearest memory.
The Kunpeng processor supports the NUMA architecture. With performance tuning, a computer system built on the Kunpeng processor can therefore avoid the bus bottleneck of the SMP architecture and achieve good performance, with stronger multi-core scalability and more flexible computing capability.
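As a rough illustration of the node structure described above, the following Python sketch lists each NUMA node with its cores and its relative distances to all nodes. It assumes a Linux system that exposes the topology under /sys/devices/system/node; the helper names parse_cpulist and read_numa_topology are hypothetical and introduced here only for illustration. It is a minimal sketch, not a tuning tool.

```python
import glob
import os
import re

# Assumption: the standard Linux sysfs location for NUMA node information.
SYSFS_NODE_DIR = "/sys/devices/system/node"


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8-11' into a list of CPU ids."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # a node without CPUs has an empty cpulist
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_numa_topology(base=SYSFS_NODE_DIR):
    """Return {node_id: {"cpus": [...], "distances": [...]}}.

    The result is empty if the directory does not exist or holds no nodes.
    """
    topology = {}
    for path in glob.glob(os.path.join(base, "node[0-9]*")):
        node_id = int(re.search(r"node(\d+)$", path).group(1))
        with open(os.path.join(path, "cpulist")) as f:
            cpus = parse_cpulist(f.read())
        # One relative distance per node; a larger value means slower access.
        with open(os.path.join(path, "distance")) as f:
            distances = [int(x) for x in f.read().split()]
        topology[node_id] = {"cpus": cpus, "distances": distances}
    return topology


if __name__ == "__main__":
    for node, info in sorted(read_numa_topology().items()):
        print(f"node {node}: cpus={info['cpus']} distances={info['distances']}")
```

In the distance list of a node, the entry for the node itself is the smallest, which reflects the faster local access; entries for remote nodes are larger. The actual values depend on the platform and firmware.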