Feature Overview

The scheduling optimization feature of Kunpeng BoostKit for Virtualization accelerates CPU scheduling for applications on VMs through software-hardware collaboration.

The CPU topology is passed directly to the VM through the NUMA awareness, cluster awareness, and virtualization scenario topology awareness features. The VM OS kernel then uses cluster-aware task scheduling to accelerate the scheduling of multi-threaded and multi-process workloads.
The lock mechanism during preemption is optimized to improve VM performance in overcommitment scenarios.
A hardware deadlock detection mechanism is introduced to prevent VM suspensions and recovery failures caused by hardware deadlocks.
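
One way to check that the topology really reaches the VM is to read it back from sysfs inside the guest. The following is a minimal sketch, not part of BoostKit: the helper names `parse_cpulist` and `read_guest_topology` are hypothetical, and it assumes a Linux guest whose kernel exposes `node*/cpulist` and, on kernels with cluster support, `cpu*/topology/cluster_cpus_list`.

```python
from pathlib import Path


def parse_cpulist(text: str) -> list[int]:
    """Expand a Linux cpulist string such as "0-3,8-11" into CPU numbers."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means no CPUs
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_guest_topology(sysfs_root: str = "/sys/devices/system") -> dict:
    """Collect the NUMA nodes and CPU clusters visible to this OS."""
    root = Path(sysfs_root)

    # One entry per NUMA node, e.g. {"node0": [0, 1, 2, 3]}.
    nodes = {}
    for node_dir in sorted(root.glob("node/node[0-9]*")):
        nodes[node_dir.name] = parse_cpulist((node_dir / "cpulist").read_text())

    # Each CPU reports the CPUs in its cluster; a set removes the duplicates.
    clusters = set()
    for cpu_dir in root.glob("cpu/cpu[0-9]*"):
        cluster_file = cpu_dir / "topology" / "cluster_cpus_list"
        if cluster_file.exists():  # absent on kernels without cluster support
            clusters.add(tuple(parse_cpulist(cluster_file.read_text())))

    return {"numa_nodes": nodes, "clusters": sorted(clusters)}
```

Calling `read_guest_topology()` inside the VM and comparing the result with the host layout shows whether the guest sees more than one NUMA node and whether CPUs are grouped into clusters. The `sysfs_root` parameter exists only so that the function can be pointed at a copy of the directory tree for testing.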

Benefits

The NUMA awareness feature improves the VM memory access performance by 5% to 15%.
The cluster awareness feature improves the performance of multi-thread applications (such as big data applications) on VMs by 2% to 20%.
The VM lock optimization feature improves Kunpeng CPU performance in overcommitment scenarios. When an 8-core VM has an overcommitment ratio of 1:2, the UnixBench score can be increased by about 40%.
The hardware deadlock detection mechanism prevents recovery failures on Arm VMs after hardware deadlocks.
The virtualization scenario topology awareness feature makes the cache structure information in VMs more accurate and improves single-service performance in gaming scenarios.

Key Technologies

NUMA awareness: The NUMA topology is exposed to the VM to optimize the VM's memory access efficiency.
Cluster awareness: Support for cluster-aware scheduling is added to the OS scheduler, and the OS introduces software-core synergy to improve process scheduling performance. The cluster topology is exposed to the VM so that the VM can use cluster-aware scheduling.
Lock optimization: When the VM OS requests a lock, it checks through shared memory whether the vCPU has been preempted by another VM. If the vCPU has been preempted, the system exits lock wait. Otherwise, the system enters lock wait.
Deadlock detection: The performance monitoring interrupt (PMI) is configured as a non-maskable interrupt (NMI) and the SDEI watchdog is disabled, so that a high-priority NMI can be triggered in the VM. When a hard lockup occurs on a VM, the exception can be recorded and the VM can be reset.
Virtualization scenario topology awareness: The cache size is specified in the libvirt VM XML configuration or in the QEMU command used to start the VM, so that the VM obtains more accurate cache structure information.
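
The lock optimization decision described in the list above can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions, not the kernel implementation: `SharedPreemptState`, `Lock`, and `acquire` are hypothetical names, the shared memory is modeled as a plain dictionary, and the vCPU being checked is assumed to be the one currently holding the lock, which the description does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SharedPreemptState:
    # Hypothetical stand-in for host/guest shared memory: vCPU id -> preempted flag.
    preempted: dict = field(default_factory=dict)

    def vcpu_is_preempted(self, vcpu: int) -> bool:
        return self.preempted.get(vcpu, False)


@dataclass
class Lock:
    owner: Optional[int] = None  # vCPU id holding the lock, None when free


def acquire(lock: Lock, vcpu: int, shared: SharedPreemptState,
            max_spins: int = 1000) -> str:
    """Return "acquired", "yield" (owner preempted), or "timeout"."""
    for _ in range(max_spins):
        if lock.owner is None:
            lock.owner = vcpu
            return "acquired"
        if shared.vcpu_is_preempted(lock.owner):
            # A descheduled owner cannot release the lock, so stop waiting.
            return "yield"
        # The owner is still running, so stay in lock wait and check again.
    return "timeout"
```

With `Lock(owner=0)` and `SharedPreemptState({0: True})`, `acquire` returns `"yield"` on the first check instead of spinning. The point of the sketch is that a waiter in an overcommitted VM avoids burning its time slice on a lock that cannot be released until the preempted vCPU is scheduled again.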

Application Scope

This feature applies to general Kunpeng virtualization scenarios and CPU overcommitment scenarios.
