openEuler Core Isolation Configuration
Introduction to openEuler Core Isolation
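In HPC scenarios, threads synchronize frequently, so the performance impact of OS background noise grows as the number of nodes increases. OS background noise includes background daemons, peripheral interrupts, and kernel background threads.
The enhanced core isolation feature divides the system's CPUs into housekeeping CPUs and non-housekeeping CPUs. OS background noise is concentrated on the housekeeping CPUs, while the non-housekeeping CPUs run only the application's compute tasks. Reducing background-noise interference while the application runs improves its performance. Non-housekeeping CPUs are specified with the boot parameters nohz_full and isolcpus. Adding the enhanced_isolcpus parameter further eliminates noise from disk I/O.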
Core Isolation Configuration Steps
- Modify the GRUB boot entry.
vim /boot/efi/EFI/openEuler/grub.cfg
Find the line that contains the keyword "/vmlinuz" and append the following to the end of that line:
irqaffinity=37,75,113,151,189,227,265,303,341,379,417,455,493,531,569,607 nohz_full=0-36,38-74,76-112,114-150,152-188,190-226,228-264,266-302,304-340,342-378,380-416,418-454,456-492,494-530,532-568,570-606 isolcpus=nohz,domain,managed_irq,0-36,38-74,76-112,114-150,152-188,190-226,228-264,266-302,304-340,342-378,380-416,418-454,456-492,494-530,532-568,570-606 rcu_nocbs=0-36,38-74,76-112,114-150,152-188,190-226,228-264,266-302,304-340,342-378,380-416,418-454,456-492,494-530,532-568,570-606 disable_sdei_nmi_watchdog enhanced_isolcpus
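On the Kunpeng 920 Professional Edition, each node currently has 2 CPUs, each CPU has 8 NUMA nodes, and each NUMA node has 38 cores, for 608 cores per machine (2 × 8 × 38). The recommended core isolation parameters configure the last core of each NUMA node as a housekeeping CPU. The housekeeping CPUs are designated through the "irqaffinity" parameter, so that kernel processes and interrupts are scheduled preferentially onto them, which reduces the system's impact on the non-housekeeping CPUs.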
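The CPU lists used in the boot parameters follow directly from this topology. The sketch below derives them rather than typing them by hand. It assumes CPU numbers are contiguous within each NUMA node, as the lists above imply; the names NUMA_NODES, CORES_PER_NUMA, and build_cpu_lists are illustrative and do not come from openEuler.

```shell
#!/bin/bash
# Sketch: derive the housekeeping and isolated CPU lists from the topology
# described above (2 CPUs x 8 NUMA nodes = 16 NUMA nodes, 38 cores each,
# last core of each NUMA node reserved for housekeeping).
# Assumption: CPU numbers are contiguous within each NUMA node.
NUMA_NODES=16
CORES_PER_NUMA=38

build_cpu_lists() {
    local hk="" iso="" node first last
    for ((node = 0; node < NUMA_NODES; node++)); do
        first=$((node * CORES_PER_NUMA))
        last=$((first + CORES_PER_NUMA - 1))
        # Last core of the node is the housekeeping CPU.
        hk+="${hk:+,}${last}"
        # All remaining cores of the node are isolated.
        iso+="${iso:+,}${first}-$((last - 1))"
    done
    HK_CPUS=$hk
    ISO_CPUS=$iso
}

build_cpu_lists
echo "irqaffinity=${HK_CPUS}"
echo "nohz_full=${ISO_CPUS}"
echo "isolcpus=nohz,domain,managed_irq,${ISO_CPUS}"
echo "rcu_nocbs=${ISO_CPUS}"
```

On a machine with a different core count per NUMA node, only the two variables at the top would need to change, provided the contiguous-numbering assumption still holds.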
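Note that once core isolation is configured, a program that binds MPI processes to cores sequentially will place some compute processes on housekeeping CPUs, which degrades overall program performance. Under this configuration, MPI can be given a rankfile that explicitly specifies the core each process is bound to.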
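As an illustration of such a rankfile, the sketch below prints one line per isolated core and skips the housekeeping core of each NUMA node. It assumes an Open MPI-style rankfile syntax (`rank N=host slot=CPU`). The source does not say which MPI implementation is used, so the exact syntax and the meaning of the slot number should be checked against that MPI's documentation. The function name gen_rankfile and the host name node01 are hypothetical.

```shell
#!/bin/bash
# Sketch: emit one rankfile line per isolated core.
# Assumptions: Open MPI-style syntax "rank N=host slot=CPU", contiguous CPU
# numbering per NUMA node, and the last core of each node used for housekeeping.
gen_rankfile() {
    local host=$1 numa_nodes=${2:-16} cores_per_numa=${3:-38}
    local rank=0 cpu total=$((numa_nodes * cores_per_numa))
    for ((cpu = 0; cpu < total; cpu++)); do
        # The last core of each NUMA node is a housekeeping CPU; skip it.
        if (( (cpu + 1) % cores_per_numa == 0 )); then
            continue
        fi
        echo "rank ${rank}=${host} slot=${cpu}"
        rank=$((rank + 1))
    done
}

# Show the final line generated for a single (hypothetical) host.
gen_rankfile node01 | tail -n 1
```

Redirecting the output to a file (for example `gen_rankfile node01 > rankfile`) gives a file that can be handed to the MPI launcher. With the default arguments it describes 592 ranks, 37 per NUMA node.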
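- Reboot the node so that the core isolation parameters take effect.
- After the reboot, run cat /proc/cmdline to confirm that the parameters were added successfully.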
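The check can also be scripted. The following sketch reports which of the parameters added above appear in a kernel command line. The function name check_cmdline is illustrative, and it only confirms that each parameter is present, not that its CPU list is correct.

```shell
#!/bin/bash
# Sketch: report which core isolation parameters are present in a kernel
# command line. Returns 0 if all are present, 1 otherwise.
check_cmdline() {
    local cmdline=$1 missing=0 key
    for key in irqaffinity= nohz_full= isolcpus= rcu_nocbs= \
               disable_sdei_nmi_watchdog enhanced_isolcpus; do
        # Pad with spaces so each key is matched only at the start of a
        # parameter (otherwise "isolcpus=" could match inside another word).
        if [[ " ${cmdline} " == *" ${key}"* ]]; then
            echo "OK       ${key}"
        else
            echo "MISSING  ${key}"
            missing=1
        fi
    done
    return "$missing"
}

# Check the running kernel if /proc/cmdline is available.
if [ -r /proc/cmdline ]; then
    check_cmdline "$(cat /proc/cmdline)" || true
fi
```

The kernel also exposes the CPU lists it actually applied in /sys/devices/system/cpu/isolated and /sys/devices/system/cpu/nohz_full, which can be compared with the values passed on the command line.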
- Run the initialization script to bind system interrupts and services to the housekeeping CPUs and to apply the performance settings.
#!/bin/bash
# Housekeeping CPUs: the last core of each NUMA node.
HK_CPUS=37,75,113,151,189,227,265,303,341,379,417,455,493,531,569,607

# Stop and mask irqbalance, the service that automatically redistributes
# hardware interrupts across CPUs, so that interrupts stay on the housekeeping CPUs.
systemctl stop irqbalance && systemctl mask irqbalance

# Allow real-time tasks to use all of the CPU time.
echo -1 > /proc/sys/kernel/sched_rt_runtime_us

# Do not let the khugepaged background thread defragment memory when it merges pages.
echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag

# Never defragment memory synchronously when a process allocates memory.
echo never > /sys/kernel/mm/transparent_hugepage/defrag

# Enable transparent huge pages system-wide.
echo always > /sys/kernel/mm/transparent_hugepage/enabled

# Disable automatic NUMA balancing.
echo 0 > /proc/sys/kernel/numa_balancing

# Bind the kernel background threads to the housekeeping CPUs.
for name in rcu_sched kswapd kcompactd rcuog rcuos; do
    ps -aux | grep "$name" | grep -v 'grep' | awk '{print $2}' |
        while read -r pid; do
            taskset -pc "$HK_CPUS" "$pid"
        done
done

# Set the frequency governor of every core to performance.
# (The original loop "for core_id in 0-607" ran once with the literal string "0-607".)
for core_id in $(seq 0 607); do
    echo performance > "/sys/devices/system/cpu/cpufreq/policy${core_id}/scaling_governor"
done
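After the script has run, the settings can be read back to confirm that they took effect. The sketch below is one way to do that. The helper names active_value, check_setting, and verify_all are illustrative, and the optional root prefix argument exists only so the check can be exercised against a copy of the files. Note that the transparent huge page files list every option and mark the active one in brackets, for example "[always] madvise never", so the value has to be extracted before comparing.

```shell
#!/bin/bash
# Sketch: read back the tunables written by the initialization script.

# Print the active value of a tunable. Files that list all options mark the
# active one in brackets; other files contain just the value.
active_value() {
    local content re='\[([^]]+)\]'
    content=$(cat "$1") || return 1
    if [[ $content =~ $re ]]; then
        echo "${BASH_REMATCH[1]}"
    else
        echo "$content"
    fi
}

# Compare one tunable with its expected value. Unreadable files are skipped.
check_setting() {
    local file=$1 expected=$2 actual
    if [ ! -r "$file" ]; then
        echo "SKIP $file (not readable)"
        return 0
    fi
    actual=$(active_value "$file")
    if [ "$actual" = "$expected" ]; then
        echo "OK   $file = $actual"
    else
        echo "FAIL $file = $actual (expected $expected)"
        return 1
    fi
}

# Check every value set by the script; $1 is an optional root prefix.
verify_all() {
    local root=${1:-} rc=0
    check_setting "$root/proc/sys/kernel/sched_rt_runtime_us" -1 || rc=1
    check_setting "$root/sys/kernel/mm/transparent_hugepage/khugepaged/defrag" 0 || rc=1
    check_setting "$root/sys/kernel/mm/transparent_hugepage/defrag" never || rc=1
    check_setting "$root/sys/kernel/mm/transparent_hugepage/enabled" always || rc=1
    check_setting "$root/proc/sys/kernel/numa_balancing" 0 || rc=1
    return "$rc"
}

verify_all || echo "Some settings differ from the values set by the initialization script."
```

These values are written at run time and are not persistent, so the check is most useful after each reboot. The affinity of an individual kernel thread can be inspected in the same spirit with `taskset -pc <pid>`.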