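Binding Pods to a Single NUMA Node
To tune MySQL performance, the MySQL Pods on the compute nodes can be restricted to a single NUMA node.
To remove the single-NUMA restriction, perform steps 1 to 4 on each compute node in the same way, but in step 2 restore the configuration file to its original default content.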
- Confirm the K8s version and the actual core-binding requirements.
- Modify the Kubelet configuration file.
- Open the configuration file.
vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
- The default content of the original file is as follows:
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
Press "i" to enter edit mode and change the file to the following:
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
# Change 1: add two ExecStartPre lines
ExecStartPre=/usr/bin/mkdir -p /sys/fs/cgroup/cpuset/system.slice/kubelet.service
ExecStartPre=/usr/bin/mkdir -p /sys/fs/cgroup/hugetlb/system.slice/kubelet.service
ExecStart=
# Change 2: append --kube-reserved, --cpu-manager-policy, --feature-gates, and --topology-manager-policy to the end of ExecStart
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS --kube-reserved=cpu=2,memory=250Mi --cpu-manager-policy=static --feature-gates=CPUManager=true --topology-manager-policy=single-numa-node
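- CPU Manager policy: --feature-gates=CPUManager=true enables the CPU Manager and --cpu-manager-policy=static selects its static policy, so that the CPU cores specified by limits for mysql-1, mysql-2, and mysql-3 in the YAML file are allocated contiguously.
- Topology Manager policy: --topology-manager-policy=single-numa-node ensures that the CPU cores specified by limits for mysql-1, mysql-2, and mysql-3 in the YAML file are bound to a single NUMA node. For this to work, the number of CPU cores in limits must be less than or equal to the number of CPU cores in a single NUMA node (run lscpu or numactl -H to view the CPU cores of each NUMA node). Otherwise, after the MySQL Pods are created on the K8s master node, running watch kubectl get pod -n ns-mysql-test -o wide shows that Pod creation failed.
- If a MySQL Pod needs more CPU cores than a single NUMA node provides but no more than 1P provides (in this document 1P corresponds to 2 NUMA nodes), and the NUMA nodes used must not span sockets (P), remove --topology-manager-policy=single-numa-node and delete the limits resource restriction from the YAML file on the master node. After creating the MySQL Pods with K8s on the master node, use taskset -pac on the compute node to manually bind the mysql process and its threads to cores 0-47 (NUMA node0 and NUMA node1). The binding procedure is as follows: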
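- View the MySQL process ID.
ps -ef | grep mysql
- Check which CPU cores MySQL is currently bound to.
taskset -pac <MySQL process ID>
- Bind the MySQL process and its threads to cores 0-47 (NUMA node0 and NUMA node1).
taskset -pac 0-47 <MySQL process ID>
- Check again which CPU cores MySQL is now bound to.
taskset -pac <MySQL process ID>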
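The final verification above can be scripted so that a mismatch on any single thread is not missed. The following is a minimal sketch, not part of the original procedure: check_affinity is a hypothetical helper, and it assumes taskset prints one line per thread in the form "pid <id>'s current affinity list: <list>".

```shell
#!/usr/bin/env bash
# check_affinity is a hypothetical helper, not part of taskset.
# It reads `taskset -pac <pid>` output on stdin and succeeds only if
# every thread reports the expected affinity list (for example "0-47").
check_affinity() {
    local expected="$1" line total=0 bad=0
    while IFS= read -r line; do
        [ -z "$line" ] && continue
        total=$((total + 1))
        # Assumed line format: pid 1234's current affinity list: 0-47
        if [ "${line##*: }" != "$expected" ]; then
            bad=$((bad + 1))
        fi
    done
    echo "threads=$total mismatched=$bad"
    # Fail when nothing was read or when any thread differs.
    [ "$total" -gt 0 ] && [ "$bad" -eq 0 ]
}
```

A possible invocation on the compute node is taskset -pac <MySQL process ID> | check_affinity 0-47, which returns a non-zero status if any thread is still bound elsewhere.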
- Press "Esc", type :wq!, and press "Enter" to save the file and exit.
- Delete the CPU Manager state file cpu_manager_state.
rm -f /var/lib/kubelet/cpu_manager_state
- Restart the Kubelet service.
systemctl daemon-reload && systemctl restart kubelet
Check the Kubelet status.
systemctl status kubelet
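Besides checking that the service is active, it can help to confirm that the running Kubelet actually picked up the new flags. The sketch below is one way to do that under stated assumptions: kubelet_flag_value is a hypothetical helper, and it assumes the flags appear on the command line in --name=value form, as in the modified ExecStart line.

```shell
#!/usr/bin/env bash
# kubelet_flag_value is a hypothetical helper for illustration.
# It reads a kubelet command line on stdin and prints the value of
# --<flag>=<value>; if the flag appears several times, the last one is printed.
kubelet_flag_value() {
    local flag="$1"
    tr ' ' '\n' | sed -n "s/^--${flag}=//p" | tail -n 1
}
```

For example, ps -o args= -C kubelet | kubelet_flag_value topology-manager-policy should print single-numa-node once the restart has taken effect, and an empty result suggests the edited configuration file was not loaded.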
- Modify the mysql_deployment.yaml configuration file.
Choose CPU and memory values that suit the actual CPU and memory of the nodes planned for deployment. For example, the physical machine in this document has 4 NUMA nodes, 1P contains 2 NUMA nodes, and each NUMA node has 24 CPU cores. The CPU core count is configured as follows:
......
spec:
  nodeSelector:
    test: "mysql-test-1"
  containers:
  - name: mysql-1
    image: mymysql/centos8-mysql-arm:8.0.19
    resources:
      limits:
        cpu: 16
        memory: 64Gi
......
---
......
spec:
  nodeSelector:
    test: "mysql-test-2"
  containers:
  - name: mysql-2
    image: mymysql/centos8-mysql-arm:8.0.19
    resources:
      limits:
        cpu: 16
        memory: 64Gi
......
---
......
spec:
  nodeSelector:
    test: "mysql-test-3"
  containers:
  - name: mysql-3
    image: mymysql/centos8-mysql-arm:8.0.19
    resources:
      limits:
        cpu: 16
        memory: 64Gi
......
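For a Pod to benefit from the single-numa-node policy, the CPU and memory in the Pod's resources must be configured explicitly, and the limits must equal the requests (that is, the Pod must be a Guaranteed Pod). This document omits the requests, in which case they default to the same values as the limits.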
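Because a CPU limit larger than one NUMA node causes Pod creation to fail under single-numa-node, the limit can be checked against the lscpu output before deploying. The following is a rough sketch: count_cpus and fits_single_numa are hypothetical helpers, and the check only compares raw core counts, so it does not account for cores set aside by --kube-reserved or already taken by other Pods.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of lscpu or kubectl.

# Prints how many cores a CPU list such as "0-23" or "0-3,8-11" contains.
count_cpus() {
    local list="$1" part lo hi n=0
    local IFS=','
    for part in $list; do
        lo="${part%-*}"   # range start (or the single core number)
        hi="${part#*-}"   # range end (or the single core number)
        n=$((n + hi - lo + 1))
    done
    echo "$n"
}

# Reads lscpu output on stdin and succeeds if at least one NUMA node
# has $1 or more cores.
fits_single_numa() {
    local want="$1" line cpus
    while IFS= read -r line; do
        case "$line" in
            "NUMA node"[0-9]*" CPU(s):"*)
                cpus="${line##*:}"
                cpus="${cpus//[[:space:]]/}"
                if [ "$(count_cpus "$cpus")" -ge "$want" ]; then
                    return 0
                fi
                ;;
        esac
    done
    return 1
}
```

With the values in this document, lscpu | fits_single_numa 16 succeeds because each NUMA node has 24 cores, whereas a limit of 32 would fail the check and calls for the manual taskset approach instead.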
- On the master node, redeploy mysql_deployment.yaml.
kubectl delete -f ./mysql_deployment.yaml
kubectl create -f ./mysql_deployment.yaml
- Check how the Pod is placed on the NUMA nodes.
docker ps -a | grep mysql
bcc93653c574   48858e629fa6           "/entrypoint.sh mysq…"   31 minutes ago   Up 31 minutes   k8s_mysql-2_mysql-2_ns-mysql-test_605956f2-1e13-49c6-a197-6220915130bc_0
ea9afa2c2104   k8s.gcr.io/pause:3.2   "/pause"                 32 minutes ago   Up 32 minutes   k8s_POD_mysql-2_ns-mysql-test_605956f2-1e13-49c6-a197-6220915130bc_0
docker inspect bcc93653c574 | grep Cpuset
"CpusetCpus": "2-17",
"CpusetMems": "",
lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              96
On-line CPU(s) list: 0-95
Thread(s) per core:  1
Core(s) per socket:  48
Socket(s):           2
NUMA node(s):        4
Model:               0
CPU max MHz:         2600.0000
CPU min MHz:         200.0000
BogoMIPS:            200.00
L1d cache:           64K
L1i cache:           64K
L2 cache:            512K
L3 cache:            49152K
NUMA node0 CPU(s):   0-23
NUMA node1 CPU(s):   24-47
NUMA node2 CPU(s):   48-71
NUMA node3 CPU(s):   72-95
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop
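The container's CpusetCpus value (2-17) lies entirely within the CPU range of NUMA node0 (0-23), so the MySQL Pod shown here (mysql-2) is restricted to running on NUMA node0.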
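The comparison between CpusetCpus and the lscpu output can also be done mechanically. The sketch below is illustrative only: numa_nodes_for_cpuset is a hypothetical helper, and it assumes each "NUMA nodeN CPU(s)" line in lscpu lists one contiguous range, as in the output above.

```shell
#!/usr/bin/env bash
# numa_nodes_for_cpuset is a hypothetical helper for illustration.
# It reads lscpu output on stdin and prints every NUMA node whose CPU
# range overlaps the cpuset given as $1 (for example "2-17").
# Assumption: each NUMA node line holds a single contiguous range "a-b".
numa_nodes_for_cpuset() {
    local cpuset="$1" line node range nlo nhi part lo hi
    local -a parts
    IFS=',' read -ra parts <<< "$cpuset"
    while IFS= read -r line; do
        case "$line" in
            "NUMA node"[0-9]*" CPU(s):"*) ;;
            *) continue ;;
        esac
        node="${line%%" CPU(s):"*}"     # "NUMA node0"
        node="${node#NUMA }"            # "node0"
        range="${line##*:}"
        range="${range//[[:space:]]/}"  # "0-23"
        nlo="${range%-*}"
        nhi="${range#*-}"
        for part in "${parts[@]}"; do
            lo="${part%-*}"
            hi="${part#*-}"
            # Two ranges overlap when each starts before the other ends.
            if [ "$lo" -le "$nhi" ] && [ "$hi" -ge "$nlo" ]; then
                echo "$node"
                break
            fi
        done
    done
}
```

Running lscpu | numa_nodes_for_cpuset 2-17 on the machine above prints only node0. More than one line of output would mean the container's cores span several NUMA nodes.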