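To tune MySQL performance, you can restrict the MySQL Pods on the compute nodes to run on a single NUMA node. K8s provides two CPU allocation policies: the CPU Manager policy and the Topology Manager policy.
To apply the single-NUMA restriction: perform steps 1 to 4 on each compute node.
To remove the single-NUMA restriction: likewise perform steps 1 to 4 on each compute node, but in step 2 restore the configuration file to its original default content.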
- Confirm the K8s version and the actual core-binding requirements.
Run the kubectl version command to view the Kubernetes (K8s) version information.
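The version matters because the kubelet features used below have changed status across releases. As a hedged illustration only: to our knowledge, Topology Manager became beta (enabled by default) in Kubernetes 1.18, so an older kubelet may additionally need the TopologyManager feature gate. The helper name version_ge and the 1.18 threshold are assumptions of this sketch, not part of the original procedure.

```shell
#!/usr/bin/env bash
# Succeed if dotted version $1 is greater than or equal to $2.
# A leading "v" (as in "v1.18.3") is ignored.
# Assumption: GNU sort with -V (version sort) is available.
version_ge() {
  local have=${1#v} want=${2#v}
  [ "$(printf '%s\n%s\n' "$have" "$want" | sort -V | head -n 1)" = "$want" ]
}
```

On a compute node this could be used as `version_ge "$(kubelet --version | awk '{print $2}')" 1.18 || echo "check whether the TopologyManager feature gate is needed"`.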
- Modify the Kubelet configuration file.
- Open the configuration file.
| vim /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
|
- The original default content of the file is as follows:
| # Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
|
Press i to enter insert mode and change the file to the following:
| # Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
# Change 1: add two ExecStartPre lines
ExecStartPre=/usr/bin/mkdir -p /sys/fs/cgroup/cpuset/system.slice/kubelet.service
ExecStartPre=/usr/bin/mkdir -p /sys/fs/cgroup/hugetlb/system.slice/kubelet.service
ExecStart=
# Change 2: append the --kube-reserved, --cpu-manager-policy, --feature-gates, and --topology-manager-policy flags to the end of ExecStart
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS --kube-reserved=cpu=2,memory=250Mi --cpu-manager-policy=static --feature-gates=CPUManager=true --topology-manager-policy=single-numa-node
|
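- CPU Manager policy: --cpu-manager-policy=static, enabled through --feature-gates=CPUManager=true, ensures that the CPU cores set by limits for mysql-1, mysql-2, and mysql-3 in the YAML file are allocated contiguously.
- Topology Manager policy: --topology-manager-policy=single-numa-node ensures that the CPU cores set by limits for mysql-1, mysql-2, and mysql-3 in the YAML file are bound to a single NUMA node. For this to work, the number of CPU cores in limits must be less than or equal to the number of CPU cores in a single NUMA node (run lscpu or numactl -H to view the CPU cores of each NUMA node). Otherwise, after the MySQL Pods are created and deployed with K8s on the master node, running watch kubectl get pod -n ns-mysql-test -o wide shows that Pod creation failed.
- If, in your scenario, a MySQL Pod needs more CPU cores than a single NUMA node has but no more than one processor (1P) has (in this document, 1P corresponds to two NUMA nodes), and the NUMA nodes used must not span processors, remove --topology-manager-policy=single-numa-node and, on the master node, delete the limits resource restriction from the YAML file. After the MySQL Pod is created and deployed with K8s from the master node, use taskset -pac on the compute node to manually bind the MySQL process and its threads to CPUs 0-47 (NUMA node0 and NUMA node1). The procedure is as follows:
- Find the MySQL process ID.
- Check which CPU cores MySQL is bound to.
- Bind the MySQL process and its threads to CPUs 0-47 (NUMA node0 and NUMA node1).
| taskset -pac 0-47 <MySQL process ID>
|
- Check which CPU cores MySQL is now bound to.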
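The manual binding steps above can be sketched as a small helper. This is an illustration under assumptions, not the original procedure's exact commands: the function names run and pin_process and the DRY_RUN switch are hypothetical, and only the taskset options -p (operate on a PID), -a (all threads), and -c (CPU list) are relied on.

```shell
#!/usr/bin/env bash
# Run a command, or just print it when DRY_RUN=1 is set.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Show, change, and re-check the CPU affinity of one process.
#   $1: process ID    $2: CPU list, for example 0-47
pin_process() {
  local pid=$1 cpus=$2
  run taskset -pc "$pid"            # current affinity of the process
  run taskset -pac "$cpus" "$pid"   # bind the process and all of its threads
  run taskset -pc "$pid"            # confirm the new affinity
}
```

Assuming the server process is named mysqld, a call might look like `pin_process "$(pgrep -o -x mysqld)" 0-47`. Running it first with DRY_RUN=1 prints the three taskset commands without changing anything.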
- Press Esc, type :wq!, and press Enter to save the file and exit.
- Delete the CPU Manager state file cpu_manager_state.
| rm -f /var/lib/kubelet/cpu_manager_state
|
- Restart the Kubelet service.
| systemctl daemon-reload && systemctl restart kubelet
|
Check the Kubelet status.
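Beyond checking that the service is running, it can help to confirm that the new flags were saved and that the regenerated state file records the static policy. The sketch below assumes the state file is compact JSON containing "policyName":"<name>"; the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Typical manual status check on the compute node:
#   systemctl status kubelet --no-pager

# Print the policyName recorded in a CPU Manager state file.
# Assumption: the file is compact JSON with "policyName":"<name>".
cpu_manager_policy() {
  local state_file=${1:-/var/lib/kubelet/cpu_manager_state}
  sed -n 's/.*"policyName":"\([^"]*\)".*/\1/p' "$state_file"
}

# Print each expected flag that is absent from the ExecStart lines
# of a kubelet drop-in file ($1). Prints nothing if all are present.
missing_kubelet_flags() {
  local conf=$1 flag
  for flag in --kube-reserved= --cpu-manager-policy=static \
              --topology-manager-policy=single-numa-node; do
    grep '^ExecStart=' "$conf" | grep -qF -e "$flag" || echo "$flag"
  done
}
```

For example, `missing_kubelet_flags /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf` should print nothing after the edit, and `cpu_manager_policy` should print static once the Kubelet has restarted.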
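- Modify the mysql_deployment.yaml configuration file.
Choose CPU and memory values that suit the actual CPU and memory of the nodes planned for deployment. For example, the physical server in this document has four NUMA nodes, each processor (1P) contains two NUMA nodes, and each NUMA node has 24 CPU cores. The CPU cores are configured as follows: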
| ......
spec:
  nodeSelector:
    test: "mysql-test-1"
  containers:
  - name: mysql-1
    image: mymysql/centos8-mysql-arm:8.0.19
    resources:
      limits:
        cpu: 16
        memory: 64Gi
......
---
......
spec:
  nodeSelector:
    test: "mysql-test-2"
  containers:
  - name: mysql-2
    image: mymysql/centos8-mysql-arm:8.0.19
    resources:
      limits:
        cpu: 16
        memory: 64Gi
......
---
......
spec:
  nodeSelector:
    test: "mysql-test-3"
  containers:
  - name: mysql-3
    image: mymysql/centos8-mysql-arm:8.0.19
    resources:
      limits:
        cpu: 16
        memory: 64Gi
......
|
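For the single-numa-node policy to take effect on a Pod, the CPU and memory values in the Pod's resources must be set explicitly, and the limits must equal the requests (that is, the Pod must be a Guaranteed Pod). This document omits requests, so requests default to the values in limits.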
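Because the CPU limit must not exceed the number of cores in one NUMA node, it can be worth checking the chosen value against the node's topology before deploying. The following is a minimal sketch that parses lscpu-style text from standard input; the function names are hypothetical, and it assumes lines of the form "NUMA node0 CPU(s): 0-23".

```shell
#!/usr/bin/env bash
# Count the CPUs in a CPU list such as "0-23" or "0-3,8-11".
count_cpus() {
  local list=$1 total=0 part lo hi
  local IFS=,
  for part in $list; do
    lo=${part%-*}
    hi=${part#*-}
    total=$(( total + hi - lo + 1 ))
  done
  echo "$total"
}

# Read lscpu-style text on stdin; print "<node> <cpu count>" per NUMA node.
numa_node_sizes() {
  local line node list
  while IFS= read -r line; do
    case $line in
      "NUMA node"[0-9]*" CPU(s):"*)
        node=${line%% CPU*}          # "NUMA node0"
        node=${node#NUMA }           # "node0"
        list=${line#*:}
        list=${list//[[:space:]]/}   # "0-23"
        echo "$node $(count_cpus "$list")"
        ;;
    esac
  done
}

# Succeed if CPU limit $1 fits in the smallest NUMA node listed on stdin.
fits_single_numa() {
  local limit=$1
  numa_node_sizes | awk -v limit="$limit" '
    NR == 1 || $2 + 0 < min { min = $2 + 0 }
    END { exit !(NR > 0 && limit + 0 <= min) }'
}
```

On a compute node, `lscpu | fits_single_numa 16 && echo "limit fits in one NUMA node"` would check the value used in this document. The sketch does not subtract the cores set aside by --kube-reserved, so treat a borderline result with care.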
- Redeploy mysql_deployment.yaml on the master node.
| kubectl delete -f ./mysql_deployment.yaml
kubectl create -f ./mysql_deployment.yaml
|
- Check which NUMA node the Pod is using.
| docker ps -a | grep mysql
|
| bcc93653c574 48858e629fa6 "/entrypoint.sh mysq…" 31 minutes ago Up 31 minutes k8s_mysql-2_mysql-2_ns-mysql-test_605956f2-1e13-49c6-a197-6220915130bc_0
ea9afa2c2104 k8s.gcr.io/pause:3.2 "/pause" 32 minutes ago Up 32 minutes k8s_POD_mysql-2_ns-mysql-test_605956f2-1e13-49c6-a197-6220915130bc_0
|
| docker inspect bcc93653c574 | grep Cpuset
|
| "CpusetCpus": "2-17",
"CpusetMems": "",
|
Compare the CPU set with the NUMA node CPU ranges reported by lscpu:
| Architecture: aarch64
Byte Order: Little Endian
CPU(s): 96
On-line CPU(s) list: 0-95
Thread(s) per core: 1
Core(s) per socket: 48
Socket(s): 2
NUMA node(s): 4
Model: 0
CPU max MHz: 2600.0000
CPU min MHz: 200.0000
BogoMIPS: 200.00
L1d cache: 64K
L1i cache: 64K
L2 cache: 512K
L3 cache: 49152K
NUMA node0 CPU(s): 0-23
NUMA node1 CPU(s): 24-47
NUMA node2 CPU(s): 48-71
NUMA node3 CPU(s): 72-95
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop
|
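As shown, the mysql-2 container is pinned to CPUs 2-17, all of which belong to NUMA node0 (CPUs 0-23), so it is restricted to running on NUMA node0.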
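The comparison between the container's CpusetCpus value and a NUMA node's CPU range can also be done mechanically. This is a small sketch with hypothetical function names; it treats an empty CPU set as "not pinned" and therefore as a failure.

```shell
#!/usr/bin/env bash
# Expand a CPU list such as "2-17" or "0-3,8" into one CPU number per line.
expand_cpus() {
  local part lo hi
  local IFS=,
  for part in $1; do
    lo=${part%-*}
    hi=${part#*-}
    seq "$lo" "$hi"
  done
}

# Succeed if every CPU in list $1 also appears in list $2.
# An empty $1 (container not pinned) is reported as a failure.
cpuset_within() {
  local cpu
  [ -n "$1" ] || return 1
  for cpu in $(expand_cpus "$1"); do
    expand_cpus "$2" | grep -qx "$cpu" || return 1
  done
}
```

With the values from this document, `cpuset_within 2-17 0-23` succeeds. The first argument could be read with `docker inspect --format '{{.HostConfig.CpusetCpus}}' <container ID>` and the second taken from the lscpu output.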