Getting Started with the A-Tune Intelligent Tuning Engine (Hands-On)
Published on 2025/05/16

Installing and Starting A-Tune

Lab Environment

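We use a Huawei Cloud ECS instance running "openEuler 22.03 64bit with ARM" as the demo environment. Because this is a virtual machine used only for experiments, the security requirements are relaxed, so we log in as root for convenience. The commands below show the key facts about the environment; the trailing comment on each line is the value observed on this machine: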

uname -m # aarch64
uname -r # 5.10.0-60.139.0.166.oe2203.aarch64
cat /etc/os-release # openEuler 22.03 LTS

ip addr # eth0
fdisk -l | grep dev # /dev/vda

Installation

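Follow the README in the A-Tune code repository for installation. The steps below show the general procedure for installing from the A-Tune source repository.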

First install the development tools and dependencies:

yum group install -y "Development Tools"
yum install -y golang-bin python3 perf sysstat hwloc-gui lshw
yum install -y python3-dict2xml python3-flask-restful python3-pandas python3-scikit-optimize python3-xgboost python3-pyyaml

🔔 Note

If the installation fails with "Error: GPG check FAILED", append the --nogpgcheck option to the yum command (this skips package signature verification), for example:

yum group install -y "Development Tools" --nogpgcheck
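
Before building, it can help to confirm that the tools installed above are actually on PATH. The following is a minimal sketch, not part of A-Tune: missing_tools is a hypothetical helper, and the command names in the demo call (go for golang-bin, sar for sysstat, and so on) are assumptions about what those packages provide.

```shell
#!/bin/sh
# Print every command from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v exits non-zero when the name cannot be resolved
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed command names for the packages installed above.
missing=$(missing_tools git make go python3 perf sar lshw)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line
    echo "not found on PATH:" $missing
else
    echo "all build tools found"
fi
```

If any name is reported, install the corresponding package before running make.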

Then download the source code, build it, and install it:

git clone https://gitee.com/openeuler/A-Tune.git
cd A-Tune

make

make collector-install
make install

Startup

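When the atuned service is installed by building A-Tune from source as above, the network interface and disk entries in its configuration file are automatically set to the default devices of the current machine: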

cat /etc/atuned/atuned.cnf | grep '^network' # network = eth0
cat /etc/atuned/atuned.cnf | grep '^disk' # disk = vda
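
The same lookup can be wrapped in a small reusable function. This is only a sketch: cnf_get is a hypothetical helper, it assumes the "key = value" line format shown in the comments above, and it ignores any section headers the file may contain (the first matching key wins).

```shell
#!/bin/sh
# Print the value of KEY from a "key = value" style config file.
cnf_get() {
    key=$1
    file=$2
    awk -F= -v key="$key" '
        {
            name = $1
            gsub(/[ \t]/, "", name)               # strip whitespace around the key
        }
        name == key {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)    # trim the value
            print value
            exit                                  # first match wins
        }
    ' "$file"
}

# Demo on a throwaway file that mimics the two lines shown above.
sample=$(mktemp)
printf 'network = eth0\ndisk = vda\n' > "$sample"
cnf_get network "$sample"
cnf_get disk "$sample"
rm -f "$sample"
```

On the lab machine the call would be cnf_get network /etc/atuned/atuned.cnf. Commented-out lines are skipped because their key starts with "#".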

Now we can reload the systemd unit files and start the atuned, atune-rest, and atune-engine services:

systemctl daemon-reload
systemctl start atuned
systemctl start atune-rest
systemctl start atune-engine

Check the status of each service:

systemctl status atuned
systemctl status atune-rest
systemctl status atune-engine

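If the status shows "active", the service started successfully.
Note: systemctl status may keep the output open in a pager; press "q" or "Q" to leave it.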
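
For a non-interactive check that never opens a pager, systemctl is-active --quiet can be used in a loop. The sketch below is an illustration only: check_services and the SYSTEMCTL override variable are hypothetical names introduced here, the latter so the function can be exercised without a running systemd.

```shell
#!/bin/sh
# Report whether each listed unit is active; return non-zero if any is not.
# SYSTEMCTL can be overridden (for example in tests); it defaults to systemctl.
check_services() {
    rc=0
    for svc in "$@"; do
        if "${SYSTEMCTL:-systemctl}" is-active --quiet "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
            rc=1
        fi
    done
    return $rc
}

# Usage on the lab machine:
#   check_services atuned atune-rest atune-engine
```

The return code makes the function usable in scripts, for example to stop early when a service failed to start.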

Running atune-adm Commands

🎵 Show the atune-adm version:

atune-adm --version # atune-adm version 1.2.0(f6b82c9)

🎵 List the profiles the system currently supports, along with whether each one is active:

atune-adm list

Sample output:

Support profiles:
......
+---------------------------------------------+-----------+
| database-postgresql-2p-sysbench-hdd         | false     |
+---------------------------------------------+-----------+
| database-postgresql-2p-sysbench-ssd         | false     |
+---------------------------------------------+-----------+
| default-default                             | false     |
+---------------------------------------------+-----------+
......
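
Because the listing is a plain text table, the active profile can be extracted with a short filter. This is a sketch under the assumption that every data row has the two-column shape shown above (profile name, then true or false); active_profiles is a hypothetical helper, and the rows in the demo are illustrative input rather than real output.

```shell
#!/bin/sh
# Read `atune-adm list` style table rows on stdin and print the names of
# profiles whose second column is "true".
active_profiles() {
    awk -F'|' '
        NF >= 3 {
            name = $2
            state = $3
            gsub(/[ \t]/, "", name)     # drop the column padding
            gsub(/[ \t]/, "", state)
            if (state == "true") print name
        }
    '
}

# Demo on rows shaped like the sample output above.
active_profiles <<'EOF'
+---------------------------------------------+-----------+
| database-postgresql-2p-sysbench-ssd         | false     |
+---------------------------------------------+-----------+
| default-default                             | true      |
+---------------------------------------------+-----------+
EOF
```

On a machine with A-Tune installed, the function would be fed from the real command: atune-adm list | active_profiles.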

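🎵 Collect system information in real time and identify the workload type, without applying any optimization (the --characterization option is what suppresses the optimization step):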

atune-adm analysis --characterization

The command runs as follows:

1. Analysis system runtime information: CPU Memory IO and Network...
 ......
 0.2 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.01 0.0 0.2 0.0 0.0 0.0 0.0 0.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 0.0 0.0 0.0 593.0 205.0 0.0 0.0 0.0 0.2 204.4 0.0 210.0 0.06 0.01 0.0

 2. Current System Workload Characterization is default

The workload type identified here is default. The system is not running any particular task at the moment, so this matches reality.
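
When the analysis is run from a script, only the identified workload type is usually of interest. The sketch below pulls it out of the output; workload_type is a hypothetical helper, and it assumes the wording of the "Current System Workload Characterization is ..." line stays as shown above.

```shell
#!/bin/sh
# Print the workload type named in `atune-adm analysis` output read on stdin.
workload_type() {
    sed -n 's/^.*Current System Workload Characterization is[[:space:]]*\([^[:space:]]*\).*$/\1/p'
}

# Demo on the line from the sample output above.
printf '%s\n' ' 2. Current System Workload Characterization is default' | workload_type
```

A possible real invocation is atune-adm analysis --characterization | workload_type.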

🎵 Activate one of the profiles shown by the list command:

atune-adm profile default-default

Running atune-adm list again now shows:

......
+---------------------------------------------+-----------+
| default-default                             | true      |
+---------------------------------------------+-----------+
......

🎵 Roll back the profile activation:

atune-adm rollback

Running atune-adm list once more shows:

......
+---------------------------------------------+-----------+
| default-default                             | false     |
+---------------------------------------------+-----------+
......

The profile is marked false again, which shows that the rollback succeeded.

🎵 Collect system information in real time, identify the workload type, and apply the optimization automatically:

atune-adm analysis

In practical terms, this is a simple demonstration of online static tuning. Sample output:

1. Analysis system runtime information: CPU Memory IO and Network...
 ......
 0.3 0.0 0.1 0.0 0.1 0.0 0.0 0.0 0.5 0.0 0.0 0.2 0.0 0.0 0.0 50.0 0.0 8.0 0.0 1.0 0.0 0.0 0.01 0.0 0.2 0.0 0.0 0.0 0.0 0.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 0.0 0.0 2.0 593.0 205.0 0.0 0.0 0.0 0.6 204.6 0.0 210.0 0.02 0.01 0.0

 2. Current System Workload Characterization is default

 3. Build the best resource model...

 4. Match profile: default-default

 5. begin to set static profile
 [ SUGGEST] Bios                                     please change the BIOS configuration NUMA to Enable
 [ SUGGEST] Bootloader                               Need reboot to make the config change of grub2 effect.
 [ SUCCESS] memory                                   memory num is 1, the memory slot is full
 [ SUCCESS] Kernel                                   CONFIG_NUMA_AWARE_SPINLOCKS
 Completed optimization, please restart application!

Running atune-adm list | grep true now shows:

| default-default                             | true      |

This shows that the matched profile "default-default" has been activated (it can be rolled back with atune-adm rollback, as before).
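
In the analysis output, the lines tagged [ SUGGEST] are the ones that still need manual action (a BIOS change and a reboot in this run). The following sketch lists them; pending_suggestions is a hypothetical helper that assumes the "[ SUGGEST] component message" layout shown in the sample output.

```shell
#!/bin/sh
# Print "component: message" for every [ SUGGEST] line read on stdin.
pending_suggestions() {
    awk '/\[ *SUGGEST\]/ {
        sub(/^.*\[ *SUGGEST\][ \t]*/, "")   # drop the status tag
        component = $1                       # fields are re-split after $0 changes
        sub(/^[^ \t]+[ \t]+/, "")           # drop the component name and padding
        print component ": " $0
    }'
}

# Demo on lines copied from the sample output above.
pending_suggestions <<'EOF'
 [ SUGGEST] Bios                                     please change the BIOS configuration NUMA to Enable
 [ SUGGEST] Bootloader                               Need reboot to make the config change of grub2 effect.
 [ SUCCESS] memory                                   memory num is 1, the memory slot is full
EOF
```

Lines tagged [ SUCCESS] are ignored, so the result is a short to-do list of settings that were not applied automatically.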

🎵 Check the current system information:

atune-adm check

This command checks the system's current CPU, BIOS, OS, network interface, and other information. In this example the output is:

cpu information:
     cpu   version: 1.0  speed: 2600000000 HZ   cores: 2
 system information:
     DMIBIOSVersion: 0.0.0
     OSRelease: 5.10.0-60.139.0.166.oe2203.aarch64
 network information:
     name: eth0              product: Virtio network device
 
......

🎵 Get help:

atune-adm help

Offline Self-Tuning Example

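In practical terms, this is a demonstration of offline dynamic tuning. The experiment uses A-Tune's offline self-tuning feature to pick the best compression algorithm and its settings. Follow the latest version of the compress example in the A-Tune source repository when running it; only a brief walkthrough is given here.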

🎶 Step 1: Download the sample archive

First change into the compress example directory of the A-Tune source tree (the path is relative to the repository root):

cd ./examples/tuning/compress

Then download the archive enwik8.zip into that directory:

wget https://www.mattmahoney.net/dc/enwik8.zip

This archive is the sample data: it contains a large amount of text in compressed form.

🎶 Step 2: Configure parameters before tuning

cp compress.py{,.before}
sh prepare.sh enwik8.zip

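The first command backs up compress.py as compress.py.before (the brace expansion is shorthand for cp compress.py compress.py.before). The prepare.sh script then unpacks enwik8.zip and configures several parameters. For example, in compress_client.yaml the weight of "time" is 20 and the weight of "compress_ratio" is 80, which means this tuning run is weighted toward compression ratio.

🔔 Note

  • The script replaces a series of tedious manual configuration steps; read its source code to see exactly what it does.
  • The other script, compress.py, also deserves a close look. The tuning result is saved into this file, which is why it was backed up at the start for a later comparison.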

🎶 Step 3: Run tuning to find the best configuration

atune-adm tuning --project compress --detail compress_client.yaml

Possible output:

Start to benchmark baseline...
 1.Loading its corresponding tuning project: compress
 2.Start to tuning the system......
 Current Tuning Progress......(1/20)
 Used time: 7s, Total Time: 7s, Best Performance: (time=1.62,compress_ratio=2.36), Performance Improvement Rate: 33.66%
 The 1th recommand parameters is: compressLevel=1,compressMethod=gzip
 The 1th evaluation value: (time=1.62,compress_ratio=2.36)(33.66%)
 ......
 Current Tuning Progress......(20/20)
 Used time: 1m17s, Total Time: 1m17s, Best Performance: (time=1.62,compress_ratio=2.36), Performance Improvement Rate: 33.66%
 The 20th recommand parameters is: compressLevel=1,compressMethod=gzip
 The 20th evaluation value: (time=1.63,compress_ratio=2.36)(33.25%)

 The final optimization result is: compressLevel=1,compressMethod=gzip
 The final evaluation value is: time=1.62,compress_ratio=2.36

 Baseline Performance is: (time=5.39,compress_ratio=2.74)

 Tuning Finished
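
The log above is long, and the lines that matter most are the final result and the baseline. The sketch below extracts just those from a saved log; tuning_summary is a hypothetical helper, and it assumes the log wording shown above ("The final optimization result is:", "The final evaluation value is:", "Baseline Performance is:").

```shell
#!/bin/sh
# Print the final parameters, final evaluation, and baseline from a tuning
# log read on stdin.
tuning_summary() {
    awk '
        /The final optimization result is:/ {
            sub(/^.*result is:[ \t]*/, "")
            print "final params: " $0
        }
        /The final evaluation value is:/ {
            sub(/^.*value is:[ \t]*/, "")
            print "final value: " $0
        }
        /Baseline Performance is:/ {
            sub(/^.*Performance is:[ \t]*/, "")
            gsub(/[()]/, "")            # the baseline is printed in parentheses
            print "baseline: " $0
        }
    '
}

# Demo on lines copied from the sample log above.
tuning_summary <<'EOF'
 The final optimization result is: compressLevel=1,compressMethod=gzip
 The final evaluation value is: time=1.62,compress_ratio=2.36

 Baseline Performance is: (time=5.39,compress_ratio=2.74)
EOF
```

One way to capture a log for this, assuming tee is available, is to append | tee tuning.log to the tuning command and then run tuning_summary < tuning.log.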

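This command starts automated tuning of the compression application. In the sample run above, the selected configuration (gzip at level 1) brings time down from the baseline 5.39 to 1.62, while compress_ratio goes from 2.74 to 2.36. The result is written into the compressLevel and compressMethod settings, which can be seen by comparing compress.py before and after tuning: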

diff compress.py{,.before}

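Possible output (the result may differ depending on your environment). Lines marked "<" come from the tuned compress.py and lines marked ">" from the backup taken before tuning: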

......
12,13c12,13
< COMPRESS_LEVEL = 1
< COMPRESS_METHOD = "gzip"
---
> COMPRESS_LEVEL = 6
> COMPRESS_METHOD = "zlib"

🎶 Step 4: Restore the configuration

atune-adm tuning --restore --project compress

This command restores the configuration that was in place before this tuning run, that is, the compressLevel and compressMethod settings.
