When the Internet Meets the AI Storage Revolution: How Do 3FS and Kunpeng Build a High-Performance Data Engine? Notes on Adapting and Running the 3FS Storage Engine on the Huawei Kunpeng Platform

Distributed Storage

Published 2025/05/06

Authors: 彭润霖, 包旻晨


1. Deployment Environment

1.1 Node Information

| Node | IP | Memory | NVMe SSD | RDMA | OS |
|---|---|---|---|---|---|
| meta | 192.168.65.10 | 256 GB | - | 100GE ConnectX-6 | openEuler 22.03 SP4 |
| fuseclient1 | 192.168.65.11 | 256 GB | - | 100GE ConnectX-6 | openEuler 22.03 SP4 |
| fuseclient2 | 192.168.65.12 | 256 GB | - | 100GE ConnectX-6 | openEuler 22.03 SP4 |
| fuseclient3 | 192.168.65.13 | 256 GB | - | 100GE ConnectX-6 | openEuler 22.03 SP4 |
| storage1 | 192.168.65.14 | 1 TB | 3 TB × 8 | 100GE ConnectX-6 | openEuler 22.03 SP4 |
| storage2 | 192.168.65.15 | 1 TB | 3 TB × 8 | 100GE ConnectX-6 | openEuler 22.03 SP4 |
| storage3 | 192.168.65.16 | 1 TB | 3 TB × 8 | 100GE ConnectX-6 | openEuler 22.03 SP4 |

1.2 Service Information

| Service | Binary | Config files | NodeID | Node |
|---|---|---|---|---|
| monitor | monitor_collector_main | monitor_collector_main.toml | - | meta |
| admin_cli | admin_cli | admin_cli.toml, fdb.cluster | - | meta, storage1, storage2, storage3 |
| mgmtd | mgmtd_main | mgmtd_main_launcher.toml, mgmtd_main.toml, mgmtd_main_app.toml, fdb.cluster | 1 | meta |
| meta | meta_main | meta_main_launcher.toml, meta_main.toml, meta_main_app.toml, fdb.cluster | 100 | meta |
| storage | storage_main | storage_main_launcher.toml, storage_main.toml, storage_main_app.toml | 10001~10003 | storage1, storage2, storage3 |
| client | hf3fs_fuse_main | hf3fs_fuse_main_launcher.toml, hf3fs_fuse_main.toml | - | fuseclient1, fuseclient2, fuseclient3 |

2. Environment Preparation

2.1 Disable the Firewall and Set the SELinux Mode

**Note: perform these steps on all nodes.**

Disable the firewall:

systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld

Set SELinux to permissive mode:

vi /etc/selinux/config

SELINUX=permissive
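The change in /etc/selinux/config only takes effect after a reboot. To switch the running system to permissive mode immediately, setenforce can be used as well (a small convenience step added here, not in the original procedure):

setenforce 0
getenforce   # should now print: Permissive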

2.2 Configure Hosts

**Note: perform these steps on all nodes.**

Add the IP address of the meta node to /etc/hosts on every node:

vi /etc/hosts
192.168.65.10 meta

2.3 Cluster Time Synchronization

**Note: perform these steps on all nodes.**

yum install -y ntp ntpdate

Back up the old configuration on all nodes:

cd /etc && mv ntp.conf ntp.conf.bak

**Note: perform these steps on the META node.**

vi /etc/ntp.conf
restrict 127.0.0.1
restrict ::1
restrict 192.168.65.10 mask 255.255.255.0
server 127.127.1.0
fudge 127.127.1.0 stratum 8

# Hosts on local network are less restricted.
restrict 192.168.65.10 mask 255.255.255.0 nomodify notrap

Start the ntpd service:

systemctl start ntpd
systemctl enable ntpd
systemctl status ntpd


**Note: perform these steps on all nodes except META.**

vi /etc/ntp.conf
server 192.168.65.10

On every node except meta, force a time sync against the server. You may need to wait about five minutes after starting ntpd, otherwise the command fails with `no server suitable for synchronization found`.

ntpdate meta

On every node except meta, write the time to the hardware clock so it is not lost after a reboot:

hwclock -w

Install and start the crontab tooling:

yum install -y crontabs
chkconfig crond on
systemctl start crond
crontab -e

Sync time with the meta node every 10 minutes:

*/10 * * * * /usr/sbin/ntpdate 192.168.65.10
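To confirm that synchronization is working, the NTP peer status can be queried (a verification step we add here; exact output varies by environment):

ntpq -p   # an asterisk (*) in the first column marks the currently selected server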

3. Building 3FS

3.1 Setting Up the Build Environment

We use the meta node as the build node here.

① Install the build dependencies

**Note: perform these steps on all nodes.**

# for openEuler
yum install cmake libuv-devel lz4-devel xz-devel double-conversion-devel libdwarf libdwarf-devel libunwind libunwind-devel libaio-devel gflags-devel glog glog-devel gtest-devel gmock-devel gperftools-devel gperftools openssl-devel gcc gcc-c++ boost* libatomic autoconf libevent-devel libibverbs libibverbs-devel cargo numactl-devel lld gperftools-devel gperftools double-conversion libibverbs rsync gperftools-libs glibc-devel python3-devel meson vim jemalloc -y

② Install Rust

**Note: perform these steps on the META node.**

Download the aarch64 tarball from the official Rust website, upload it to the server, and extract it.

If the server can reach the Rust website directly, the tarball can be fetched with wget:
wget https://static.rust-lang.org/dist/rust-1.85.0-aarch64-unknown-linux-gnu.tar.xz
tar -xvf rust-1.85.0-aarch64-unknown-linux-gnu.tar.xz
cd rust-1.85.0-aarch64-unknown-linux-gnu
sh install.sh
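As a quick sanity check (our own addition, not part of the original steps), verify the toolchain after installation:

rustc --version   # should report rustc 1.85.0
cargo --version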

③ Install FoundationDB

**Note: perform these steps on the META node.**
Download the FoundationDB packages from the FoundationDB GitHub releases page and upload them to the server for installation.

If the server can reach GitHub directly, the packages can be installed with wget or yum:
# Download via wget
wget https://github.com/apple/foundationdb/releases/download/7.3.63/foundationdb-clients-7.3.63-1.el9.aarch64.rpm
wget https://github.com/apple/foundationdb/releases/download/7.3.63/foundationdb-server-7.3.63-1.el9.aarch64.rpm
rpm -ivh foundationdb-clients-7.3.63-1.el9.aarch64.rpm
rpm -ivh foundationdb-server-7.3.63-1.el9.aarch64.rpm
# Or install via yum
yum install https://github.com/apple/foundationdb/releases/download/7.3.63/foundationdb-clients-7.3.63-1.el9.aarch64.rpm -y
yum install https://github.com/apple/foundationdb/releases/download/7.3.63/foundationdb-server-7.3.63-1.el9.aarch64.rpm -y
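The server package registers a systemd service that starts automatically. Before moving on, the database health can be checked with fdbcli from the client package (a verification step we add here):

systemctl status foundationdb
fdbcli --exec 'status'   # the report should show the database as available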

④ Install the BiSheng compiler

**Note: perform these steps on the META node.**
1. Download the 4.2.0 compiler tarball from the BiSheng compiler website, upload it to the server, and extract it.

Or fetch the tarball directly with wget:
cd /home
wget https://mirrors.huaweicloud.com/kunpeng/archive/compiler/bisheng_compiler/BiShengCompiler-4.2.0-aarch64-linux.tar.gz
tar -xvf BiShengCompiler-4.2.0-aarch64-linux.tar.gz
2. Temporarily configure the environment variable:
export PATH=/home/BiShengCompiler-4.2.0-aarch64-linux/bin:$PATH
3. Check the compiler version:
clang -v
BiSheng Enterprise 4.2.0.B009 clang version 17.0.6 (958fd14d28f0)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/BiShengCompiler-4.2.0-aarch64-linux/bin
Found candidate GCC installation: /usr/lib/gcc/aarch64-linux-gnu/10.3.1
Selected GCC installation: /usr/lib/gcc/aarch64-linux-gnu/10.3.1
Candidate multilib: .;@m64
Selected multilib: .;@m64
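The export above only lasts for the current shell session. To persist it, the line can be appended to a profile script (our own convenience addition; the file name is illustrative):

echo 'export PATH=/home/BiShengCompiler-4.2.0-aarch64-linux/bin:$PATH' >> /etc/profile.d/bisheng.sh
source /etc/profile.d/bisheng.sh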

⑤ Install libfuse

**Note: perform these steps on the META node and the Client nodes.**
Download libfuse 3.16.1 from the libfuse GitHub releases page:
# Download the package
cd /home
wget https://github.com/libfuse/libfuse/releases/download/fuse-3.16.1/fuse-3.16.1.tar.gz
# Extract the package
tar vzxf fuse-3.16.1.tar.gz
cd fuse-3.16.1
mkdir build
cd build
yum install -y meson
meson setup ..
ninja
ninja install
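ninja install places the libraries under /usr/local. Refreshing the linker cache and checking the installed version is a reasonable sanity check (our own addition):

ldconfig
fusermount3 --version   # should report version 3.16.1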

3.2 Fetching the Source and Building

**Note: perform these steps on the META node.**
export LD_LIBRARY_PATH=/usr/local/lib64:/usr/local/lib:/usr/local:/usr/lib:/usr/lib64:$LD_LIBRARY_PATH
yum install git -y
cd /home
git clone https://gitee.com/kunpeng_compute/3FS.git
cd 3FS
git checkout origin/openeuler
git submodule update --init --recursive
./patches/apply.sh
cmake -S . -B build -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
cmake --build build -j
Check the build output:
cd build/bin
ll

4. Deploying 3FS

4.1 Meta Node

① Install ClickHouse

**Note: perform these steps on the META node.**
1. Download and install the package:
cd /home
curl -k https://clickhouse.com/ | sh
sudo ./clickhouse install
The installer prompts for a password when it finishes; assume we enter clickhouse123.
2. Change the default ClickHouse TCP port:
chmod 660 /etc/clickhouse-server/config.xml
vim /etc/clickhouse-server/config.xml
Locate the <tcp_port> tag and change the port number to 9123:
...
<tcp_port>9123</tcp_port>
...
3. Start ClickHouse:
clickhouse start
4. Create the metric tables:
clickhouse-client --port 9123 --password 'clickhouse123' -n < /home/3FS/deploy/sql/3fs-monitor.sql
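To confirm the tables exist, they can be listed (a verification query we add here; the script is expected to create them in the 3fs database, matching the db = '3fs' setting used by the monitor configuration below):

clickhouse-client --port 9123 --password 'clickhouse123' --query 'SHOW TABLES FROM 3fs'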

② Update the FoundationDB configuration

**Note: perform these steps on the META node.**
1. Update the FoundationDB configuration files:
vim /etc/foundationdb/foundationdb.conf
# Locate the public-address setting under [fdbserver] and change it to <local IP>:$ID
# e.g. with a meta node IP of 192.168.65.10, set it to 192.168.65.10:$ID
vim /etc/foundationdb/fdb.cluster
# Change 127.0.0.1:4500 in this file to <local IP>:4500
# e.g. with a meta node IP of 192.168.65.10, set it to 192.168.65.10:4500
2. Restart the FoundationDB service:
systemctl restart foundationdb.service
3. Check the FoundationDB listening port:
ss -tuln
# Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
# ...
# tcp LISTEN 0 4096 192.168.65.10:4500 0.0.0.0:*
# ...

③ Start monitor_collector

**Note: perform these steps on the META node.**

1. Fetch the binary and configuration files from the build server (in this article the build node is the meta node):

mkdir -p /var/log/3fs
mkdir -p /opt/3fs/{bin,etc}
rsync -avz meta:/home/3FS/build/bin/monitor_collector_main /opt/3fs/bin/
rsync -avz meta:/home/3FS/configs/monitor_collector_main.toml /opt/3fs/etc/
rsync -avz meta:/home/3FS/deploy/systemd/monitor_collector_main.service /usr/lib/systemd/system

2. Edit monitor_collector_main.toml:

vim /opt/3fs/etc/monitor_collector_main.toml
...
[server.base.groups.listener]
filter_list = ['enp1s0f0np0'] # fill in your RDMA NIC name (see the lookup below)
listen_port = 10000
listen_queue_depth = 4096
rdma_listen_ethernet = true
reuse_port = false
...
[server.monitor_collector.reporter.clickhouse]
db = '3fs'
host = '127.0.0.1'
passwd = 'clickhouse123'
port = '9123'
user = 'default'
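           The interface in filter_list must be the NIC that carries RDMA traffic (enp1s0f0np0 in our environment; yours will differ). Two ways to look it up (ibdev2netdev ships with the Mellanox/NVIDIA OFED stack and may not be present on every system):

     ip -br link    # list all interfaces and their state
ibdev2netdev   # map RDMA devices (e.g. mlx5_0) to Ethernet interface names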

3. Start the monitor_collector service:

systemctl start monitor_collector_main

4. Check the monitor_collector status:

systemctl status monitor_collector_main

5. Check the listening ports:

ss -tuln
# Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
# ...
# tcp LISTEN 0 4096 192.168.65.10:10000 0.0.0.0:*
# ...

           Note: there must be exactly one listener, 192.168.65.10:10000; a 127.0.0.1:10000 listener must not exist, or other services may connect to the wrong endpoint. If multiple port-10000 listeners appear, check whether the filter_list option in monitor_collector_main.toml above was filled in. The same applies to the services below.

④ Install the Admin Client

**Note: perform these steps on all nodes.**

1. Fetch the binary and configuration files from the build server:

     mkdir -p /var/log/3fs
mkdir -p /opt/3fs/{bin,etc}
rsync -avz meta:/home/3FS/build/bin/admin_cli /opt/3fs/bin
rsync -avz meta:/home/3FS/configs/admin_cli.toml /opt/3fs/etc
rsync -avz meta:/etc/foundationdb/fdb.cluster /opt/3fs/etc

2. Update admin_cli.toml:

     vim /opt/3fs/etc/admin_cli.toml

     ...
cluster_id = "stage"
...
[fdb]
clusterFile = '/opt/3fs/etc/fdb.cluster'

[mgmtd_client]
mgmtd_server_addresses = ["RDMA://192.168.65.10:8000"]
...

3. View the help:

     /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml help

⑤ Start the Mgmtd Service

**Note: perform these steps on the META node.**

1. Fetch the binary and configuration files from the build server:

     mkdir -p /var/log/3fs
mkdir -p /opt/3fs/{bin,etc}
rsync -avz meta:/home/3FS/build/bin/mgmtd_main /opt/3fs/bin
rsync -avz meta:/home/3FS/configs/{mgmtd_main.toml,mgmtd_main_launcher.toml,mgmtd_main_app.toml} /opt/3fs/etc
rsync -avz meta:/home/3FS/deploy/systemd/mgmtd_main.service /usr/lib/systemd/system

2. Edit mgmtd_main_app.toml:

     vim /opt/3fs/etc/mgmtd_main_app.toml

     allow_empty_node_id = true
node_id = 1 # set node_id to 1

3. Edit mgmtd_main_launcher.toml:

     vim /opt/3fs/etc/mgmtd_main_launcher.toml

     ...
cluster_id = "stage"
...
[fdb]
clusterFile = '/opt/3fs/etc/fdb.cluster'
...

4. Edit mgmtd_main.toml:

     vim /opt/3fs/etc/mgmtd_main.toml

     ...
[server.base.groups.listener]
filter_list = ['enp1s0f0np0'] # fill in your RDMA NIC name
...
[server.base.groups.listener]
filter_list = ['enp1s0f0np0'] # fill in your RDMA NIC name
...
[common.monitor.reporters.monitor_collector]
remote_ip = "192.168.65.10:10000" # monitor_collector node IP and port
...

5. Initialize the cluster:

     /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "init-cluster --mgmtd /opt/3fs/etc/mgmtd_main.toml 1 1048576 16"
# Init filesystem, root directory layout: chain table ChainTableId(1), chunksize 1048576, stripesize 16
#
# Init config for MGMTD version 1

6. Start the service:

     systemctl start mgmtd_main

7. Check the service status:

     systemctl status mgmtd_main


8. Check the ports:

     ss -tuln
# Netid      State       Recv-Q      Send-Q            Local Address:Port            Peer Address:Port     Process
# ...
# tcp        LISTEN      0           4096             192.168.65.10:8000                0.0.0.0:*
# tcp        LISTEN      0           4096             192.168.65.10:9000                0.0.0.0:*
# ...

9. Check the cluster node list:

     /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "list-nodes"

# Id     Type     Status               Hostname  Pid      Tags  LastHeartbeatTime    ConfigVersion  ReleaseVersion
# 1      MGMTD    PRIMARY_MGMTD        meta      2281735  []    N/A                  1(UPTODATE)   250228-dev-1-999999-923bdd7c


⑥ Start the Meta Server

**Note: perform these steps on the META node.**

1. Fetch the binary and configuration files from the build server:

     mkdir -p /var/log/3fs
mkdir -p /opt/3fs/{bin,etc}
rsync -avz meta:/home/3FS/build/bin/meta_main /opt/3fs/bin
rsync -avz meta:/home/3FS/configs/{meta_main_launcher.toml,meta_main.toml,meta_main_app.toml} /opt/3fs/etc
rsync -avz meta:/home/3FS/deploy/systemd/meta_main.service /usr/lib/systemd/system

2. Update meta_main_app.toml:

     vim /opt/3fs/etc/meta_main_app.toml

     allow_empty_node_id = true
node_id = 100 # set node_id to 100

3. Update meta_main_launcher.toml:

     vim /opt/3fs/etc/meta_main_launcher.toml

     ...
cluster_id = "stage"
...
[mgmtd_client]
mgmtd_server_addresses = ["RDMA://192.168.65.10:8000"]
...

4. Update meta_main.toml:

     vim /opt/3fs/etc/meta_main.toml

     ...
[server.mgmtd_client]
mgmtd_server_addresses = ["RDMA://192.168.65.10:8000"]

[common.monitor.reporters.monitor_collector]
remote_ip = "192.168.65.10:10000"
...
[server.fdb]
clusterFile = '/opt/3fs/etc/fdb.cluster'
...
[server.base.groups.listener]
filter_list = ['enp1s0f0np0']
listen_port = 8001
...
[server.base.groups.listener]
filter_list = ['enp1s0f0np0']
listen_port = 9001
...

5. Push the meta node configuration to the Mgmtd Server:

     /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "set-config --type META --file /opt/3fs/etc/meta_main.toml"

6. Start the service:

     systemctl start meta_main

7. Check the service status:

     systemctl status meta_main


8. Check the ports:

     ss -tuln
# Netid      State       Recv-Q      Send-Q            Local Address:Port            Peer Address:Port     Process
# ...
# tcp        LISTEN      0           4096             192.168.65.10:8001                0.0.0.0:*
# tcp        LISTEN      0           4096             192.168.65.10:9001                0.0.0.0:*
# ...

9. Check the cluster node list:

     /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "list-nodes"
# Id     Type     Status               Hostname  Pid      Tags  LastHeartbeatTime    ConfigVersion  ReleaseVersion
# 1      MGMTD    PRIMARY_MGMTD        meta      2281735  []    N/A                  1(UPTODATE)   250228-dev-1-999999-923bdd7c
# 100    META     HEARTBEAT_CONNECTED  meta      2281842  []    2025-03-12 17:01:32  1(UPTODATE)   250228-dev-1-999999-923bdd7c

4.2 Storage Nodes

**Note: perform these steps on the Storage nodes.**

① Prepare the SSDs

1. List the disks available for mounting:

     lsblk
# NAME                      MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
# ...
# nvme0n1                   259:0    0  2.9T  0 disk
# nvme1n1                   259:1    0  2.9T  0 disk
# nvme2n1                   259:2    0  2.9T  0 disk
# nvme3n1                   259:3    0  2.9T  0 disk
# nvme4n1                   259:4    0  2.9T  0 disk
# nvme5n1                   259:5    0  2.9T  0 disk
# nvme6n1                   259:6    0  2.9T  0 disk
# nvme7n1                   259:7    0  2.9T  0 disk
# ...

2. Create the directories:

     mkdir -p /storage/data{0..7}
mkdir -p /var/log/3fs

3. Format and mount the disks:

           Note: in our environment the eight NVMe disks are numbered consecutively from 0 to 7, so the command below works as-is. Adapt the device names to your own environment rather than copying it blindly!

     for i in {0..7};do mkfs.xfs -L data${i} /dev/nvme${i}n1;mount -o noatime,nodiratime -L data${i} /storage/data${i};done
mkdir -p /storage/data{0..7}/3fs

4. Verify the formatting and mounts:

     lsblk
# NAME                      MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
# ...
# nvme0n1                   259:0    0  2.9T  0 disk /storage/data0
# nvme1n1                   259:1    0  2.9T  0 disk /storage/data1
# nvme2n1                   259:2    0  2.9T  0 disk /storage/data2
# nvme3n1                   259:3    0  2.9T  0 disk /storage/data3
# nvme4n1                   259:4    0  2.9T  0 disk /storage/data4
# nvme5n1                   259:5    0  2.9T  0 disk /storage/data5
# nvme6n1                   259:6    0  2.9T  0 disk /storage/data6
# nvme7n1                   259:7    0  2.9T  0 disk /storage/data7
# ...
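           The mounts above do not survive a reboot. A sketch for persisting them in /etc/fstab, reusing the labels created by mkfs.xfs above (adjust to your environment before use):

     for i in {0..7}; do echo "LABEL=data${i} /storage/data${i} xfs noatime,nodiratime 0 0" >> /etc/fstab; done
mount -a   # no errors means the fstab entries parse and mount cleanly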

② Increase the maximum number of AIO requests

sysctl -w fs.aio-max-nr=67108864
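sysctl -w only changes the running kernel. To keep the setting across reboots, it can be written to a sysctl drop-in (our own addition; the file name is illustrative):

echo 'fs.aio-max-nr = 67108864' > /etc/sysctl.d/99-3fs-aio.conf
sysctl -p /etc/sysctl.d/99-3fs-aio.conf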

③ Start the Storage Server service

1. Fetch the binary and configuration files from the build server:

     mkdir -p /opt/3fs/{bin,etc}
mkdir -p /var/log/3fs
rsync -avz meta:/home/3FS/build/bin/storage_main /opt/3fs/bin
rsync -avz meta:/home/3FS/configs/{storage_main_launcher.toml,storage_main.toml,storage_main_app.toml} /opt/3fs/etc
rsync -avz meta:/home/3FS/deploy/systemd/storage_main.service /usr/lib/systemd/system
rsync -avz meta:/usr/lib64/libfdb_c.so /usr/lib64

2. Update storage_main_app.toml:

     vim /opt/3fs/etc/storage_main_app.toml

     allow_empty_node_id = true
node_id = 10001           # set node_id; note that each storage node uses a different node_id (10001~10003)

3. Update storage_main_launcher.toml:

     vim /opt/3fs/etc/storage_main_launcher.toml

     ...
cluster_id = "stage"
...
[mgmtd_client]
mgmtd_server_addresses = ["RDMA://192.168.65.10:8000"]
...

4. Update storage_main.toml:

     vim /opt/3fs/etc/storage_main.toml

     ...
[server.base.groups.listener]
filter_list = ['enp133s0f0np0']
listen_port = 8000
...
[server.base.groups.listener]
filter_list = ['enp133s0f0np0']
listen_port = 9000
...
[server.mgmtd]
mgmtd_server_address = ["RDMA://192.168.65.10:8000"]
...
[common.monitor.reporters.monitor_collector]
remote_ip = "192.168.65.10:10000"
...
[server.targets]
target_paths = ["/storage/data0/3fs","/storage/data1/3fs","/storage/data2/3fs","/storage/data3/3fs","/storage/data4/3fs","/storage/data5/3fs","/storage/data6/3fs","/storage/data7/3fs"]
...

5. Push the storage node configuration to the Mgmtd Server:

     /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "set-config --type STORAGE --file /opt/3fs/etc/storage_main.toml"

6. Start the service:

     systemctl start storage_main

7. Check the service status:

     systemctl status storage_main



8. Check the cluster node list:

           If the storage nodes are not listed yet, wait a minute or two.

     /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "list-nodes"
# Id     Type     Status               Hostname  Pid      Tags  LastHeartbeatTime    ConfigVersion  ReleaseVersion
# 1      MGMTD    PRIMARY_MGMTD        meta      2281735  []    N/A                  1(UPTODATE)   250228-dev-1-999999-923bdd7c
# 100    META     HEARTBEAT_CONNECTED  meta      2281842  []    2025-03-12 17:01:32  1(UPTODATE)   250228-dev-1-999999-923bdd7c
# 10001  STORAGE  HEARTBEAT_CONNECTED  storage1  3294593  []    2025-03-12 17:38:13  1(UPTODATE)   250228-dev-1-999999-923bdd7c
# 10002  STORAGE  HEARTBEAT_CONNECTED  storage2  476286   []    2025-03-12 17:38:12  1(UPTODATE)   250228-dev-1-999999-923bdd7c
# 10003  STORAGE  HEARTBEAT_CONNECTED  storage3  2173767  []    2025-03-12 17:38:12  1(UPTODATE)   250228-dev-1-999999-923bdd7c

4.3 Create the Admin User, Storage Targets, and Chain Table

**Note: perform these steps on the META node.**

1. Create the admin user:

     /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "user-add --root --admin 0 root"
# Uid                0
# Name               root
# Token              AAB7sN/h8QBs7/+B2wBQ03Lp(Expired at N/A)
# IsRootUser         true
# IsAdmin            true
# Gid                0

           Here AAB7sN/h8QBs7/+B2wBQ03Lp is the token; save it to /opt/3fs/etc/token.txt, for example as shown below.
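           A minimal example, using the sample token above (substitute the one from your own user-add output):

     echo "AAB7sN/h8QBs7/+B2wBQ03Lp" > /opt/3fs/etc/token.txt
chmod 600 /opt/3fs/etc/token.txt   # the token grants admin access, so restrict its permissions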

2. Install the Python dependencies:

     pip3 install -r /home/3FS/deploy/data_placement/requirements.txt

3. Generate the data placement model:

     cd /home
python3 /home/3FS/deploy/data_placement/src/model/data_placement.py \
   -ql -relax -type CR --num_nodes 3 --replication_factor 3 --min_targets_per_disk 6

           The following options matter here:

  • --num_nodes: number of storage nodes;
  • --replication_factor: replication factor;

           On success, an output/DataPlacementModel-v_* folder is created under the current directory, e.g. /home/output/DataPlacementModel-v_3-b_6-r_6-k_3-λ_3-lb_3-ub_3

4. Generate the chain table:

     python3 /home/3FS/deploy/data_placement/src/setup/gen_chain_table.py \
   --chain_table_type CR --node_id_begin 10001 --node_id_end 10003 \
   --num_disks_per_node 8 --num_targets_per_disk 6 \
   --target_id_prefix 1 --chain_id_prefix 9 \
   --incidence_matrix_path /home/output/DataPlacementModel-v_3-b_6-r_6-k_3-λ_3-lb_3-ub_3/incidence_matrix.pickle

           The following options matter here:

  • --node_id_begin: first storage node NodeID;
  • --node_id_end: last storage node NodeID;
  • --num_disks_per_node: number of disks mounted on each storage node;
  • --num_targets_per_disk: number of targets on each mounted disk;
  • --incidence_matrix_path: path to the file generated in the previous step;

           After it succeeds, check that the following files have been produced in the output directory:

     -rw-r--r--  1 root root  2387 Mar  6 11:55 generated_chains.csv
-rw-r--r--  1 root root   488 Mar  6 11:55 generated_chain_table.csv
-rw-r--r--  1 root root 15984 Mar  6 11:55 remove_target_cmd.txt

5. Create the storage targets:

     /opt/3fs/bin/admin_cli --cfg /opt/3fs/etc/admin_cli.toml --config.user_info.token $(<"/opt/3fs/etc/token.txt") < /home/output/create_target_cmd.txt

6. Upload the chains to the mgmtd service:

     /opt/3fs/bin/admin_cli --cfg /opt/3fs/etc/admin_cli.toml --config.user_info.token $(<"/opt/3fs/etc/token.txt") "upload-chains /home/output/generated_chains.csv"

7. Upload the chain table to the mgmtd service:

     /opt/3fs/bin/admin_cli --cfg /opt/3fs/etc/admin_cli.toml --config.user_info.token $(<"/opt/3fs/etc/token.txt") "upload-chain-table --desc stage 1 /home/output/generated_chain_table.csv"

8. List the chains:

     /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "list-chains"
# ChainId    ReferencedBy  ChainVersion  Status   PreferredOrder  Target                          Target   Target
# 900100001  1             1             SERVING  []              101000300101(SERVING-UPTODATE)  101000200101(SERVING-UPTODATE)  101000100101(SERVING-UPTODATE)
# 900100002  1             1             SERVING  []              101000300102(SERVING-UPTODATE)  101000200102(SERVING-UPTODATE)  101000100102(SERVING-UPTODATE)
# ...

4.4 FUSE Client Nodes

**Note: perform these steps on the FUSE Client nodes.**

1. Fetch the binary and configuration files from the build server:

     mkdir -p /var/log/3fs
mkdir -p /opt/3fs/{bin,etc}
rsync -avz meta:/home/3FS/build/bin/hf3fs_fuse_main /opt/3fs/bin
rsync -avz meta:/home/3FS/build/bin/admin_cli /opt/3fs/bin
rsync -avz meta:/home/3FS/configs/{hf3fs_fuse_main_launcher.toml,hf3fs_fuse_main.toml,hf3fs_fuse_main_app.toml} /opt/3fs/etc
rsync -avz meta:/home/3FS/deploy/systemd/hf3fs_fuse_main.service /usr/lib/systemd/system
rsync -avz meta:/opt/3fs/etc/token.txt /opt/3fs/etc
rsync -avz meta:/usr/lib64/libfdb_c.so /usr/lib64

2. Create the mount point:

     mkdir -p /3fs/stage

3. Update hf3fs_fuse_main_launcher.toml:

     vim /opt/3fs/etc/hf3fs_fuse_main_launcher.toml

     ...
cluster_id = "stage"
mountpoint = '/3fs/stage'
token_file = '/opt/3fs/etc/token.txt'
...
[mgmtd_client]
mgmtd_server_addresses = ["RDMA://192.168.65.10:8000"]
...

4. Update hf3fs_fuse_main.toml:

     vim /opt/3fs/etc/hf3fs_fuse_main.toml

     ...
[mgmtd]
mgmtd_server_addresses = ["RDMA://192.168.65.10:8000"]
...
[common.monitor.reporters.monitor_collector]
remote_ip = "192.168.65.10:10000"
...

5. Push the configuration:

     /opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "set-config --type FUSE --file /opt/3fs/etc/hf3fs_fuse_main.toml"

6. Start the service:

     systemctl start hf3fs_fuse_main

7. Check the service status:

     systemctl status hf3fs_fuse_main


8. Check the mount point:

     df -h
# Filesystem                         Size  Used Avail Use% Mounted on
# ...
# hf3fs.stage                         70T  650G   70T   1% /3fs/stage
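           Before benchmarking, a quick end-to-end sanity check can be run through the mount (our own addition):

     echo "hello 3fs" > /3fs/stage/healthcheck.txt
cat /3fs/stage/healthcheck.txt   # should print: hello 3fs
rm /3fs/stage/healthcheck.txt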

5. Testing 3FS

Run a concurrent read test against 3FS with fio from the three clients:

yum install fio -y
fio -numjobs=128 -fallocate=none -iodepth=2 -ioengine=libaio -direct=1 \
    -rw=read -bs=4M --group_reporting -size=100M -time_based -runtime=3000 \
    -name=2depth_128file_4M_direct_read_bw -directory=/3fs/stage

In this 4M concurrent-read test, each client reaches 10 GB/s of read bandwidth.

6. Outlook: 3FS + Kunpeng, a New Paradigm for AI Infrastructure

This adaptation not only validates the maturity of 3FS in the ARM ecosystem, but also points to the broad possibilities of the AI technology stack.

"The high performance and open-source nature of 3FS offer a 'Chinese solution' for data engines in the AI era." We expect this storage revolution led by 3FS to accelerate the comprehensive rise of AI infrastructure.

The successful practice of 3FS on the Kunpeng platform demonstrates its potential in high-performance storage and offers a new option for AI and big-data scenarios. As the technology advances and the ecosystem matures, 3FS is well placed to shine in more fields and become a benchmark in storage.

Keep following us for more 3FS optimization and practice cases on the Kunpeng platform!
