When the Internet Meets the AI Storage Revolution: 3FS + Kunpeng, Building a High-Performance Data Engine
Notes on adapting and practicing the 3FS storage engine on the Huawei Kunpeng platform
Published on 2025/05/06
Authors: 彭润霖, 包旻晨
1. Deployment Environment
1.1 Node Information
| Node | IP | Memory | NVMe SSD | RDMA | OS |
| --- | --- | --- | --- | --- | --- |
| meta | 192.168.65.10 | 256GB | - | 100GE ConnectX-6 | openEuler 22.03 SP4 |
| fuseclient1 | 192.168.65.11 | 256GB | - | 100GE ConnectX-6 | openEuler 22.03 SP4 |
| fuseclient2 | 192.168.65.12 | 256GB | - | 100GE ConnectX-6 | openEuler 22.03 SP4 |
| fuseclient3 | 192.168.65.13 | 256GB | - | 100GE ConnectX-6 | openEuler 22.03 SP4 |
| storage1 | 192.168.65.14 | 1TB | 3TB*8 | 100GE ConnectX-6 | openEuler 22.03 SP4 |
| storage2 | 192.168.65.15 | 1TB | 3TB*8 | 100GE ConnectX-6 | openEuler 22.03 SP4 |
| storage3 | 192.168.65.16 | 1TB | 3TB*8 | 100GE ConnectX-6 | openEuler 22.03 SP4 |
1.2 Service Information
| Service | Binary | Config files | NodeID | Node |
| --- | --- | --- | --- | --- |
| monitor | monitor_collector_main | monitor_collector_main.toml | - | meta |
| admin_cli | admin_cli | admin_cli.toml, fdb.cluster | - | meta, storage1, storage2, storage3 |
| mgmtd | mgmtd_main | mgmtd_main_launcher.toml, mgmtd_main.toml, mgmtd_main_app.toml, fdb.cluster | 1 | meta |
| meta | meta_main | meta_main_launcher.toml, meta_main.toml, meta_main_app.toml, fdb.cluster | 100 | meta |
| storage | storage_main | storage_main_launcher.toml, storage_main.toml, storage_main_app.toml | 10001~10003 | storage1, storage2, storage3 |
| client | hf3fs_fuse_main | hf3fs_fuse_main_launcher.toml, hf3fs_fuse_main.toml | - | fuseclient1, fuseclient2, fuseclient3 |
2. Environment Preparation
2.1 Disable the Firewall and Set SELinux Mode
**Note: perform on all nodes**
Disable the firewall:
systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld

Set SELinux to permissive mode:
vi /etc/selinux/config
SELINUX=permissive
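The config-file change takes effect after a reboot; to switch the running system to permissive immediately as well, the standard SELinux tools can be used:
setenforce 0
getenforce   # should now print Permissive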

2.2 Configure Hosts
**Note: perform on all nodes**
Add the meta node's IP to /etc/hosts on every node (the rsync commands later in this guide fetch files from the `meta` host by name):
vi /etc/hosts
192.168.65.10 meta
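Optionally, mapping all of the nodes from the table in 1.1 makes it easier to reach any host by name; this is a convenience, not a requirement:
192.168.65.11 fuseclient1
192.168.65.12 fuseclient2
192.168.65.13 fuseclient3
192.168.65.14 storage1
192.168.65.15 storage2
192.168.65.16 storage3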
2.3 Cluster Time Synchronization
**Note: perform on all nodes**
yum install -y ntp ntpdate
Back up the old configuration on all nodes:
cd /etc && mv ntp.conf ntp.conf.bak
**Note: perform on the META node**
vi /etc/ntp.conf
restrict 127.0.0.1
restrict ::1
restrict 192.168.65.10 mask 255.255.255.0
server 127.127.1.0
fudge 127.127.1.0 stratum 8
# Hosts on local network are less restricted.
restrict 192.168.65.10 mask 255.255.255.0 nomodify notrap
Start the ntpd service:
systemctl start ntpd
systemctl enable ntpd
systemctl status ntpd

**Note: perform on all nodes except META**
vi /etc/ntp.conf
server 192.168.65.10
Force a time sync against the server on all nodes except meta. You may need to wait about 5 minutes here, otherwise the command fails with "no server suitable for synchronization found":
ntpdate meta
On all nodes except meta, write the time to the hardware clock so the setting survives a reboot:
hwclock -w
Install and start the crontab tool:
yum install -y crontabs
chkconfig crond on
systemctl start crond
crontab -e
Sync time with the meta node every 10 minutes:
*/10 * * * * /usr/sbin/ntpdate 192.168.65.10
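To confirm that the nodes are actually syncing, you can query the NTP peer status with the standard tool:
ntpq -p   # the meta server should appear with a non-zero "reach" value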
3. Building 3FS
3.1 Set Up the Build Environment
We use the meta node as the build node here.
① Install build dependencies
**Note: perform on all nodes**
# for openEuler
yum install -y cmake libuv-devel lz4-devel xz-devel double-conversion-devel \
  libdwarf libdwarf-devel libunwind libunwind-devel libaio-devel gflags-devel \
  glog glog-devel gtest-devel gmock-devel gperftools gperftools-devel \
  gperftools-libs openssl-devel gcc gcc-c++ boost* libatomic autoconf \
  libevent-devel libibverbs libibverbs-devel cargo numactl-devel lld \
  double-conversion rsync glibc-devel python3-devel meson vim jemalloc
② Install Rust
**Note: perform on the META node**
Download the aarch64 tarball from the official Rust website, upload it to the server, and extract it.

If your server can reach the Rust website directly, you can fetch the tarball with wget:
wget https://static.rust-lang.org/dist/rust-1.85.0-aarch64-unknown-linux-gnu.tar.xz
tar -xvf rust-1.85.0-aarch64-unknown-linux-gnu.tar.xz
cd rust-1.85.0-aarch64-unknown-linux-gnu
sh install.sh
③ Install FoundationDB
**Note: perform on the META node**
Download the FoundationDB packages from the apple/foundationdb GitHub releases page and upload them to the server for installation.

If your server can reach GitHub directly, install the packages with wget or yum:
# Download via wget
wget https://github.com/apple/foundationdb/releases/download/7.3.63/foundationdb-clients-7.3.63-1.el9.aarch64.rpm
wget https://github.com/apple/foundationdb/releases/download/7.3.63/foundationdb-server-7.3.63-1.el9.aarch64.rpm
rpm -ivh foundationdb-clients-7.3.63-1.el9.aarch64.rpm
rpm -ivh foundationdb-server-7.3.63-1.el9.aarch64.rpm
# Or install via yum
yum install https://github.com/apple/foundationdb/releases/download/7.3.63/foundationdb-clients-7.3.63-1.el9.aarch64.rpm -y
yum install https://github.com/apple/foundationdb/releases/download/7.3.63/foundationdb-server-7.3.63-1.el9.aarch64.rpm -y
④ Install the BiSheng compiler
**Note: perform on the META node**
1. Download the 4.2.0 compiler tarball from the BiSheng compiler website, upload it to the server, and extract it. Or fetch the tarball directly with wget:
cd /home
wget https://mirrors.huaweicloud.com/kunpeng/archive/compiler/bisheng_compiler/BiShengCompiler-4.2.0-aarch64-linux.tar.gz
tar -xvf BiShengCompiler-4.2.0-aarch64-linux.tar.gz
2. Temporarily configure the environment variable:
export PATH=/home/BiShengCompiler-4.2.0-aarch64-linux/bin:$PATH
3. Check the compiler version:
 	clang -v
BiSheng Enterprise 4.2.0.B009 clang version 17.0.6 (958fd14d28f0)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/BiShengCompiler-4.2.0-aarch64-linux/bin
Found candidate GCC installation: /usr/lib/gcc/aarch64-linux-gnu/10.3.1
Selected GCC installation: /usr/lib/gcc/aarch64-linux-gnu/10.3.1
Candidate multilib: .;@m64
Selected multilib: .;@m64
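The export above only lasts for the current shell session. To make it persistent you can append it to your shell profile (a sketch, assuming bash):
echo 'export PATH=/home/BiShengCompiler-4.2.0-aarch64-linux/bin:$PATH' >> ~/.bashrc
source ~/.bashrc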
⑤ Install libfuse
**Note: perform on the META node and the Client nodes**
Download libfuse 3.16.1 from the libfuse GitHub releases page:
# Download the package
cd /home
wget https://github.com/libfuse/libfuse/releases/download/fuse-3.16.1/fuse-3.16.1.tar.gz
# Extract the package
tar vzxf fuse-3.16.1.tar.gz
cd fuse-3.16.1
mkdir build
cd build
yum install -y meson
meson setup ..
ninja
ninja install
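ninja install places libfuse under /usr/local; refreshing the dynamic linker cache afterwards is a standard step so binaries can find the new library (the LD_LIBRARY_PATH export in 3.2 serves the same purpose):
ldconfig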
3.2 Fetch the Source and Build
**Note: perform on the META node**
export LD_LIBRARY_PATH=/usr/local/lib64:/usr/local/lib:/usr/local:/usr/lib:/usr/lib64:$LD_LIBRARY_PATH
yum install git -y
cd /home
git clone https://gitee.com/kunpeng_compute/3FS.git
cd 3FS
git checkout origin/openeuler
git submodule update --init --recursive
./patches/apply.sh
cmake -S . -B build -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_EXPORT_COMPILE_COMMANDS=ON
cmake --build build -j
Check the build results:
cd build/bin
ll

4. Deploying 3FS
4.1 Meta Node
① Install ClickHouse
**Note: perform on the META node**
1. Download and install the package:
cd /home
curl -k https://clickhouse.com/ | sh
sudo ./clickhouse install
At the end of the installation you are prompted for a password; assume we enter clickhouse123.
2. Change the ClickHouse default port:
chmod 660 /etc/clickhouse-server/config.xml
vim /etc/clickhouse-server/config.xml
Locate the <tcp_port> tag and change the port to 9123:
...
<tcp_port>9123</tcp_port>
...
3. Start ClickHouse:
clickhouse start
4. Create the metric tables:
clickhouse-client --port 9123 --password 'clickhouse123' -n < /home/3FS/deploy/sql/3fs-monitor.sql
② Update the FoundationDB configuration
**Note: perform on the META node**
1. Update the FoundationDB configuration:
vim /etc/foundationdb/foundationdb.conf
# Locate the public-address entry under [fdbserver] and change it to <local IP>:$ID
# e.g. if the meta node IP is 192.168.65.10, set it to 192.168.65.10:$ID
vim /etc/foundationdb/fdb.cluster
# Change 127.0.0.1:4500 in this file to <local IP>:4500
# e.g. if the meta node IP is 192.168.65.10, set it to 192.168.65.10:4500
2. Restart the FoundationDB service:
systemctl restart foundationdb.service
3. Check the FoundationDB service port:
ss -tuln
# Netid State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
# ...
# tcp LISTEN 0 4096 192.168.65.10:4500 0.0.0.0:*
# ...
③ Start monitor_collector
**Note: perform on the META node**
1. Fetch the binary and config files from the build server (in this guide the build node is the meta node):
mkdir -p /var/log/3fs
mkdir -p /opt/3fs/{bin,etc}
rsync -avz meta:/home/3FS/build/bin/monitor_collector_main /opt/3fs/bin/
rsync -avz meta:/home/3FS/configs/monitor_collector_main.toml /opt/3fs/etc/
rsync -avz meta:/home/3FS/deploy/systemd/monitor_collector_main.service /usr/lib/systemd/system
2. Edit the config file monitor_collector_main.toml:
vim /opt/3fs/etc/monitor_collector_main.toml
...
[server.base.groups.listener]
filter_list = ['enp1s0f0np0'] # fill in your RDMA NIC name; see the query sketch after this block
listen_port = 10000
listen_queue_depth = 4096
rdma_listen_ethernet = true
reuse_port = false
...
[server.monitor_collector.reporter.clickhouse]
db = '3fs'
host = '127.0.0.1'
passwd = 'clickhouse123'
port = '9123'
user = 'default'
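If you are unsure which interface to put in filter_list, the RDMA NIC can usually be identified with standard tooling (a sketch; device and interface names differ per machine):
rdma link show     # list RDMA devices and the netdevs they are bound to
ibdev2netdev       # on Mellanox OFED systems, maps each mlx5_X device to its ethernet interface
ip -br addr show   # confirm which interface carries the node's IP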
3. Start the monitor_collector service:
systemctl start monitor_collector_main
4. Check monitor_collector status:
systemctl status monitor_collector_main
5. Check the ports:
ss -tuln
# Netid      State       Recv-Q      Send-Q            Local Address:Port            Peer Address:Port     Process
# ...
# tcp        LISTEN      0           4096             192.168.65.10:10000                0.0.0.0:*
# ...
Note: there must be exactly one listener on 192.168.65.10:10000 and none on 127.0.0.1:10000, otherwise other services may connect to the wrong endpoint. If multiple port-10000 listeners exist, check whether the filter_list option in monitor_collector_main.toml above has been filled in. The same applies to the services below; we will not repeat this.
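One quick way to check for stray listeners:
ss -tlnp | grep ':10000'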
④ Install the Admin Client
**Note: perform on all nodes**
1. Fetch the binary and config files from the build server:
mkdir -p /var/log/3fs
mkdir -p /opt/3fs/{bin,etc}
rsync -avz meta:/home/3FS/build/bin/admin_cli /opt/3fs/bin
rsync -avz meta:/home/3FS/configs/admin_cli.toml /opt/3fs/etc
rsync -avz meta:/etc/foundationdb/fdb.cluster /opt/3fs/etc
2. Update the config file admin_cli.toml:
vim /opt/3fs/etc/admin_cli.toml
...
cluster_id = "stage"
...
[fdb]
clusterFile = '/opt/3fs/etc/fdb.cluster'
[mgmtd_client]
mgmtd_server_addresses = ["RDMA://192.168.65.10:8000"]
...
3. View the help:
/opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml help
⑤ Start the Mgmtd Service
**Note: perform on the META node**
1. Fetch the binary and config files from the build server:
mkdir -p /var/log/3fs
mkdir -p /opt/3fs/{bin,etc}
rsync -avz meta:/home/3FS/build/bin/mgmtd_main /opt/3fs/bin
rsync -avz meta:/home/3FS/configs/{mgmtd_main.toml,mgmtd_main_launcher.toml,mgmtd_main_app.toml} /opt/3fs/etc
rsync -avz meta:/home/3FS/deploy/systemd/mgmtd_main.service /usr/lib/systemd/system
2. Edit the config file mgmtd_main_app.toml:
vim /opt/3fs/etc/mgmtd_main_app.toml
allow_empty_node_id = true
node_id = 1 # set node_id to 1
3. Edit the config file mgmtd_main_launcher.toml:
vim /opt/3fs/etc/mgmtd_main_launcher.toml
...
cluster_id = "stage"
...
[fdb]
clusterFile = '/opt/3fs/etc/fdb.cluster'
...
4. Edit the config file mgmtd_main.toml:
vim /opt/3fs/etc/mgmtd_main.toml
...
[server.base.groups.listener]
filter_list = ['enp1s0f0np0'] # fill in your RDMA NIC name
...
[server.base.groups.listener]
filter_list = ['enp1s0f0np0'] # fill in your RDMA NIC name
...
[common.monitor.reporters.monitor_collector]
remote_ip = "192.168.65.10:10000" # monitor_collector node IP and port
...
5. Initialize the cluster (the three positional arguments are the chain table ID, chunk size, and stripe size, as echoed in the output):
/opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "init-cluster --mgmtd /opt/3fs/etc/mgmtd_main.toml 1 1048576 16"
# Init filesystem, root directory layout: chain table ChainTableId(1), chunksize 1048576, stripesize 16
#
# Init config for MGMTD version 1
6. Start the service:
systemctl start mgmtd_main
7. Check the service status:
systemctl status mgmtd_main

8. Check the ports:
ss -tuln
# Netid      State       Recv-Q      Send-Q            Local Address:Port            Peer Address:Port     Process
# ...
# tcp        LISTEN      0           4096             192.168.65.10:8000                0.0.0.0:*
# tcp        LISTEN      0           4096             192.168.65.10:9000                0.0.0.0:*
# ...
9. Check the cluster nodes list:
/opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "list-nodes"
# Id     Type     Status               Hostname  Pid      Tags  LastHeartbeatTime    ConfigVersion  ReleaseVersion
# 1      MGMTD    PRIMARY_MGMTD        meta      2281735  []    N/A                  1(UPTODATE)    250228-dev-1-999999-923bdd7c
⑥ Start the Meta Server
**Note: perform on the META node**
1. Fetch the binary and config files from the build server:
mkdir -p /var/log/3fs
mkdir -p /opt/3fs/{bin,etc}
rsync -avz meta:/home/3FS/build/bin/meta_main /opt/3fs/bin
rsync -avz meta:/home/3FS/configs/{meta_main_launcher.toml,meta_main.toml,meta_main_app.toml} /opt/3fs/etc
rsync -avz meta:/home/3FS/deploy/systemd/meta_main.service /usr/lib/systemd/system
2. Update the config file meta_main_app.toml:
vim /opt/3fs/etc/meta_main_app.toml
allow_empty_node_id = true
node_id = 100 # set node_id to 100
3. Update the config file meta_main_launcher.toml:
vim /opt/3fs/etc/meta_main_launcher.toml
...
cluster_id = "stage"
...
[mgmtd_client]
mgmtd_server_addresses = ["RDMA://192.168.65.10:8000"]
...
4. Update the config file meta_main.toml:
vim /opt/3fs/etc/meta_main.toml
...
[server.mgmtd_client]
mgmtd_server_addresses = ["RDMA://192.168.65.10:8000"]
[common.monitor.reporters.monitor_collector]
remote_ip = "192.168.65.10:10000"
...
[server.fdb]
clusterFile = '/opt/3fs/etc/fdb.cluster'
...
[server.base.groups.listener]
filter_list = ['enp1s0f0np0']
listen_port = 8001
...
[server.base.groups.listener]
filter_list = ['enp1s0f0np0']
listen_port = 9001
...
5. Push the Meta node config to the Mgmtd Server:
/opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "set-config --type META --file /opt/3fs/etc/meta_main.toml"
6. Start the service:
systemctl start meta_main
7. Check the service status:
systemctl status meta_main

8. Check the ports:
ss -tuln
# Netid      State       Recv-Q      Send-Q            Local Address:Port            Peer Address:Port     Process
# ...
# tcp        LISTEN      0           4096             192.168.65.10:8001                0.0.0.0:*
# tcp        LISTEN      0           4096             192.168.65.10:9001                0.0.0.0:*
# ...
9. Check the cluster nodes list:
/opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "list-nodes"
# Id     Type     Status               Hostname  Pid      Tags  LastHeartbeatTime    ConfigVersion  ReleaseVersion
# 1      MGMTD    PRIMARY_MGMTD        meta      2281735  []    N/A                  1(UPTODATE)    250228-dev-1-999999-923bdd7c
# 100    META     HEARTBEAT_CONNECTED  meta      2281842  []    2025-03-12 17:01:32  1(UPTODATE)    250228-dev-1-999999-923bdd7c
4.2 Storage Nodes
**Note: perform on the Storage nodes**
① Prepare the SSDs
1. List the disks available for mounting:
lsblk
# NAME                      MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
# ...
# nvme0n1                   259:0    0  2.9T  0 disk
# nvme1n1                   259:1    0  2.9T  0 disk
# nvme2n1                   259:2    0  2.9T  0 disk
# nvme3n1                   259:3    0  2.9T  0 disk
# nvme4n1                   259:4    0  2.9T  0 disk
# nvme5n1                   259:5    0  2.9T  0 disk
# nvme6n1                   259:6    0  2.9T  0 disk
# nvme7n1                   259:7    0  2.9T  0 disk
# ...
2. Create the directories:
mkdir -p /storage/data{0..7}
mkdir -p /var/log/3fs
3. Format and mount the disks.
Note: in our environment the 8 NVMe disks are numbered consecutively from 0 to 7, so the command below works as-is. Adjust the device names to your own environment instead of copying it blindly! (See also the persistence sketch after this step.)
for i in {0..7};do mkfs.xfs -L data${i} /dev/nvme${i}n1;mount -o noatime,nodiratime -L data${i} /storage/data${i};done
mkdir -p /storage/data{0..7}/3fs
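These mounts are not persistent across reboots by themselves. A minimal sketch for persisting them via /etc/fstab, assuming the filesystem labels created above:
for i in {0..7}; do echo "LABEL=data${i} /storage/data${i} xfs noatime,nodiratime 0 0" >> /etc/fstab; done
mount -a   # verify the fstab entries are valid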
4. Check the formatting and mount results:
lsblk
# NAME                      MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
# ...
# nvme0n1                   259:0    0  2.9T  0 disk /storage/data0
# nvme1n1                   259:1    0  2.9T  0 disk /storage/data1
# nvme2n1                   259:2    0  2.9T  0 disk /storage/data2
# nvme3n1                   259:3    0  2.9T  0 disk /storage/data3
# nvme4n1                   259:4    0  2.9T  0 disk /storage/data4
# nvme5n1                   259:5    0  2.9T  0 disk /storage/data5
# nvme6n1                   259:6    0  2.9T  0 disk /storage/data6
# nvme7n1                   259:7    0  2.9T  0 disk /storage/data7
# ...
② Increase the maximum number of aio requests:
sysctl -w fs.aio-max-nr=67108864
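sysctl -w only lasts until the next reboot. To persist the setting, one option is a sysctl.d drop-in (the file name here is our choice, not prescribed by 3FS):
echo "fs.aio-max-nr=67108864" > /etc/sysctl.d/99-3fs-aio.conf
sysctl -p /etc/sysctl.d/99-3fs-aio.conf   # apply and verify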
③ Start the Storage Server service
1. Fetch the binaries and config files from the build server:
mkdir -p /opt/3fs/{bin,etc}
mkdir -p /var/log/3fs
rsync -avz meta:/home/3FS/build/bin/storage_main /opt/3fs/bin
rsync -avz meta:/home/3FS/configs/{storage_main_launcher.toml,storage_main.toml,storage_main_app.toml} /opt/3fs/etc
rsync -avz meta:/home/3FS/deploy/systemd/storage_main.service /usr/lib/systemd/system
rsync -avz meta:/usr/lib64/libfdb_c.so /usr/lib64
2. Update the config file storage_main_app.toml:
vim /opt/3fs/etc/storage_main_app.toml
allow_empty_node_id = true
node_id = 10001           # set node_id
Note that node_id differs across the storage nodes: use 10001 on storage1, 10002 on storage2, and 10003 on storage3.
3. Update the config file storage_main_launcher.toml:
vim /opt/3fs/etc/storage_main_launcher.toml
...
cluster_id = "stage"
...
[mgmtd_client]
mgmtd_server_addresses = ["RDMA://192.168.65.10:8000"]
...
4. Update the config file storage_main.toml:
vim /opt/3fs/etc/storage_main.toml
...
[server.base.groups.listener]
filter_list = ['enp133s0f0np0']
listen_port = 8000
...
[server.base.groups.listener]
filter_list = ['enp133s0f0np0']
listen_port = 9000
...
[server.mgmtd]
mgmtd_server_address = ["RDMA://192.168.65.10:8000"]
...
[common.monitor.reporters.monitor_collector]
remote_ip = "192.168.65.10:10000"
...
[server.targets]
target_paths = ["/storage/data0/3fs","/storage/data1/3fs","/storage/data2/3fs","/storage/data3/3fs","/storage/data4/3fs","/storage/data5/3fs","/storage/data6/3fs","/storage/data7/3fs"]
...
5. Push the Storage node config to the Mgmtd Server:
/opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "set-config --type STORAGE --file /opt/3fs/etc/storage_main.toml"
6. Start the service:
systemctl start storage_main
7. Check the service status:
systemctl status storage_main

8. Check the cluster nodes list. If the Storage nodes are not listed yet, it usually takes a minute or two for them to appear:
/opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "list-nodes"
# Id     Type     Status               Hostname  Pid      Tags  LastHeartbeatTime    ConfigVersion  ReleaseVersion
# 1      MGMTD    PRIMARY_MGMTD        meta      2281735  []    N/A                  1(UPTODATE)    250228-dev-1-999999-923bdd7c
# 100    META     HEARTBEAT_CONNECTED  meta      2281842  []    2025-03-12 17:01:32  1(UPTODATE)    250228-dev-1-999999-923bdd7c
# 10001  STORAGE  HEARTBEAT_CONNECTED  storage1  3294593  []    2025-03-12 17:38:13  1(UPTODATE)    250228-dev-1-999999-923bdd7c
# 10002  STORAGE  HEARTBEAT_CONNECTED  storage2  476286   []    2025-03-12 17:38:12  1(UPTODATE)    250228-dev-1-999999-923bdd7c
# 10003  STORAGE  HEARTBEAT_CONNECTED  storage3  2173767  []    2025-03-12 17:38:12  1(UPTODATE)    250228-dev-1-999999-923bdd7c
4.3 Create the Admin User, Storage Targets, and Chain Table
**Note: perform on the META node**
1. Create the admin user:
/opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "user-add --root --admin 0 root"
# Uid                0
# Name               root
# Token              AAB7sN/h8QBs7/+B2wBQ03Lp(Expired at N/A)
# IsRootUser         true
# IsAdmin            true
# Gid                0
Here AAB7sN/h8QBs7/+B2wBQ03Lp is the token; save it to /opt/3fs/etc/token.txt, e.g. as shown below.
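For example (substitute the token printed by your own user-add command):
echo 'AAB7sN/h8QBs7/+B2wBQ03Lp' > /opt/3fs/etc/token.txt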
2. Install the Python dependencies:
pip3 install -r /home/3FS/deploy/data_placement/requirements.txt
3. Generate the data placement plan:
cd /home
python3 /home/3FS/deploy/data_placement/src/model/data_placement.py \
   -ql -relax -type CR --num_nodes 3 --replication_factor 3 --min_targets_per_disk 6
Key options:
- --num_nodes: number of storage nodes;
- --replication_factor: replication factor;

On success this generates an output/DataPlacementModel-v_* folder in the current directory, e.g. /home/output/DataPlacementModel-v_3-b_6-r_6-k_3-λ_3-lb_3-ub_3
4. Generate the chain table:
python3 /home/3FS/deploy/data_placement/src/setup/gen_chain_table.py \
   --chain_table_type CR --node_id_begin 10001 --node_id_end 10003 \
   --num_disks_per_node 8 --num_targets_per_disk 6 \
   --target_id_prefix 1 --chain_id_prefix 9 \
   --incidence_matrix_path /home/output/DataPlacementModel-v_3-b_6-r_6-k_3-λ_3-lb_3-ub_3/incidence_matrix.pickle
Key options:
- --node_id_begin: first storage NodeID;
- --node_id_end: last storage NodeID;
- --num_disks_per_node: number of disks mounted on each storage node;
- --num_targets_per_disk: number of targets per disk;
- --incidence_matrix_path: path to the file generated in the previous step;

On success, check that the output directory contains the following files (step 5 below also uses create_target_cmd.txt from the same directory):
-rw-r--r--  1 root root  2387 Mar  6 11:55 generated_chains.csv
-rw-r--r--  1 root root   488 Mar  6 11:55 generated_chain_table.csv
-rw-r--r--  1 root root 15984 Mar  6 11:55 remove_target_cmd.txt
5. Create the storage targets:
/opt/3fs/bin/admin_cli --cfg /opt/3fs/etc/admin_cli.toml --config.user_info.token $(<"/opt/3fs/etc/token.txt") < /home/output/create_target_cmd.txt
6. Upload the chains to the mgmtd service:
/opt/3fs/bin/admin_cli --cfg /opt/3fs/etc/admin_cli.toml --config.user_info.token $(<"/opt/3fs/etc/token.txt") "upload-chains /home/output/generated_chains.csv"
7. Upload the chain table to the mgmtd service:
/opt/3fs/bin/admin_cli --cfg /opt/3fs/etc/admin_cli.toml --config.user_info.token $(<"/opt/3fs/etc/token.txt") "upload-chain-table --desc stage 1 /home/output/generated_chain_table.csv"
8. Check list-chains:
/opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "list-chains"
# ChainId    ReferencedBy  ChainVersion  Status   PreferredOrder  Target                           Target                           Target
# 900100001  1             1             SERVING  []              101000300101(SERVING-UPTODATE)  101000200101(SERVING-UPTODATE)  101000100101(SERVING-UPTODATE)
# 900100002  1             1             SERVING  []              101000300102(SERVING-UPTODATE)  101000200102(SERVING-UPTODATE)  101000100102(SERVING-UPTODATE)
# ...
4.4 FUSE Client Nodes
**Note: perform on the FUSE Client nodes**
1. Fetch the binaries and config files from the build server:
mkdir -p /var/log/3fs
mkdir -p /opt/3fs/{bin,etc}
rsync -avz meta:/home/3FS/build/bin/hf3fs_fuse_main /opt/3fs/bin
rsync -avz meta:/home/3FS/build/bin/admin_cli /opt/3fs/bin
rsync -avz meta:/home/3FS/configs/{hf3fs_fuse_main_launcher.toml,hf3fs_fuse_main.toml,hf3fs_fuse_main_app.toml} /opt/3fs/etc
rsync -avz meta:/home/3FS/deploy/systemd/hf3fs_fuse_main.service /usr/lib/systemd/system
rsync -avz meta:/opt/3fs/etc/token.txt /opt/3fs/etc
rsync -avz meta:/usr/lib64/libfdb_c.so /usr/lib64
2. Create the mount point:
mkdir -p /3fs/stage
3. Update the config file hf3fs_fuse_main_launcher.toml:
vim /opt/3fs/etc/hf3fs_fuse_main_launcher.toml
...
cluster_id = "stage"
mountpoint = '/3fs/stage'
token_file = '/opt/3fs/etc/token.txt'
...
[mgmtd_client]
mgmtd_server_addresses = ["RDMA://192.168.65.10:8000"]
...
4. Update the config file hf3fs_fuse_main.toml:
vim /opt/3fs/etc/hf3fs_fuse_main.toml
...
[mgmtd]
mgmtd_server_addresses = ["RDMA://192.168.65.10:8000"]
...
[common.monitor.reporters.monitor_collector]
remote_ip = "192.168.65.10:10000"
...
5. Push the config:
/opt/3fs/bin/admin_cli -cfg /opt/3fs/etc/admin_cli.toml "set-config --type FUSE --file /opt/3fs/etc/hf3fs_fuse_main.toml"
6. Start the service:
systemctl start hf3fs_fuse_main
7. Check the service status:
systemctl status hf3fs_fuse_main

8. Check the mount point:
df -h
# Filesystem                         Size  Used Avail Use% Mounted on
# ...
# hf3fs.stage                         70T  650G   70T   1% /3fs/stage
5. Testing 3FS
Run a concurrent read test against 3FS from the 3 clients with fio:
yum install fio -y
fio -numjobs=128 -fallocate=none -iodepth=2 -ioengine=libaio -direct=1 \
    -rw=read -bs=4M --group_reporting -size=100M -time_based -runtime=3000 \
    -name=2depth_128file_4M_direct_read_bw -directory=/3fs/stage
In this 4M concurrent-read test, each client sustains around 10GB/s of read bandwidth, or roughly 30GB/s aggregate across the three clients.

6. Outlook: 3FS + Kunpeng, a New Paradigm for AI Infrastructure
This adaptation not only validates the maturity of 3FS within the ARM ecosystem, it also points to what is possible across the AI technology stack.
The high performance and open-source nature of 3FS offer a "Chinese solution" for the data engines of the AI era, and we expect the storage momentum behind 3FS to accelerate the build-out of AI infrastructure.
The successful practice of 3FS on the Kunpeng platform demonstrates its potential in high-performance storage and gives AI and big-data workloads a new option. As the technology and its ecosystem mature, 3FS is well placed to excel in more fields and become a reference point in the storage domain.
Follow us for more optimization case studies of 3FS on the Kunpeng platform!


