Switching to the UCX Network
Before enabling UCX, you need to configure the UCX parameters and configuration files on all nodes.
- Enter a Ceph cluster container.

  cephadm shell

- Configure UCX parameters.
  ceph config set global ms_type async+ucx
  ceph config set global ms_public_type async+ucx
  ceph config set global ms_cluster_type async+ucx
  ceph config set global ms_async_ucx_device mlx5_bond_0:1
  ceph config set global ms_async_ucx_tls rc_verbs,self
  ceph config set global ms_async_rdma_polling_us 6000000
  ceph config set global ms_async_ucx_zerocopy true
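The seven settings above can also be applied in one loop. This is a minimal sketch: the dry-run fallback (a plain echo when no `ceph` binary is found, e.g. outside the container) is an assumption added for rehearsal, not part of the original procedure.

```shell
#!/bin/sh
# Apply the UCX-related options from the step above in one loop.
# Assumption: fall back to echoing the command when `ceph` is absent,
# so the loop can be dry-run outside the cluster container.
if ! command -v ceph >/dev/null 2>&1; then
  ceph() { echo "ceph $*"; }
fi

apply_ucx_settings() {
  for kv in \
    "ms_type async+ucx" \
    "ms_public_type async+ucx" \
    "ms_cluster_type async+ucx" \
    "ms_async_ucx_device mlx5_bond_0:1" \
    "ms_async_ucx_tls rc_verbs,self" \
    "ms_async_rdma_polling_us 6000000" \
    "ms_async_ucx_zerocopy true"
  do
    # Intentional word splitting: each entry is "<option> <value>".
    ceph config set global $kv
  done
}

apply_ucx_settings
```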
- Add the following configuration to the MON configuration file (/var/lib/ceph/fsid/mon*/config) on all nodes (ceph1, ceph2, and ceph3):

  ms_type = async+ucx
  ms_public_type = async+ucx
  ms_cluster_type = async+ucx
  ms_async_ucx_device = mlx5_bond_0:1
  ms_async_ucx_tls = rc_verbs,self
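The edit above has to be repeated on every node; a sketch like the following can append the options to each MON config file under one root. The CEPH_ROOT override is a hypothetical parameter added so the logic can be rehearsed on a scratch tree; on a real node it defaults to /var/lib/ceph and the script is run once per node.

```shell
#!/bin/sh
# Append the UCX options from the step above to every MON config file.
# CEPH_ROOT is an assumed override for rehearsal on a scratch tree.
CEPH_ROOT="${CEPH_ROOT:-/var/lib/ceph}"

append_ucx_mon_options() {
  for cfg in "$CEPH_ROOT"/*/mon*/config; do
    [ -f "$cfg" ] || continue
    cat >> "$cfg" <<'EOF'
ms_type = async+ucx
ms_public_type = async+ucx
ms_cluster_type = async+ucx
ms_async_ucx_device = mlx5_bond_0:1
ms_async_ucx_tls = rc_verbs,self
EOF
  done
}

append_ucx_mon_options
```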
- You can run the show_gids command to query UCX device names; multiple network devices can be specified in ms_async_ucx_device.
- In the current environment, the public network uses POSIX, and the cluster network uses UCX. To enable UCX on the front-end network, set ms_public_type to async+ucx.
- The IP addresses of the cluster network and public network must be the same as those of the UCX devices (RoCE/IB interfaces).
- If ms_async_ucx_event_polling is set to true, event polling is enabled.
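To go from show_gids output to candidate ms_async_ucx_device values, a small filter can print the unique "device:port" pairs. This is a sketch only: the assumed column layout (DEV PORT INDEX GID IPv4 VER NDEV, preceded by two header lines) matches common Mellanox builds of the script, but verify it against your local output first.

```shell
#!/bin/sh
# Print unique "<device>:<port>" pairs from show_gids output.
# Assumption: two header lines, then whitespace-separated columns
# whose first two fields are the device name and port number.
gids_to_ucx_devices() {
  awk 'NR > 2 && NF >= 2 { seen[$1 ":" $2] = 1 }
       END { for (d in seen) print d }'
}
```

Typical use on a node: `show_gids | gids_to_ucx_devices`.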
- Run the following command to synchronize the modifications to the MGRs and OSDs on all nodes:

  ls /var/lib/ceph/*/*/config | grep 'osd\|mgr' | xargs -I {} cp -r /var/lib/ceph/*/mon.*/config {}

- Modify the service files.

  sed -i 's/on-failure/always/g' /etc/systemd/system/ceph-*\@.service
  sed -i 's/30min/1min/g' /etc/systemd/system/ceph-*\@.service
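The two maintenance steps above (copying the MON config over the OSD/MGR configs, then patching the systemd unit files) can be sketched as explicit functions so they can be rehearsed on scratch files first. CEPH_ROOT and the explicit unit-file argument are hypothetical parameters added for that purpose.

```shell
#!/bin/sh
# Sketch of the sync and unit-patch steps above, parameterized for
# rehearsal. Assumption: CEPH_ROOT overrides /var/lib/ceph, and the
# unit file to patch is passed as an argument.
CEPH_ROOT="${CEPH_ROOT:-/var/lib/ceph}"

# Copy the first MON config found over every OSD and MGR config.
sync_mon_config() {
  src=$(ls "$CEPH_ROOT"/*/mon.*/config 2>/dev/null | head -n 1)
  [ -n "$src" ] || return 1
  for dst in "$CEPH_ROOT"/*/osd.*/config "$CEPH_ROOT"/*/mgr.*/config; do
    [ -f "$dst" ] && cp "$src" "$dst"
  done
  return 0
}

# Apply the same two substitutions as the sed commands above to one file.
patch_unit_file() {
  sed -i 's/on-failure/always/g; s/30min/1min/g' "$1"
}
```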
- Restart the Ceph cluster.

  systemctl daemon-reload
  systemctl restart ceph.target
- After all containers are started, enter the Ceph cluster container again to check the cluster status.

  cephadm shell
  ceph -s
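For scripted rollouts, the status check above can be wrapped in a polling loop that waits for the cluster to report HEALTH_OK. A sketch, with assumptions: CEPH_CMD is a hypothetical override (defaulting to `ceph`) so the loop can be exercised against a stub, and while HEALTH_OK is the standard healthy-state string in `ceph -s` output, the surrounding format varies across Ceph versions.

```shell
#!/bin/sh
# Poll cluster status until it reports HEALTH_OK, or give up.
# Assumption: CEPH_CMD may point at a stub for rehearsal; the real
# check is `ceph -s` inside the cluster container.
wait_health_ok() {
  tries="${1:-24}"                      # ~2 minutes at 5 s per attempt
  while [ "$tries" -gt 0 ]; do
    if "${CEPH_CMD:-ceph}" -s 2>/dev/null | grep -q 'HEALTH_OK'; then
      return 0
    fi
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] && sleep 5
  done
  return 1
}
```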
Parent topic: RDMA Network Acceleration Feature Guide (Container)