
Switching to the UCX Network

Before enabling UCX, add the UCX-related options to the Ceph configuration file and set the UCX environment variables. If the UCX multi-rail function is required, also set the UCX_MAX_RNDV_RAILS and UCX_MAX_EAGER_RAILS variables.

  1. Check whether the hardware and driver support RoCE. The Mellanox NIC is used as an example.
    lspci | grep Mellanox
    

    If RoCE is available, the command output lists the Mellanox network controllers.

  2. Stop the Ceph service on all server nodes.
    systemctl stop ceph.target
    
  3. Modify the Ceph configuration file on all server and client nodes. Add the following options to the [global] section of /etc/ceph/ceph.conf:
    ms_type = async+ucx
    ms_public_type = async+ucx
    ms_cluster_type = async+ucx
    ms_async_ucx_device=mlx5_0:1,mlx5_1:1
    ms_async_ucx_tls=rc_verbs,self
    ms_async_ucx_max_recv=14
    
    • Run the show_gids command to query the device names. To use multiple network devices, list them in ms_async_ucx_device separated by commas.
    • To enable UCX on the front-end (public) network, set ms_public_type to async+ucx. To enable UCX only on the back-end (cluster) network, set both ms_type and ms_public_type to async+posix and keep ms_cluster_type set to async+ucx.
    • The IP addresses of the cluster network and public network must be the same as those of the UCX devices (RoCE interfaces).
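After editing the file, it can help to review the messenger options actually present. The following is a minimal sketch, not part of the original procedure: the function name show_ucx_messenger_opts is hypothetical, and it only assumes that the options are written one per line as in the example above.

```shell
# Hypothetical helper: print the ms_* messenger options found in a Ceph config file.
# Usage: show_ucx_messenger_opts /etc/ceph/ceph.conf
show_ucx_messenger_opts() {
    conf_file="$1"
    # Select option lines such as "ms_type = async+ucx" or "ms_async_ucx_device=...",
    # then strip leading whitespace and normalize the spacing around "=".
    grep -E '^[[:space:]]*ms_(type|public_type|cluster_type|async_ucx_[a-z_]+)[[:space:]]*=' "$conf_file" |
        sed -E 's/^[[:space:]]*//; s/[[:space:]]*=[[:space:]]*/ = /'
}
```

Running it on every server and client node and comparing the output is one way to spot a node whose settings differ from the rest.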
  4. Add the following information to the /etc/sysconfig/ceph file to configure the UCX environment variables on all server and client nodes:
    UCX_MODULE_DIR=/lib64/ucx
    UCX_RNDV_THRESH=32k
    UCX_MEM_MMAP_HOOK_MODE=none
    UCX_MAX_RNDV_RAILS=4
    UCX_MAX_EAGER_RAILS=4
    UCX_PROTO_ENABLE=y
    
    • To record UCX logs, add the following configurations:
      UCX_LOG_FILE=/var/log/ceph/ucx_%p.log
      UCX_LOG_LEVEL=DEBUG
      
    • UCX_MEM_MMAP_HOOK_MODE can be set to reloc, bistro, or none. If TCMalloc huge pages are enabled, set it to reloc.
    • To use two interfaces at the same time, enable the UCX multi-rail function by setting UCX_MAX_RNDV_RAILS and UCX_MAX_EAGER_RAILS to 2 or larger (value range: 1 to 4). Compared with bond mode, UCX multi-rail balances traffic better and achieves higher network bandwidth.
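The rail settings are easy to mistype, so a small range check may be useful. The sketch below is illustrative only: valid_rail_count and check_ucx_rails are hypothetical names, and the code assumes the variables appear as plain KEY=value lines, as in the example above.

```shell
# Hypothetical helper: succeed only if a rail count is an integer from 1 to 4,
# the range stated for UCX_MAX_RNDV_RAILS and UCX_MAX_EAGER_RAILS.
valid_rail_count() {
    case "$1" in
        [1-4]) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: check both rail settings in an environment file
# such as /etc/sysconfig/ceph and report any value outside the range.
check_ucx_rails() {
    env_file="$1"
    status=0
    for key in UCX_MAX_RNDV_RAILS UCX_MAX_EAGER_RAILS; do
        # If the key appears more than once, this sketch inspects the last assignment.
        value=$(sed -n "s/^${key}=//p" "$env_file" | tail -n 1)
        if valid_rail_count "$value"; then
            echo "$key=$value ok"
        else
            echo "$key=${value:-unset} out of range (1 to 4)"
            status=1
        fi
    done
    return "$status"
}
```

For example, check_ucx_rails /etc/sysconfig/ceph returns a non-zero status if either variable is missing or outside the range.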
  5. Add the following information to the /etc/security/limits.conf file to change the memory limits on all server and client nodes:
    root soft memlock unlimited
    root hard memlock unlimited
    ceph soft memlock unlimited
    ceph hard memlock unlimited
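
To confirm that both the soft and hard entries exist for a user, a check along these lines could be used. It is a sketch under assumptions: has_unlimited_memlock is a hypothetical name, and the code relies only on the four-field layout shown in the entries above.

```shell
# Hypothetical helper: succeed if a limits.conf-style file grants the given user
# both soft and hard "memlock unlimited".
# Usage: has_unlimited_memlock /etc/security/limits.conf ceph
has_unlimited_memlock() {
    limits_file="$1"
    user="$2"
    for kind in soft hard; do
        # Fields are: <domain> <type> <item> <value>; commented lines never match.
        awk -v u="$user" -v k="$kind" '
            $1 == u && $2 == k && $3 == "memlock" && $4 == "unlimited" { found = 1 }
            END { exit found ? 0 : 1 }
        ' "$limits_file" || return 1
    done
    return 0
}
```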
    
  6. Modify the Ceph unit files in systemd. Add the following lines to the [Service] section of ceph-mds@.service, ceph-mgr@.service, ceph-mon@.service, and ceph-osd@.service in /lib/systemd/system/:
    LimitMEMLOCK=infinity
    LimitCORE=infinity
    PrivateDevices=no
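
Because four unit files need the same three lines, a quick scan for missing entries may save time. The following sketch is an assumption-labeled illustration: missing_ucx_unit_directives is a hypothetical name, and the check expects each directive on its own unindented line, exactly as written above.

```shell
# Hypothetical helper: list which of the three directives are missing
# from a systemd unit file; prints nothing when all of them are present.
missing_ucx_unit_directives() {
    unit_file="$1"
    for directive in LimitMEMLOCK=infinity LimitCORE=infinity PrivateDevices=no; do
        # -x matches the whole line, -F treats the directive as a literal string.
        grep -qxF "$directive" "$unit_file" || echo "$directive"
    done
}

# Example loop over the four unit files (left commented out on purpose):
# for svc in mds mgr mon osd; do
#     missing_ucx_unit_directives "/lib/systemd/system/ceph-$svc@.service"
# done
```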
    
  7. Add the following information to the After and Wants fields in ceph-mds@.service, ceph-mgr@.service, ceph-mon@.service, and ceph-osd@.service in /lib/systemd/system/:

  8. In /usr/lib/systemd/system/openibd.service, add the following line so that the system waits 60 seconds after openibd starts:
    ExecStartPost=/bin/sleep 60

  9. Run the following commands on all server and client nodes to raise the locked-memory and open-file limits:
    ulimit -l unlimited
    ulimit -n 1048576
    
  10. Before starting Ceph with UCX, check that all UCX-related packages are installed.

    All four packages must be installed. Otherwise, the OSD service may exit.

    rpm -qa | grep ucx

    Expected result:

  11. Update the configuration and start Ceph.
    systemctl daemon-reload
    systemctl start ceph.target
    

    In heavy-load scenarios, only 256 images are supported.