Installing BoostIO

(Optional) Clearing the Environment

Before installation, check whether BoostIO is already installed in the current environment. If it is, clear the environment as follows to prepare for the new installation.

  1. Collect the IP addresses of the nodes on which you want to install BoostIO.
  2. Provide at least one NVMe SSD for each node in the cluster and set the SSD owner to the current installation user and user group.
    chown user:usergroup /dev/nvmexnx
  3. Add the installation user to the user group to which Ceph belongs.
    usermod -aG ceph_group user
    • At startup, BoostIO reads the Ceph client key and needs permission to do so. Adding the installation user to the Ceph user group grants that permission.
    • Replace the example ceph_group with the actual Ceph user group.
  4. On the ZooKeeper server node, start the ZooKeeper command-line client and run deleteall /cm at its prompt to clear the BoostIO cluster information (ZooKeeper 3.8.1 is used as an example):
    sh /install_path/apache-zookeeper-3.8.1-bin/bin/zkCli.sh
    >>deleteall /cm
  5. Clear the BoostIO drive management metadata.
    dd bs=8k count=1024 if=/dev/zero of=/dev/nvmexnx
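
The amount of data that step 5 overwrites follows directly from the dd arguments: bs=8k with count=1024 writes 1024 blocks of 8 KiB each, that is, 8 MiB of zeros at the start of the drive. The following Python sketch works through that arithmetic and assembles the command as an argument list without running it. The helper names dd_size_bytes and build_wipe_command are illustrative and are not part of BoostIO.

```python
from typing import List

# Illustrative helpers only; nothing here is part of the BoostIO tooling.

# Binary size suffixes as GNU dd interprets them (k = 1024 bytes).
_SUFFIXES = {"k": 1024, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def dd_size_bytes(bs: str, count: int) -> int:
    """Return the number of bytes dd writes for the given bs and count."""
    if bs[-1] in _SUFFIXES:
        block = int(bs[:-1]) * _SUFFIXES[bs[-1]]
    else:
        block = int(bs)
    return block * count


def build_wipe_command(device: str, bs: str = "8k", count: int = 1024) -> List[str]:
    """Assemble the dd command from step 5 as an argument list (dry run)."""
    return ["dd", f"bs={bs}", f"count={count}", "if=/dev/zero", f"of={device}"]


if __name__ == "__main__":
    size = dd_size_bytes("8k", 1024)
    print(size, "bytes =", size // 1024 ** 2, "MiB")
    # /dev/nvmexnx is the placeholder device name used in this document.
    print(" ".join(build_wipe_command("/dev/nvmexnx")))
```

Because the command only overwrites the first 8 MiB, it removes the metadata at the start of the drive rather than erasing the whole device. Check the device path carefully before running the real command, because the overwritten data cannot be recovered.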

Installing BoostIO

  1. Log in to the installation node and upload the installation package BoostIO_{version}_linux-{arch}_release.tar.gz to the /opt directory.
  2. Decompress the installation package. Table 1 shows the directory structure of the installation package.
    tar -xzvf BoostIO_{version}_linux-{arch}_release.tar.gz
    Table 1 Directory structure of the installation package

    | Directory | Folder in the Directory | Description |
    |---|---|---|
    | BoostIO | bin | Executable file. |
    | BoostIO | conf | Configuration file. |
    | BoostIO | lib | Binary dependency library. |
    | BoostIO | include | Header file. |
    | BoostIO | scripts | Tool scripts. |

    The tool scripts used to install BoostIO are stored in the scripts directory. See Table 2.

    Table 2 Scripts

    | Directory | Tool Name | Usage | Execution Method | Parameter |
    |---|---|---|---|---|
    | scripts | hand_out_deploy.py | Installation tool script. | hand_out_deploy.py [option] | install [pkg_path] [user] [group]; uninstall; start_boostio; stop_boostio |
    | scripts | host_ip_list | Communication IP address and drive information configuration file. | Used by hand_out_deploy.py. | - |
    | scripts | install.sh | Installation execution script. | install.sh [option]. Used by hand_out_deploy.py. | install [user] [group] [install_path]; uninstall |
    | scripts | scp_file.sh | scp command execution file. | Used by hand_out_deploy.py. | - |
    | scripts | ssh_cmd.sh | ssh command execution file. | Used by hand_out_deploy.py. | - |
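
After decompression, the package layout from Table 1 can be checked before continuing. The following Python sketch is one possible way to do that. The function name missing_package_dirs is hypothetical, and the package root is passed in explicitly because it depends on where the archive was extracted.

```python
import sys
from pathlib import Path
from typing import List

# Subdirectories that Table 1 lists for the installation package.
EXPECTED_DIRS = ("bin", "conf", "lib", "include", "scripts")


def missing_package_dirs(package_root: str) -> List[str]:
    """Return the Table 1 subdirectories that are absent under package_root."""
    root = Path(package_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Usage: python3 check_layout.py <path to the extracted package directory>
    # Falls back to the current directory when no path is given.
    package_root = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_package_dirs(package_root)
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Package layout matches Table 1.")
```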

  3. Configure the installation information.

    Set the configuration items in the bio.conf file in the conf directory according to your environment and service requirements. Table 3 describes the items.

    Table 3 BoostIO configuration items

    | Module | Configuration Item | Description | Default Value | Value/Range | Remarks |
    |---|---|---|---|---|---|
    | Log | bio.log.level | Log level. | info | debug, info, warn, error | - |
    | Net | bio.net.data.ip_mask | IP address range. | 127.0.0.1/24 | *.*.*.*/#, where * ranges from 0 to 255 and # ranges from 0 to 32. | - |
    | Net | bio.net.data.listen_port | Network communication listening port on the service plane. | 7201 | 7201 to 7800 | - |
    | Net | bio.net.data.protocol | Network protocol. | tcp | rdma, tcp | - |
    | Net | bio.net.rpc.data.busy_polling_mode | Whether to enable busy polling for remote procedure calls (RPC). | false | true, false | Applies only to RDMA. |
    | Net | bio.net.rpc.data.workers_count | Number of worker cores on the RPC data plane. | 4 | 1 to 16 | - |
    | Net | bio.net.request.executor.thread.num | Number of threads for processing requests at the receive end. | 8 | 8 to 256 | - |
    | Net | bio.net.request.executor.queue.size | Depth of the request processing queue at the receive end. | 1024 | 1024 to 65535 | - |
    | Net | bio.net.ipc.data.busy_polling_mode | Whether to enable busy polling for inter-process communication (IPC). | false | true, false | - |
    | Net | bio.net.ipc.data.workers_count | Number of worker cores on the IPC data plane. | 4 | 1 to 128 | - |
    | Net | bio.net.tls.enable.switch | Network security option. | true | true, false | Disabling this option may cause information leakage and spoofing risks. If BoostIO is deployed in separated deployment mode, the value of the enableTls parameter passed to the BoostIO service initialization API must be the same as the value of this configuration item. |
    | Net | bio.net.tls.ca.cert.path | Path to the CA certificate. | /path/CA/cacert.pem | The default value is only an example. | If the security option is enabled, the path must be valid. If the security option is disabled, this item is not parsed. |
    | Net | bio.net.tls.ca.crl.path | Path to the certificate revocation list (CRL) file. | - | - | If the security option is enabled and certificate revocation must be checked, the path must be valid. If the security option is disabled, this item is not parsed. |
    | Net | bio.net.tls.server.cert.path | Path to the certificate file on the server. | /path/server/servercert.pem | The default value is only an example. | If the security option is enabled, the path must be valid. If the security option is disabled, this item is not parsed. |
    | Net | bio.net.tls.server.key.path | Path to the certificate private key file on the server. | /path/server/serverkey.pem | The default value is only an example. | If the security option is enabled, the path must be valid. If the security option is disabled, this item is not parsed. |
    | Net | bio.net.tls.server.key.pass.path | Path to the private key password of the working certificate. | /path/server/server.keypass | The default value is only an example. | If the security option is enabled, the path must be valid. If the security option is disabled, this item is not parsed. |
    | Net | bio.net.hesc.server.kfs.master.path | Path to the root key generated when encrypting the private key of the working certificate. | /path/server/master/kfsa | The default value is only an example. | If the security option is enabled, the path must be valid. If the security option is disabled, this item is not parsed. |
    | Net | bio.net.hesc.server.kfs.pass.standby.path | Path to the standby root key generated when encrypting the private key of the working certificate. | /path/server/standby/kfs | The default value is only an example. | If the security option is enabled, the path must be valid. If the security option is disabled, this item is not parsed. |
    | Cache | bio.cache.qos.enable | Flow control option. | true | false, true | Enabling this option reduces peak performance. Disable it for performance tests. |
    | Cache | bio.data.crc.enable | Data integrity verification option. | false | false, true | Enabling this option increases data read and write latency. Enable it when locating faults. |
    | Cache | bio.segment.size_in_mb | Cache resource granularity. | 4 | 1 to 16 | - |
    | Cache | bio.mem.size_in_gb | Memory capacity used as cache resources. | 50 | 0 to 512 | The value cannot exceed the system memory size. The value 0 indicates that the node does not provide caching. |
    | Cache | bio.disk.path | List of drives used as cache resources. | /dev/sdxx:/dev/sdyy | - | Use colons (:) to separate multiple drive paths. |
    | Cache | bio.rcache.evict_water_level | Eviction watermark of the read cache. | 90 | 0 to 100 | - |
    | Cache | bio.cache.mem_read_write_ratio | Read/write resource ratio of the memory. | 5:5 | 0:10 to 10:0 | - |
    | Cache | bio.cache.disk_read_write_ratio | Read/write resource ratio of the drives. | 5:5 | 0:10 to 10:0 | - |
    | Cache | bio.work.scenein | Application scenario flag. | none | none, bigdata | Optional. The default value is none. none: There is no usage restriction. bigdata: Used for big data scenarios. Compared with AI scenarios, the main difference is that I/Os are forcibly aligned in big data scenarios. |
    | Cache | bio.wcache.evict_water_level | Eviction watermark of the write cache. | 0 | 0 to 100 | Optional. The default value is 0. |
    | Cache | bio.wcache.negotiate.delay | Eviction negotiation delay, in ms. | 100 | 50 to 1000 | Optional. The default value is 100 ms. In scenarios that are sensitive to foreground write performance, set a larger value to delay eviction. In other scenarios, set a smaller value to accelerate eviction. |
    | Cache | bio.trace.enable | Process statistics collection option. | true | false, true | Enabling this option reduces peak performance. Disable it for performance tests. |
    | Underfs | bio.underfs.file_system_type | Back-end storage system type. | ceph | ceph, hdfs | - |
    | Underfs | bio.underfs.ceph.cfg.path | Path to the Ceph configuration file. | /etc/ceph/ceph.conf | This parameter cannot be left empty. | Mandatory when ceph is selected. The value must be an existing path. |
    | Underfs | bio.underfs.ceph.cluster | Ceph cluster name. | ceph | This parameter cannot be left empty. | Mandatory when ceph is selected. |
    | Underfs | bio.underfs.ceph.user | Ceph user. | client.admin | This parameter cannot be left empty. | Mandatory when ceph is selected. |
    | Underfs | bio.underfs.ceph.pool | Ceph data pool. | 0:jfspool0,1:jfspool1 | This parameter cannot be left empty. | Mandatory when ceph is selected. Use commas (,) to separate multiple values. |
    | Underfs | bio.underfs.hdfs.name_node | NameNode of Hadoop. | default:0 | *.*.*.*:#, where * ranges from 0 to 255 and # ranges from 0 to 65535. | Optional. The default value is default:0. The format is IP_address:Port, which indicates the IP address and port specified in the Hadoop configuration file. |
    | Underfs | bio.underfs.hdfs.working_path | Path for storing files in the HDFS system. | /hdfs | A valid path that contains 255 or fewer characters. | Optional. The default value is /hdfs. |
    | CM | bio.cm.initial.nodes_count | Expected number of nodes during cluster initialization. | 2 | 2 to 256 | - |
    | CM | bio.cm.copy_num | Data redundancy (number of copies). | 2 | 2 | The current software version supports only dual copies. |
    | CM | bio.cm.pts_count | Number of partitions. | 16 | 2 to 8192 | - |
    | CM | bio.cm.register_timeout_sec | Timeout duration of the ZooKeeper heartbeat check, in seconds. | 30 | 10 to 60 | - |
    | CM | bio.cm.register_perm_timeout_sec | Time window for determining permanent faults, in seconds. | 60 | 60 to 600 | - |
    | CM | bio.cm.zk_host | ZooKeeper service node information. | Example: 127.0.0.1:2181,127.0.0.2:2181,127.0.0.3:2181 (3-node ZooKeeper cluster) | - | This parameter cannot be left empty. The IP address segment used by ZooKeeper must be the same as the service IP address segment. |
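
Before deployment, the values chosen for bio.conf can be checked against the ranges in Table 3. The following Python sketch shows one way to do that for a subset of the items. It assumes that bio.conf consists of simple key = value lines with # comments, which this document does not confirm, so treat parse_conf as a placeholder for the real file syntax. The function names parse_conf and validate are illustrative.

```python
import ipaddress
from typing import Dict, List

# A subset of Table 3: numeric items with their inclusive ranges.
INT_RANGES = {
    "bio.net.data.listen_port": (7201, 7800),
    "bio.net.rpc.data.workers_count": (1, 16),
    "bio.net.request.executor.thread.num": (8, 256),
    "bio.segment.size_in_mb": (1, 16),
    "bio.mem.size_in_gb": (0, 512),
    "bio.cm.initial.nodes_count": (2, 256),
    "bio.cm.copy_num": (2, 2),
    "bio.cm.pts_count": (2, 8192),
}

# A subset of Table 3: items restricted to a fixed set of values.
CHOICES = {
    "bio.log.level": {"debug", "info", "warn", "error"},
    "bio.net.data.protocol": {"rdma", "tcp"},
    "bio.underfs.file_system_type": {"ceph", "hdfs"},
}


def parse_conf(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines. ASSUMPTION: the real bio.conf syntax may differ."""
    conf = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            conf[key.strip()] = value.strip()
    return conf


def validate(conf: Dict[str, str]) -> List[str]:
    """Return one message for each present item that violates Table 3."""
    errors = []
    for key, (low, high) in INT_RANGES.items():
        if key in conf:
            try:
                in_range = low <= int(conf[key]) <= high
            except ValueError:
                in_range = False
            if not in_range:
                errors.append(f"{key}={conf[key]} is outside {low} to {high}")
    for key, allowed in CHOICES.items():
        if key in conf and conf[key] not in allowed:
            errors.append(f"{key}={conf[key]} is not one of {sorted(allowed)}")
    mask = conf.get("bio.net.data.ip_mask")
    if mask is not None:
        try:
            # strict=False accepts host bits, as in the default 127.0.0.1/24.
            ipaddress.IPv4Network(mask, strict=False)
        except ValueError:
            errors.append(f"bio.net.data.ip_mask={mask} is not a valid IPv4 range")
    return errors


if __name__ == "__main__":
    sample = """
    bio.log.level = info
    bio.net.data.ip_mask = 127.0.0.1/24
    bio.net.data.listen_port = 7100
    bio.cm.copy_num = 2
    """
    for problem in validate(parse_conf(sample)):
        print(problem)
```

In the sample, only the listening port is reported, because 7100 lies below the 7201 to 7800 range. Items that are absent from the file are skipped rather than reported, since BoostIO applies its own defaults.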

  4. Configure the host_ip_list file.
    cd boostio/scripts
    vi host_ip_list

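    Add one line per node. Each line contains the node IP address, the BoostIO communication IP address, and the drive paths. Separate the fields with double colons (::) and separate multiple drive paths with single colons (:):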

    ip1::BoostIO_communication_IP_address_1::Drive_address_1:Drive_address_2
    ip2::BoostIO_communication_IP_address_2::Drive_address_1:Drive_address_2
  5. Set the user and user group to which the drives belong.
    chown user:group Drive_address_1
    chown user:group Drive_address_2
  6. Run the installation script.
    python3 hand_out_deploy.py install [Path to the uncompressed installation package] [user] [group]
    • user indicates the installation user and group indicates the installation user group.
    • On openEuler 20.03, install and configure Python 3 before running the installation script.
    Enter your user name and password as prompted.
    Figure 1 Command output

    After the installation is complete, all BoostIO files and directories are stored in /opt/boostio.
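
The host_ip_list format from step 4 and the chown commands from step 5 can be tied together in a short Python sketch. It splits each line on :: into three fields, splits the last field on : into drive paths, and prints the matching chown commands without running them. The names HostEntry, parse_host_line, and chown_commands are illustrative, the sample addresses and drive paths are placeholders, and IPv4 addresses are assumed because IPv6 addresses would collide with the :: separator.

```python
from typing import List, NamedTuple


class HostEntry(NamedTuple):
    host_ip: str        # first field of the line (ip1, ip2, ...)
    comm_ip: str        # BoostIO communication IP address
    drives: List[str]   # drive paths listed for this node


def parse_host_line(line: str) -> HostEntry:
    """Parse one 'ip::comm_ip::drive1:drive2' line (IPv4 addresses assumed)."""
    fields = line.strip().split("::")
    if len(fields) != 3:
        raise ValueError(f"expected 3 '::'-separated fields, got {len(fields)}")
    host_ip, comm_ip, drive_field = fields
    drives = [drive for drive in drive_field.split(":") if drive]
    return HostEntry(host_ip, comm_ip, drives)


def chown_commands(entry: HostEntry, user: str, group: str) -> List[str]:
    """Build the step 5 chown command for each drive of one node."""
    return [f"chown {user}:{group} {drive}" for drive in entry.drives]


if __name__ == "__main__":
    # Sample addresses and drive paths are placeholders, not recommendations.
    sample = "192.0.2.11::198.51.100.11::/dev/nvme0n1:/dev/nvme1n1"
    entry = parse_host_line(sample)
    for command in chown_commands(entry, "user", "group"):
        print(command)
```

Printing the commands instead of executing them keeps the sketch safe to run on any machine. The printed lines can be reviewed and then run on the corresponding node.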