Installing BoostIO
- Deploy BoostIO in separated, converged, or independent mode. In converged mode, use the upper-layer component user account (for example, juiceadmin:juicegroup) to install both the BoostIO server and SDK. In separated and independent modes, create a server user account (for example, bioadmin:biogroup) to install the BoostIO server, and use the upper-layer component user account to install the BoostIO SDK. Do not perform installation operations as the root user, because doing so poses security risks.
- Communication with Ceph and HDFS is configured by the user. Use secure communication links to ensure communication security.
Creating a BoostIO Server Running User
- The GID of biogroup is 1000.
- The UID of bioadmin is 9000.
- The password of user bioadmin must meet the following complexity requirements:
- Contain at least eight characters.
- Contain at least three of the following character types:
- Lowercase letters
- Uppercase letters
- Digits
- Special characters: spaces and `~!@#$%^&*()-_=+\|[{}];:'",<.>/?
- The password must be different from the account name.
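As a quick illustration of the rules above, a small shell check can pre-validate a candidate password. A minimal sketch only: check_pw is our hypothetical helper (not part of BoostIO), it approximates "special characters" as any non-alphanumeric character, and it hardcodes bioadmin as the account name to compare against.

```shell
# Hypothetical helper: returns 0 if the candidate password meets the
# complexity rules listed above, nonzero otherwise.
check_pw() {
  pw=$1
  classes=0
  # Contain at least eight characters.
  [ ${#pw} -ge 8 ] || return 1
  # Count character types present: lowercase, uppercase, digits, specials
  # (approximated here as "any non-alphanumeric character").
  case $pw in *[a-z]*) classes=$((classes + 1));; esac
  case $pw in *[A-Z]*) classes=$((classes + 1));; esac
  case $pw in *[0-9]*) classes=$((classes + 1));; esac
  case $pw in *[!a-zA-Z0-9]*) classes=$((classes + 1));; esac
  # Contain at least three of the four character types.
  [ $classes -ge 3 ] || return 1
  # The password must be different from the account name.
  [ "$pw" != "bioadmin" ]
}
```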
Run the following commands on the node where BoostIO is installed to create a user account.
- Create user group biogroup.
groupadd -g 1000 biogroup
- Create user bioadmin in user group biogroup.
useradd -g 1000 -d /home/bioadmin -u 9000 -m -s /bin/bash bioadmin
(Optional) Cleaning Up the Environment
- Before the installation, ensure that BoostIO is not installed in the current environment. If BoostIO is installed, clear the environment to prepare for the new installation.
- You are advised to delete unused log files from the SDK client in a timely manner to prevent drive space exhaustion.
- The maximum size of a statistics file is 10 MB on the SDK client and 50 MB on the server. Statistics are collected cyclically. After BoostIO is redeployed and started, a new statistics file is generated. You are advised to clear the old statistics file.
- Collect the IP addresses of the nodes on which you want to install BoostIO.
- Provide at least one NVMe SSD for each node in the cluster and set the SSD owner to the current installation user and user group.
chown [Server_installation_user:Server_installation_user_group] /dev/nvmexnx
- During the initial installation, you need to create the following directories and configure permissions for them:
Table 1 Directories and permissions

| Directory | User and User Group | Permission | Description |
| --- | --- | --- | --- |
| /opt/boostio | Server_installation_user:Server_installation_user_group | 750 | BoostIO installation directory. |
| /var/log/boostio | Server_installation_user:Server_installation_user_group | 750 | BoostIO server log directory. |
| /var/log/boostio/trace | Server_installation_user:Server_installation_user_group | 750 | BoostIO statistics log directory. |
| /home/ip (this IP address is the same as that in the host_ip_list file on each node) | Server_installation_user:Server_installation_user_group | 750 | Directory for storing temporary files during BoostIO installation. The files are automatically deleted after the installation is complete. |
| /var/log/jfs | SDK_installation_user:SDK_installation_user_group | 750 | BoostIO SDK client log directory. |
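The directories in Table 1 can be created up front with a short script. A minimal sketch only: it uses a scratch PREFIX so it can run without root, and the chown line is commented out — on a real node, drop PREFIX, run as a privileged user, and substitute the actual installation user and group. The /home/ip directory is omitted here because its name depends on your host_ip_list.

```shell
# Create the BoostIO directories from Table 1 with 750 permissions.
PREFIX=$(mktemp -d)   # scratch root for a dry run; use the real / on a node
for d in /opt/boostio /var/log/boostio /var/log/boostio/trace /var/log/jfs; do
  mkdir -p "$PREFIX$d"
  chmod 750 "$PREFIX$d"
  # chown Server_installation_user:Server_installation_user_group "$PREFIX$d"
  # (note: /var/log/jfs belongs to the SDK installation user instead)
done
```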
- Configure the Ceph keyring permissions.
- Run the following command on the ZooKeeper server node (ZooKeeper 3.8.1 is used as an example) to clear the BoostIO cluster information. Add the SO file required by the ZooKeeper client to the {BoostIO_Home}/lib directory, change its owner to Server_installation_user:Server_installation_user_group, and set its permissions to 550.
sh /install_path/apache-zookeeper-3.8.1-bin/bin/zkCli.sh
Then run the following command at the zkCli prompt:
deleteall /cm
- Clear the BoostIO drive management metadata.
dd bs=8k count=1024 if=/dev/zero of=/dev/nvmexnx
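The dd command above zeroes the first 8 MiB (8 KiB blocks x 1024) of the drive, which holds the drive management metadata. The same invocation can be tried safely against a scratch file first; TARGET below is a stand-in for the real /dev/nvmexnx device.

```shell
# Wipe the first 8 MiB of TARGET with zeros (8 KiB blocks x 1024).
# TARGET is a scratch file here; point it at /dev/nvmexnx on a real node.
TARGET=$(mktemp)
dd bs=8k count=1024 if=/dev/zero of="$TARGET" status=none
ls -l "$TARGET"   # the file is now exactly 8388608 bytes
```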
Installing BoostIO
- Log in to the installation node and upload the ubs_io-boostio-1.0.0-1.{OS_version}.aarch64.rpm software package to any available directory.
- Install the software package.
rpm -ivh --nodeps ubs_io-boostio-1.0.0-1.{OS_version}.aarch64.rpm
After the installation, the following files are generated in the /home directory: BoostIO_{version}_Linux-{arch}_release.tar.gz and the boostio directory extracted from the TAR package. For details about the directory structure, see Table 2. The tool scripts used to install BoostIO are stored in the scripts directory. See Table 3.
Table 3 Files in the scripts directory

| Directory | Tool Name | Usage | Execution Method | Parameter |
| --- | --- | --- | --- | --- |
| scripts | hand_out_deploy.py | Installation tool script. | hand_out_deploy.py [option] | install [pkg_path] [user] [group]; uninstall |
| scripts | host_ip_list | Communication IP address and drive information configuration file. | Used by hand_out_deploy.py. | - |
| scripts | install.sh | Installation execution script. | install.sh [option] (used by hand_out_deploy.py) | install [user] [group] [install_path]; uninstall |
| scripts | scp_file.sh | scp command execution file. | Used by hand_out_deploy.py. | - |
| scripts | ssh_cmd.sh | ssh command execution file. | Used by hand_out_deploy.py. | - |
Table 4 Files in the bin directory

| Directory | File Name | Description |
| --- | --- | --- |
| bin | bio_daemon | Executable file of the BoostIO service. |
| bin | seceasy_encrypt | Executable file of the encryption service. |
Table 5 Files in the lib directory

| Directory | File Name | Description |
| --- | --- | --- |
| lib | libbdm.so | Shared object file of BDM, which is used for drive management. |
| lib | libbio_interceptor_server.so | Shared object file of the bridging service. |
| lib | libbio_sdk.so | Shared object file of the BoostIO SDK client. |
| lib | libbio_server.so | Shared object file of the BoostIO server. |
| lib | libhcom.so | Shared object file of HCOM, which is used for network transmission. |
| lib | libhcom_static.a | Static library file of HCOM. |
| lib | libhse_cryption.so | Shared object file of hseceasy, which is used for encryption. |
| lib | libock_interceptor.so | Shared object file of the bridging service. |
| lib | libock_iofwd_proxy.so | Shared object file of the bridging service. |
| lib | libsecurec.a | Static library file of the encryption service. |
| lib | libexpire_checker.so | Shared object file of the SSL certificate check. |
- Configure the installation information.
Set the configuration items in the bio.conf file in the conf directory based on your environment and service requirements. See Table 6.
Table 6 BoostIO configuration items

| Module | Configuration Item | Description | Default Value | Value/Range | Remarks |
| --- | --- | --- | --- | --- | --- |
| Log | bio.log.level | Log level. | info | debug, info, warn, trace, error | - |
| Net | bio.net.data.ip_mask | IP address range. | 127.0.0.1/24 | *.*.*.*/#, where * ranges from 0 to 255 and # ranges from 0 to 32. | When using JuiceFS for big data services, the value of this field must be the same as the IP address corresponding to the host name in the /etc/hosts file. |
| Net | bio.net.data.listen_port | Network communication port on the service plane. | 7201 | 7201 to 7800 | - |
| Net | bio.net.data.protocol | Network protocol. | tcp | rdma, tcp | - |
| Net | bio.net.rpc.data.busy_polling_mode | Indicates whether to enable busy-polling for Remote Procedure Call (RPC). | false | true, false | Available only to RDMA. |
| Net | bio.net.rpc.data.workers_count | Number of worker cores on the RPC data plane. | 4 | 1 to 16 | - |
| Net | bio.net.request.executor.thread.num | Number of threads for processing requests at the receive end. | 8 | 8 to 256 | - |
| Net | bio.net.request.executor.queue.size | Depth of the request processing queue at the receive end. | 1,024 | 1,024 to 65,535 | - |
| Net | bio.net.ipc.data.busy_polling_mode | Indicates whether to enable busy-polling for inter-process communication (IPC). | false | true, false | - |
| Net | bio.net.ipc.data.workers_count | Number of worker cores on the IPC data plane. | 4 | 1 to 128 | - |
| Net | bio.net.tls.enable.switch | Network security option. | true | true, false | Disabling this option may cause information leakage and spoofing risks. In separated deployment mode, the value of the enableTls parameter passed to the BoostIO service initialization API must be the same as the value of this configuration item. |
| Net | bio.net.tls.ca.cert.path | Path to the CA certificate. | /path/CA/cacert.pem | The default value is only an example. | If the security option is enabled, the path must be valid. If the security function is disabled, the configuration item is not parsed. |
| Net | bio.net.tls.ca.crl.path | Path to the certificate revocation list (CRL) file. | - | - | If the security option is enabled and the certificate needs to be checked for revocation, the path must be valid. If the security function is disabled, the configuration item is not parsed. |
| Net | bio.net.tls.server.cert.path | Path to the certificate file on the server. | /path/server/servercert.pem | The default value is only an example. | If the security option is enabled, the path must be valid. If the security function is disabled, the configuration item is not parsed. |
| Net | bio.net.tls.server.key.path | Path to the certificate private key file on the server. | /path/server/serverkey.pem | The default value is only an example. | If the security option is enabled, the path must be valid. If the security function is disabled, the configuration item is not parsed. |
| Net | bio.net.tls.server.key.pass.path | Path to the private key password of the working certificate. | /path/server/server.keypass | The default value is only an example. | If the security option is enabled, the path must be valid. If the security function is disabled, the configuration item is not parsed. |
| Net | bio.net.hesc.server.tls.kfs.master.path | Path to the root key generated when encrypting the private key of the working certificate. | /path/server/master/kfsa | The default value is only an example. | If the security option is enabled, the path must be valid. If the security function is disabled, the configuration item is not parsed. |
| Net | bio.net.hesc.server.tls.kfs.pass.standby.path | Path to the standby root key generated when encrypting the private key of the working certificate. | /path/server/standby/kfs | The default value is only an example. | If the security option is enabled, the path must be valid. If the security function is disabled, the configuration item is not parsed. |
| Cache | bio.cache.qos.enable | Flow control option. | true | false, true | Enabling this option impairs peak performance. You are advised to disable it in performance test cases. |
| Cache | bio.data.crc.enable | Data integrity verification option. | false | false, true | Enabling this option increases data read and write latencies. You are advised to enable it in fault locating scenarios. |
| Cache | bio.segment.size_in_mb | Cache resource granularity. | 4 | 1 to 16 | Unit: MB. |
| Cache | bio.mem.size_in_gb | Memory capacity used as cache resources. | 50 | 0 to 512 | Unit: GB. The value cannot exceed the system memory size. The value 0 indicates that the node does not provide caching. |
| Cache | bio.disk.path | List of drives used as cache resources. | /dev/sdxx:/dev/sdyy | - | Separate multiple drive paths with colons (:). The current version supports a maximum of four drives. |
| Cache | bio.rcache.evict_water_level | Eviction watermark of the read cache. | 90 | 0 to 100 | Percentage of the used read cache. |
| Cache | bio.cache.mem_read_write_ratio | Read/write resource ratio of the memory. | 5:5 | 0:10 to 10:0 | - |
| Cache | bio.cache.disk_read_write_ratio | Read/write resource ratio of the drives. | 5:5 | 0:10 to 10:0 | - |
| Cache | bio.work.scene | Application scenario flag. | none | none, bigdata | Optional. none: no usage restriction. bigdata: used for big data scenarios; compared with AI scenarios, the main difference is that I/Os are forcibly aligned. |
| Cache | bio.work.io.alignsize | I/O alignment data size. | 1 | 1 to 4,194,304 | Optional. Unit: byte. |
| Cache | bio.wcache.evict_water_level | Eviction watermark of the write cache. | 0 | 0 to 100 | Optional. The default value is 0. The value indicates the percentage of the used write cache. |
| Cache | bio.wcache.negotiate.delay | Eviction negotiation delay. | 100 | 50 to 1,000 | Optional. The default value is 100 ms. In scenarios sensitive to foreground write performance, set a larger value to delay eviction; in other scenarios, set a smaller value to accelerate eviction. |
| Cache | bio.trace.enable | Process statistics collection option. | true | false, true | Enabling this option impairs peak performance. You are advised to disable it in performance test cases. |
| Underfs | bio.underfs.file_system_type | Back-end storage system type. | ceph | ceph, hdfs | - |
| Underfs | bio.underfs.ceph.cfg.path | Path to the Ceph configuration file. | /etc/ceph/ceph.conf | This parameter cannot be left empty. | Mandatory when ceph is selected. The value must be an existing path. |
| Underfs | bio.underfs.ceph.cluster | Ceph cluster name. | ceph | This parameter cannot be left empty. | Mandatory when ceph is selected. |
| Underfs | bio.underfs.ceph.user | Ceph user. | client.admin | This parameter cannot be left empty. | Mandatory when ceph is selected. |
| Underfs | bio.underfs.ceph.pool | Ceph data pool. | 0:jfspool0,1:jfspool1 | This parameter cannot be left empty. | Mandatory when ceph is selected. Use commas (,) to separate multiple values. |
| Underfs | bio.underfs.hdfs.name_node | NameNode of Hadoop. | default:0 | *.*.*.*:#, where * ranges from 0 to 255 and # ranges from 0 to 65535. | Optional. The default value is default:0. The format is IP_address:Port, which indicates the IP address and port specified in the Hadoop configuration file. |
| Underfs | bio.underfs.hdfs.working_path | Path for storing files in the HDFS system. | /hdfs | A valid path containing 255 or fewer characters. | Optional. The default value is /hdfs. |
| CM | bio.cm.initial.nodes_count | Expected number of nodes during cluster initialization. | 2 | 2 to 256 | - |
| CM | bio.cm.copy_num | Data redundancy. | 2 | 2 | The current software version supports only dual copies. |
| CM | bio.cm.pts_count | Number of partitions. | 16 | 2 to 8,192 | - |
| CM | bio.cm.register_timeout_sec | Timeout duration of the ZooKeeper heartbeat check. | 20 | 10 to 60 | Unit: s. |
| CM | bio.cm.register_perm_timeout_sec | Time window for determining permanent faults. | 60 | 60 to 600 | Unit: s. |
| CM | bio.cm.zk_host | ZooKeeper service node information. | Example: 127.0.0.1:2181,127.0.0.2:2181,127.0.0.3:2181 for a three-node ZooKeeper cluster. | - | This parameter cannot be left empty. The IP address segment used by ZooKeeper must be the same as the service IP address segment. |
| Prometheus | bio.prometheus.exposer | IP address and port number of the Prometheus server. | - | *.*.*.*:#, where * ranges from 0 to 255 and # ranges from 0 to 65535. | Optional. |
| Prometheus | bio.prometheus.scrape_interval_sec | Prometheus sampling frequency. | 15 | - | Optional. Unit: second. |
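Tying the table together, a bio.conf fragment might look like the following. This is a hedged sketch: every value is illustrative (the IP range, paths, and ZooKeeper hosts are placeholders for your environment), and the key = value layout is our assumption — refer to the bio.conf file shipped in the conf directory for the exact syntax.

```ini
# Illustrative bio.conf fragment (placeholder values; verify syntax against
# the shipped bio.conf)
bio.log.level = info
bio.net.data.ip_mask = 192.168.1.0/24
bio.net.data.listen_port = 7201
bio.net.data.protocol = tcp
bio.net.tls.enable.switch = true
bio.net.tls.ca.cert.path = /path/CA/cacert.pem
bio.mem.size_in_gb = 50
bio.disk.path = /dev/nvme0n1:/dev/nvme1n1
bio.underfs.file_system_type = ceph
bio.underfs.ceph.cfg.path = /etc/ceph/ceph.conf
bio.cm.zk_host = 127.0.0.1:2181,127.0.0.2:2181,127.0.0.3:2181
```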
- Configure the host_ip_list file.
- Open the host_ip_list file.
vim boostio/scripts/host_ip_list
- Press i to enter the insert mode. Add the following content to the host_ip_list file (replace the variables with the actual ones):
ip1::BoostIO_communication_IP_address_1::Drive_address_1:Drive_address_2
ip2::BoostIO_communication_IP_address_2::Drive_address_1:Drive_address_2
- Press Esc, type :wq!, and press Enter to save the file and exit.
- Set the user and user group to which the drives belong.
chown [Server_installation_user:Server_installation_user_group] Drive_address_1
chown [Server_installation_user:Server_installation_user_group] Drive_address_2
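Before running the installer, the host_ip_list entries can be sanity-checked. A minimal sketch only: validate_line is our hypothetical helper, and it merely asserts that the three ::-separated fields shown above (node IP, BoostIO communication IP, colon-separated drive list) are present and non-empty.

```shell
# Hypothetical validator for one host_ip_list line of the form
#   ip::BoostIO_communication_IP_address::Drive_address_1:Drive_address_2
validate_line() {
  printf '%s\n' "$1" |
    awk -F'::' 'NF == 3 && $1 != "" && $2 != "" && $3 != "" { exit 0 } { exit 1 }'
}
```

For example, validate_line '192.168.1.10::192.168.1.10::/dev/nvme0n1:/dev/nvme1n1' succeeds, while a line missing the :: separators or a field fails.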
- Run the installation script.
python3 hand_out_deploy.py install [Path to the installation package that is not decompressed] [Server installation user] [Server installation user group]
- In the openEuler 20.03 OS, you need to install and configure Python 3 to run the installation script.
- All nodes in the cluster use the same user account and password.
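Since hand_out_deploy.py is run with Python 3, a quick preflight check avoids a failed run:

```shell
# Verify that Python 3 is available before invoking hand_out_deploy.py
# (required in particular on openEuler 20.03, as noted above).
if command -v python3 >/dev/null 2>&1; then
  python3 --version
else
  echo "python3 not found; install and configure Python 3 first" >&2
fi
```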
Enter the server installation user name and password as prompted.
Figure 1 Command output
After the installation is complete, all BoostIO files and directories are stored in /opt/boostio.