
Deploying the Hive Engine

Planning the Cluster Environment

The environment planned in this section consists of seven servers, including one task submission node, three compute nodes, and three storage nodes. In the big data cluster, the Hive client functions as the task submission node, and the compute nodes are agent1, agent2, and agent3. The storage nodes are ceph1, ceph2, and ceph3. See Figure 1.

Figure 1 Environment configuration

Table 1 lists the hardware environment of the cluster.

Table 1 Hardware configurations

  • Processor: Kunpeng 920 5220
  • Memory size: 384 GB (12 x 32 GB)
  • Memory frequency: 2666 MHz
  • NIC:
    • Ceph environment: 25GE for the service network and GE for the management network
    • HDFS environment: 10GE for the service network and GE for the management network
  • Drive:
    • System drive: 1 x RAID 0 (1 x 1.2 TB SAS HDD)
    • Management node: 12 x RAID 0 (1 x 4 TB SATA HDD)
    • Compute node:
      • Ceph environment: 1 x 3.2 TB NVMe
      • HDFS environment: 12 x RAID 0 (1 x 4 TB SATA HDD)
    • Storage node:
      • Ceph environment: 12 x RAID 0 (1 x 4 TB SATA HDD) + 1 x 3.2 TB NVMe
      • HDFS environment: 12 x RAID 0 (1 x 4 TB SATA HDD)
  • RAID controller card: LSI SAS3508

Installing the Hive Engine

During the installation, select /opt/hive/boostkit as the software installation directory and place all JAR packages that Hive depends on in this directory, as shown in Table 2.

Table 2 Installation directory

Installation node: Server (server1)
Installation directory: /opt/hive/boostkit

Components and how to obtain them:
  • aws-java-sdk-bundle-1.11.375.jar: download from the Kunpeng Community.
  • bcpkix-jdk15on-1.68.jar: download from the Kunpeng Community.
  • bcprov-jdk15on-1.68.jar: download from the Kunpeng Community.
  • boostkit-omnidata-client-1.4.0-aarch64.jar: download from the Huawei Support website.
  • boostkit-omnidata-common-1.4.0-aarch64.jar: download from the Huawei Support website.
  • boostkit-omnidata-hive-exec-3.1.0-1.4.0.jar: download from the Kunpeng Community, or compile it from source.
  • guava-31.1-jre.jar: download from the Kunpeng Community.
  • hadoop-aws-3.2.0.jar: download from the Kunpeng Community.
  • kryo-shaded-4.0.2.jar: download from the Kunpeng Community.
  • haf-1.3.0.jar: download from the Huawei Support website.
  • hdfs-ceph-3.2.0.jar: download from the Kunpeng Community.
  • hetu-transport-1.6.1.jar: download from the Kunpeng Community.
  • jackson-annotations-2.13.2.jar: download from the Kunpeng Community.
  • jackson-core-2.13.2.jar: download from the Kunpeng Community.
  • jackson-databind-2.13.2.1.jar: download from the Kunpeng Community.
  • jackson-datatype-guava-2.12.4.jar: download from the Kunpeng Community.
  • jackson-datatype-jdk8-2.12.4.jar: download from the Kunpeng Community.
  • jackson-datatype-joda-2.13.3.jar: download from the Kunpeng Community.
  • jackson-datatype-jsr310-2.12.4.jar: download from the Kunpeng Community.
  • jackson-module-parameter-names-2.12.4.jar: download from the Kunpeng Community.
  • jasypt-1.9.3.jar: download from the Kunpeng Community.
  • jol-core-0.2.jar: download from the Kunpeng Community.
  • joni-2.1.5.3.jar: download from the Kunpeng Community.
  • log-0.193.jar: download from the Kunpeng Community.
  • perfmark-api-0.23.0.jar: download from the Kunpeng Community.
  • presto-main-1.6.1.jar: download from the Kunpeng Community.
  • presto-spi-1.6.1.jar: download from the Kunpeng Community.
  • protobuf-java-3.12.0.jar: download from the Kunpeng Community.
  • slice-0.38.jar: download from the Kunpeng Community.

The aws-java-sdk-bundle-1.11.375.jar, hadoop-aws-3.2.0.jar, and hdfs-ceph-3.2.0.jar packages are required only in the Ceph environment; the HDFS environment does not need them. The other packages listed in Table 2 can be obtained by running the hive_build.sh script on Gitee. Install the packages as follows:
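Because the Ceph-only packages are easy to miss, the following sketch (not part of the official procedure; the directory path is the one used throughout this guide) checks that they are present once packages have been copied into /opt/hive/boostkit:

```shell
# check_ceph_jars DIR: report which of the Ceph-only packages are
# missing from DIR. The jar names are taken from Table 2 above.
check_ceph_jars() {
    dir="$1"
    missing=0
    for jar in aws-java-sdk-bundle-1.11.375.jar \
               hadoop-aws-3.2.0.jar \
               hdfs-ceph-3.2.0.jar; do
        if [ ! -f "$dir/$jar" ]; then
            echo "MISSING: $dir/$jar"
            missing=$((missing + 1))
        fi
    done
    echo "$missing missing"
}

# On the real cluster, point it at the installation directory:
check_ceph_jars /opt/hive/boostkit
```

In an HDFS-only environment the three reported misses are expected and can be ignored.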

  1. Create an /opt/hive/boostkit directory.
    mkdir -p /opt/hive/boostkit
    
  2. On the task submission node (server1), upload the boostkit-omnidata-client-1.4.0-aarch64.jar and boostkit-omnidata-common-1.4.0-aarch64.jar files (contained in BoostKit-omnidata_1.4.0.zip\BoostKit-omnidata_1.4.0.tar.gz) obtained from Obtaining Software to the /opt/hive/boostkit directory.
    cp boostkit-omnidata-client-1.4.0-aarch64.jar /opt/hive/boostkit
    cp boostkit-omnidata-common-1.4.0-aarch64.jar /opt/hive/boostkit
    
  3. Upload the haf-1.3.0.jar file (contained in BoostKit-haf_1.3.0.zip\haf-1.3.0.tar.gz\haf-host-1.3.0.tar.gz\lib\jar\) obtained in Obtaining Software to the /opt/hive/boostkit directory.
    cp haf-1.3.0.jar /opt/hive/boostkit
    
  4. Upload the hdfs-ceph-3.2.0.jar file obtained in Obtaining Software and the aws-java-sdk-bundle-1.11.375.jar and hadoop-aws-3.2.0.jar files contained in boostkit-omnidata-server-1.4.0-aarch64-lib.zip to the /opt/hive/boostkit directory. (If an HDFS storage system is used, skip this step.)
    cp aws-java-sdk-bundle-1.11.375.jar /opt/hive/boostkit
    cp hadoop-aws-3.2.0.jar /opt/hive/boostkit
    cp hdfs-ceph-3.2.0.jar /opt/hive/boostkit
    
  5. Use an FTP tool to upload the boostkit-omnidata-hive-exec-3.1.0-1.4.0.zip package to the installation environment and decompress the package.
    unzip boostkit-omnidata-hive-exec-3.1.0-1.4.0.zip
    
  6. Copy the JAR packages in boostkit-omnidata-hive-exec-3.1.0-1.4.0.zip to the /opt/hive/boostkit directory.
    cd boostkit-omnidata-hive-exec-3.1.0-1.4.0
    cp *.jar /opt/hive/boostkit
    

    To compile boostkit-omnidata-hive-exec-3.1.0-1.4.0.jar manually, follow the instructions in README.md.

  7. Create a tez-ndp directory, obtain the tez.tar.gz package from HDFS (default path: /apps/tez/tez.tar.gz), and decompress it.
    cd /opt/hive/
    mkdir tez-ndp
    cd tez-ndp
    hdfs dfs -get /apps/tez/tez.tar.gz .
    tar -zxvf tez.tar.gz
    
  8. Copy the /opt/hive/boostkit directory into the tez-ndp directory, delete the local tez.tar.gz extracted in the previous step, repack the directory contents into a new tez.tar.gz, and replace the old package on HDFS with the new archive.
    cd /opt/hive/tez-ndp
    cp -r /opt/hive/boostkit .
    rm -rf tez.tar.gz
    tar -zcvf tez.tar.gz *
    hdfs dfs -rm -r /apps/tez/tez.tar.gz
    hdfs dfs -put tez.tar.gz /apps/tez/
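As a quick sanity check, you can list the repacked archive and confirm the boostkit directory sits at its top level, where the tez.cluster.additional.classpath.prefix setting configured later in tez-site.xml expects it. The sketch below demonstrates this on a mock layout in a temporary directory; on the cluster, run the same tar -tzf check on the archive in /opt/hive/tez-ndp:

```shell
# Mock layout standing in for /opt/hive/tez-ndp; the file names are
# illustrative placeholders, not the full package contents.
workdir=$(mktemp -d)
mkdir -p "$workdir/boostkit" "$workdir/lib"
touch "$workdir/boostkit/haf-1.3.0.jar" "$workdir/lib/tez-api.jar"

# Repack exactly as in the step above: archive the directory
# contents ("*"), not the parent directory itself.
(cd "$workdir" && tar -zcf tez.tar.gz boostkit lib)

# The boostkit directory must appear at the top level of the archive.
tar -tzf "$workdir/tez.tar.gz" | grep '^boostkit/'
```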
    
  9. Modify the /usr/local/hive/conf/hive-env.sh file.
    1. Open the file.
      vim /usr/local/hive/conf/hive-env.sh
    2. Press i to enter the insert mode, and add the following content to the end of the file:
      export BOOSTKIT_HOME=/opt/hive/boostkit
      for f in ${BOOSTKIT_HOME}/*.jar; do
        HIVE_CONF_DIR=${HIVE_CONF_DIR}:$f
      done
      

    3. Press Esc, type :wq!, and press Enter to save the file and exit.
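To see what the loop in hive-env.sh produces, here is a self-contained rerun of the same logic against dummy jars in a temporary directory (only the BOOSTKIT_HOME value differs from the snippet above):

```shell
# Reproduce the hive-env.sh loop with dummy jars so the resulting
# value can be inspected; BOOSTKIT_HOME here is a temp dir rather
# than the real /opt/hive/boostkit.
BOOSTKIT_HOME=$(mktemp -d)
touch "$BOOSTKIT_HOME/a.jar" "$BOOSTKIT_HOME/b.jar"

HIVE_CONF_DIR=/usr/local/hive/conf
for f in "$BOOSTKIT_HOME"/*.jar; do
    HIVE_CONF_DIR=${HIVE_CONF_DIR}:$f
done

# Each jar is appended to HIVE_CONF_DIR with a colon separator.
echo "$HIVE_CONF_DIR"
```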

  10. Modify the /usr/local/tez/conf/tez-site.xml file.
    1. Open the file.
      vim /usr/local/tez/conf/tez-site.xml
    2. Press i to enter the insert mode, and add the following content to the end of the file:
      <property>
          <name>tez.user.classpath.first</name>
          <value>true</value>
      </property>
      <property>
          <name>tez.cluster.additional.classpath.prefix</name>
          <value>$PWD/tezlib/boostkit/*</value>
      </property>
      <property>
          <name>tez.task.launch.env</name>
          <value>PATH=/home/omm/omnidata-install/haf-host/bin:$PATH,LD_LIBRARY_PATH=/home/omm/omnidata-install/haf-host/lib:$LD_LIBRARY_PATH,CLASS_PATH=/home/omm/omnidata-install/haf-host/lib/jar/haf-1.3.0.jar:$CLASS_PATH,HAF_CONFIG_PATH=/home/omm/omnidata-install/haf-host/etc/</value>
      </property>
      
    3. Press Esc, type :wq!, and press Enter to save the file and exit.
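After editing, it is worth confirming that tez-site.xml declares all three of the properties added above. The sketch below runs the check against a local copy that it writes itself; on the cluster, set TEZ_SITE to /usr/local/tez/conf/tez-site.xml instead:

```shell
# Write a minimal stand-in for tez-site.xml containing the three
# properties from the step above (values abbreviated), then verify
# that each required property name is declared.
TEZ_SITE=$(mktemp)
cat > "$TEZ_SITE" <<'EOF'
<configuration>
  <property>
    <name>tez.user.classpath.first</name>
    <value>true</value>
  </property>
  <property>
    <name>tez.cluster.additional.classpath.prefix</name>
    <value>$PWD/tezlib/boostkit/*</value>
  </property>
  <property>
    <name>tez.task.launch.env</name>
    <value>PATH=/home/omm/omnidata-install/haf-host/bin:$PATH</value>
  </property>
</configuration>
EOF

for prop in tez.user.classpath.first \
            tez.cluster.additional.classpath.prefix \
            tez.task.launch.env; do
    grep -q "<name>$prop</name>" "$TEZ_SITE" && echo "found: $prop"
done
```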