Test

HiBench is a benchmarking suite for Hadoop and Spark. This section uses it to verify Spark performance in YARN mode.

The cluster name and port used in the following steps (cluster_name:port, for example server1:9000) are the ones specified by the fs.defaultFS parameter in the Hadoop configuration file core-site.xml.
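As a hedged sketch, the value can be read from core-site.xml before the HiBench configuration is edited. The helper name default_fs_from_xml and the path /usr/local/hadoop/etc/hadoop/core-site.xml are assumptions for illustration (the path is derived from the hibench.hadoop.home value used in this section). Where the hdfs client is on the PATH, hdfs getconf -confKey fs.defaultFS prints the same value.

```shell
# Print the value of fs.defaultFS from a Hadoop core-site.xml file.
# default_fs_from_xml is a hypothetical helper, not part of Hadoop or HiBench.
default_fs_from_xml() {
  awk '
    /<name>fs\.defaultFS<\/name>/ { found = 1 }
    found && /<value>/ {
      gsub(/.*<value>|<\/value>.*/, "")   # keep only the text inside <value>
      print
      exit
    }
  ' "$1"
}

# Assumed location: etc/hadoop under the Hadoop home used in this section.
CORE_SITE=/usr/local/hadoop/etc/hadoop/core-site.xml
if [ -r "$CORE_SITE" ]; then
  default_fs_from_xml "$CORE_SITE"
fi
```

The printed value is what hibench.hdfs.master is expected to match.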

  1. Upload HiBench-HiBench-7.0 to the /opt directory and go to the conf directory.
    cd /opt/HiBench-HiBench-7.0/conf
    
  2. Modify the hadoop.conf file.
    1. Open the file.
      vi hadoop.conf
      
    2. Press i to enter the insert mode. Change the value of hibench.hadoop.home to the location where Hadoop is stored and the value of hibench.hdfs.master to hdfs://cluster_name:port.
      hibench.hadoop.home        /usr/local/hadoop/
      hibench.hdfs.master        hdfs://server1:9000
      
    3. Press Esc, type :wq!, and press Enter to save the file and exit.
  3. Modify the spark.conf file.
    1. Open the file.
      vi spark.conf
      
    2. Press i to enter the insert mode. Change the value of hibench.spark.home to the Spark installation directory, the value of hibench.spark.master to yarn, and the value of spark.eventLog.dir to hdfs://cluster_name:port/spark2xJobHistory2x.
      hibench.spark.home         /usr/local/spark
      hibench.spark.master       yarn
      spark.eventLog.dir         hdfs://server1:9000/spark2xJobHistory2x
      
    3. Press Esc, type :wq!, and press Enter to save the file and exit.
  4. Create a spark2xJobHistory2x directory in HDFS and check whether the directory is created successfully.
    hdfs dfs -mkdir /spark2xJobHistory2x
    hdfs dfs -ls /
    

  5. Switch to the HiBench root directory and generate test data.
    cd /opt/HiBench-HiBench-7.0/
    /opt/HiBench-HiBench-7.0/bin/workloads/ml/kmeans/prepare/prepare.sh
    

  6. Run the test script. The following uses the K-means clustering algorithm benchmark test script as an example.
    /opt/HiBench-HiBench-7.0/bin/workloads/ml/kmeans/spark/run.sh
    

  7. View the application status of the tasks executed in steps 5 and 6 on the YARN web page at http://server1:8088.

    Replace server1 with the IP address of the node where the ResourceManager process runs.

  8. Check the test result in the report/hibench.report file.
    cat report/hibench.report