Using OmniShuffle

Executing Spark Services

  1. Create an ock_spark.conf file in the /home directory. For details about the parameters in the file, see spark.conf.
    1. Create a file.
      vim /home/ock_spark.conf
    2. Press i to enter insert mode and add the following content to the file:
      spark.master yarn
      spark.task.cpus 1
      spark.shuffle.compress true
      spark.shuffle.spill.compress true
      spark.rdd.compress true
      spark.executor.extraClassPath     /opt/ock/jars/*
      spark.driver.extraClassPath       /opt/ock/jars/*
      spark.driver.extraJavaOptions -Djava.library.path=/opt/ock/ucache/23.0.0/linux-aarch64/lib/common:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/datakit:/opt/ock/ucache/23.0.0/linux-aarch64/lib/mf:/opt/ock/jars
      spark.executor.extraJavaOptions -Djava.library.path=/opt/ock/ucache/23.0.0/linux-aarch64/lib/common:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/datakit:/opt/ock/ucache/23.0.0/linux-aarch64/lib/mf:/opt/ock/jars
      spark.driver.extraLibraryPath   /opt/ock/ucache/23.0.0/linux-aarch64/lib/common:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/datakit:/opt/ock/ucache/23.0.0/linux-aarch64/lib/mf:/opt/ock/jars:.
      spark.executor.extraLibraryPath /opt/ock/ucache/23.0.0/linux-aarch64/lib/common:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/datakit:/opt/ock/ucache/23.0.0/linux-aarch64/lib/mf:/opt/ock/jars:.
      spark.shuffle.manager              org.apache.spark.shuffle.ock.OCKShuffleManager
      spark.shuffle.ock.manager true
      spark.blacklist.enabled true
      spark.files.fetchFailure.unRegisterOutputOnHost true
      spark.shuffle.service.enabled  false
      spark.blacklist.application.fetchFailure.enabled true
      spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version 2
      spark.driver.maxResultSize 2g
      spark.serializer                        org.apache.spark.serializer.KryoSerializer
      spark.executorEnv.OCK_HOME /opt/ock
      spark.executorEnv.UCX_USE_MT_MUTEX y
      spark.executorEnv.UCX_TLS  tcp
      spark.sql.broadcastTimeout 3000
      spark.sql.ock.autoConfig.enabled true
      spark.sql.ock.autoConfig.history true
      spark.sql.ock.autoConfig.globalRuntimePartition false
      spark.sql.ock.autoConfig.sample false
    3. Press Esc, type :wq!, and press Enter to save the file and exit.
  2. Start the Spark-SQL CLI.

    The following is an example of the native Spark-SQL startup command. You can adjust the values of the configuration items based on the site requirements.

    /usr/local/spark/bin/spark-sql --deploy-mode client \
      --driver-cores 8 \
      --driver-memory 40g \
      --num-executors 30 \
      --executor-cores 6 \
      --executor-memory 35g \
      --master yarn \
      --conf spark.task.cpus=1 \
      --conf spark.default.parallelism=600 \
      --conf spark.sql.broadcastTimeout=500 \
      --conf spark.sql.shuffle.partitions=600 \
      --conf spark.sql.adaptive.enabled=true \
      --database tpcds_bin_partitioned_orc_3
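
    To start Spark-SQL with the SparkExtension plugin, add the JAR files in /opt/ock/jars and the properties file created in step 1 to the command:
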
    spark-sql --deploy-mode client --driver-cores 8 \
                                   --driver-memory 40G \
                                   --num-executors 24 \
                                   --executor-cores 12 \
                                   --executor-memory 25g \
                                   --master yarn \
                                   --conf spark.sql.codegen.wholeStage=false \
                                   --jars /opt/ock/jars/* \
                                   --properties-file /home/ock_spark.conf \
                                   --database tpcds_bin_partitioned_orc_3
  3. Check whether OmniShuffle has taken effect.

    If the command output contains a message such as "Connected to meta rpc server<192.168.10.150:3892> successfully", OmniShuffle has taken effect. The IP address and port shown in the message depend on your environment.
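
The check in step 3 can also be scripted. The following is a minimal sketch, assuming the Spark-SQL output has been saved to a file; the function name omnishuffle_connected and the log path /tmp/spark-sql.log are hypothetical and are not part of OmniShuffle. The pattern accepts any address between the angle brackets, on the assumption that the address in the example message is site-specific.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and the log path in the usage note are
# hypothetical and are not provided by OmniShuffle or Spark.

# Returns 0 if the given log file contains the OmniShuffle connection
# message, and a non-zero status otherwise.
omnishuffle_connected() {
    local log_file="$1"
    # Accept any host:port inside the angle brackets (assumed to vary per site).
    grep -Eq 'Connected to meta rpc server<[^>]+> successfully' "$log_file"
}
```

For example, if the command output was captured to /tmp/spark-sql.log, running `omnishuffle_connected /tmp/spark-sql.log && echo "OmniShuffle has taken effect"` prints the confirmation only when the message is present.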