Using the OmniShuffle Shuffle Acceleration Feature
Running Spark Engine Workloads
- Create the file “ock_spark.conf” in the “/home” directory. For a description of the parameters in ock_spark.conf, see spark.conf.
- Create and open the file.
vim /home/ock_spark.conf
- Press “i” to enter insert mode and add the following content to the file.
spark.master yarn
spark.task.cpus 1
spark.shuffle.compress true
spark.shuffle.spill.compress true
spark.rdd.compress true
spark.executor.extraClassPath /opt/ock/jars/*
spark.driver.extraClassPath /opt/ock/jars/*
spark.driver.extraJavaOptions -Djava.library.path=/opt/ock/ucache/23.0.0/linux-aarch64/lib/common:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/datakit:/opt/ock/ucache/23.0.0/linux-aarch64/lib/mf:/opt/ock/jars
spark.executor.extraJavaOptions -Djava.library.path=/opt/ock/ucache/23.0.0/linux-aarch64/lib/common:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/datakit:/opt/ock/ucache/23.0.0/linux-aarch64/lib/mf:/opt/ock/jars
spark.driver.extraLibraryPath /opt/ock/ucache/23.0.0/linux-aarch64/lib/common:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/datakit:/opt/ock/ucache/23.0.0/linux-aarch64/lib/mf:/opt/ock/jars:.
spark.executor.extraLibraryPath /opt/ock/ucache/23.0.0/linux-aarch64/lib/common:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/common/ucx/ucx:/opt/ock/ucache/23.0.0/linux-aarch64/lib/datakit:/opt/ock/ucache/23.0.0/linux-aarch64/lib/mf:/opt/ock/jars:.
spark.shuffle.manager org.apache.spark.shuffle.ock.OCKShuffleManager
spark.shuffle.ock.manager true
spark.blacklist.enabled true
spark.files.fetchFailure.unRegisterOutputOnHost true
spark.shuffle.service.enabled false
spark.blacklist.application.fetchFailure.enabled true
spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version 2
spark.driver.maxResultSize 2g
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.executorEnv.OCK_HOME /opt/ock
spark.executorEnv.UCX_USE_MT_MUTEX y
spark.executorEnv.UCX_TLS tcp
spark.sql.broadcastTimeout 3000
spark.sql.ock.autoConfig.enabled true
spark.sql.ock.autoConfig.history true
spark.sql.ock.autoConfig.globalRuntimePartition false
spark.sql.ock.autoConfig.sample false
- Press “Esc”, type :wq!, and press “Enter” to save the file and exit.
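Because the configuration lists many library directories, it can help to confirm that they exist on the host before starting Spark. The following is a minimal sketch, not part of the product: the function name missing_lib_dirs is hypothetical, and it assumes each property is written on its own line as a key and a value separated by whitespace, as in the content above.

```shell
# Hypothetical helper: print every directory named in a colon-separated
# library-path property of a Spark properties file that does not exist
# on this host.
# Usage: missing_lib_dirs <conf-file> [property-name]
missing_lib_dirs() {
    local conf="$1"
    local key="${2:-spark.executor.extraLibraryPath}"
    local value dir
    # Take the value (second whitespace-separated field) of the first
    # line whose key matches exactly.
    value=$(awk -v k="$key" '$1 == k { print $2; exit }' "$conf")
    if [ -z "$value" ]; then
        echo "property not found: $key" >&2
        return 2
    fi
    # Split the value on ":" and report each entry that is not a directory.
    local IFS=':'
    for dir in $value; do
        [ -n "$dir" ] || continue
        [ -d "$dir" ] || echo "$dir"
    done
}
```

For example, running missing_lib_dirs /home/ock_spark.conf prints nothing when every directory in spark.executor.extraLibraryPath is present; passing spark.driver.extraLibraryPath as the second argument checks the driver-side property instead.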
- Start the Spark-SQL command-line window.
An example of the native Spark-SQL startup command is shown below. Adjust the configuration values to suit your environment.
/usr/local/spark/bin/spark-sql --deploy-mode client \
--driver-cores 8 \
--driver-memory 40g \
--num-executors 30 \
--executor-cores 6 \
--executor-memory 35g \
--master yarn \
--conf spark.task.cpus=1 \
--conf spark.default.parallelism=600 \
--conf spark.sql.broadcastTimeout=500 \
--conf spark.sql.shuffle.partitions=600 \
--conf spark.sql.adaptive.enabled=true \
--database tpcds_bin_partitioned_orc_3
The startup command with the SparkExtension plugin is as follows.
spark-sql --deploy-mode client --driver-cores 8 \
--driver-memory 40G \
--num-executors 24 \
--executor-cores 12 \
--executor-memory 25g \
--master yarn \
--conf spark.sql.codegen.wholeStage=false \
--jars /opt/ock/jars/* \
--properties-file /home/ock_spark.conf \
--database tpcds_bin_partitioned_orc_3
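- Check whether OmniShuffle shuffle acceleration has taken effect.
If the message Connected to meta rpc server<192.168.10.150:3892> successfully appears after the startup command runs, the feature is in effect.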
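If the console output of the startup command has been captured to a file, the check can be scripted. The following is a minimal sketch under that assumption: the function name ock_shuffle_connected and the log file are hypothetical, and the pattern is based only on the message quoted above, with the server address left open.

```shell
# Hypothetical helper: succeed (exit status 0) if the captured spark-sql
# output contains the "Connected to meta rpc server<...> successfully"
# message, and fail otherwise.
# Usage: ock_shuffle_connected <log-file>
ock_shuffle_connected() {
    local log="$1"
    # "<" and ">" are literal here; ".*" matches whatever address is reported.
    grep -q 'Connected to meta rpc server<.*> successfully' "$log"
}
```

One way to produce such a file is to append 2>&1 | tee followed by a file path of your choice to the startup command, and then run ock_shuffle_connected on that file.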