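The OmniOperator operator acceleration feature improves operator execution efficiency, and the OmniShuffle shuffle acceleration component optimizes data exchange between tasks. Used together, they improve the engine's end-to-end query performance.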
Before using the combined features, install the OmniOperator operator acceleration feature by following the Kunpeng BoostKit 24.0.RC5 Big Data OmniRuntime Feature Guide.
To run Spark workloads with the OmniShuffle shuffle acceleration component combined with the OmniOperator operator acceleration feature, start a Spark-SQL command-line session.
spark.shuffle.manager org.apache.spark.shuffle.ock.OckColumnarShuffleManager
# Shuffle mode: RSS or ESS
spark.shuffle.ock.mode rss
spark.sql.orc.columnarReaderBatchSize 10000
spark.memory.offHeap.enabled true
spark.memory.offHeap.size 28g
spark.driverEnv.LD_PRELOAD /opt/omni-operator/lib/libjemalloc.so.2
spark.executorEnv.LD_PRELOAD /opt/omni-operator/lib/libjemalloc.so.2
spark.executorEnv.OMNI_CONNECTED_ENGINE Spark
spark.executorEnv.OMNI_HOME /opt/omni-operator
spark.driverEnv.OMNI_HOME /opt/omni-operator
spark.executorEnv.LD_LIBRARY_PATH /opt/omni-operator/lib/:/usr/local/lib/HMPP:$LD_LIBRARY_PATH
spark.driverEnv.LD_LIBRARY_PATH /opt/omni-operator/lib/:/usr/local/lib/HMPP:$LD_LIBRARY_PATH
spark.sql.extensions com.huawei.boostkit.spark.ColumnarPlugin
spark.sql.join.columnar.preferShuffledHashJoin true
spark.sql.orc.impl native
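Before launching, it can be useful to confirm that the properties file actually contains the entries that enable both features. The following is a minimal sketch, assuming the settings above are saved as whitespace-separated key/value lines in the file passed to `--properties-file`; the helper name `conf_has_keys` is hypothetical and not part of BoostKit or Spark.

```shell
# Hypothetical helper (not part of BoostKit or Spark): succeed only if every
# given key appears as the first field of some line in the properties file.
# Assumes the whitespace-separated "key value" layout used in this guide.
conf_has_keys() {
    local file="$1" key missing=0
    shift
    for key in "$@"; do
        # Exact string comparison on the first field, so dots are not wildcards.
        if ! awk -v k="$key" '$1 == k { found = 1 } END { exit !found }' "$file"; then
            echo "missing: $key" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example usage:
# conf_has_keys /home/ock_spark.conf spark.shuffle.manager spark.sql.extensions
```

The check only verifies that the keys are present; it does not validate their values.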
spark-sql --deploy-mode client --driver-cores 8 \
  --driver-memory 40G \
  --num-executors 24 \
  --executor-cores 12 \
  --executor-memory 25g \
  --master yarn \
  --conf spark.sql.codegen.wholeStage=false \
  --jars /home/ockadmin/opt/ock/jars/* \
  --jars /opt/omni-operator/lib/* \
  --properties-file /home/ock_spark.conf \
  --database tpcds_bin_partitioned_orc_3
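Spark documents `--jars` as a comma-separated list, whereas a shell glob such as `/opt/omni-operator/lib/*` expands to space-separated arguments. If the command above does not pick up every JAR in your environment, one option is to build a single comma-separated value first. The sketch below is an assumption-laden illustration, not the guide's prescribed method: the helper name `join_jars` is hypothetical, and it collects only `*.jar` files from the two directories used above.

```shell
# Hypothetical helper: print every *.jar under the given directories as one
# comma-separated string, suitable for a single --jars option.
join_jars() {
    local dir jar list=""
    for dir in "$@"; do
        for jar in "$dir"/*.jar; do
            [ -e "$jar" ] || continue    # skip when the glob matches nothing
            list="${list:+$list,}$jar"   # prepend a comma only after the first entry
        done
    done
    printf '%s\n' "$list"
}

# Example usage (directories taken from the command above):
# spark-sql ... --jars "$(join_jars /home/ockadmin/opt/ock/jars /opt/omni-operator/lib)" ...
```

Non-JAR files in those directories, such as shared libraries, are skipped by this helper; they are expected to be found through the `LD_LIBRARY_PATH` settings in the properties file.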