Deploying the Spark Engine

For deployment details, see the Spark Deployment Guide (CentOS 7.6 & openEuler 20.03). When deploying Spark, set the parameters in the spark-defaults.conf file as listed in Table 1.

Table 1 Parameters in spark-defaults.conf

| Parameter | Recommended Value | Description |
| --- | --- | --- |
| spark.eventLog.enabled | true | Specifies whether to enable event logging. When enabled, Spark tasks generate event logs, which are used as input data for model training. |
| spark.eventLog.dir | hdfs://server1:9000/spark2-history | Directory for storing the event logs of Spark tasks. |
| spark.eventLog.compress | true | Specifies whether to compress event logs. |
| spark.network.timeout | 600s | Spark network timeout period. During view creation, a shuffle read may be retried after a timeout, which can cause data inconsistency. Set this parameter to a larger value to reduce retries. |
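
As a rough illustration, the settings in Table 1 might appear in spark-defaults.conf as shown below. This is only a sketch using the recommended values. The host and port server1:9000 come from the example value in the table and are assumed to be your HDFS NameNode address; replace them with the address used in your cluster. The directory is also assumed to already exist in HDFS.

```properties
# Enable event logging; the generated logs serve as input data for model training.
spark.eventLog.enabled     true
# HDFS directory for event logs (server1:9000 is the example address from Table 1).
spark.eventLog.dir         hdfs://server1:9000/spark2-history
# Compress the event logs.
spark.eventLog.compress    true
# A larger network timeout to reduce shuffle read retries during view creation.
spark.network.timeout      600s
```

Each line is a parameter name followed by whitespace and its value, and lines that start with # are comments.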