Deploying the Spark Engine
For details, see the Spark Deployment Guide (CentOS 7.6 & openEuler 20.03). When deploying Spark, set the parameters in the spark-defaults.conf file to the values listed in Table 1.
| Parameter | Recommended Value | Description |
|---|---|---|
| spark.eventLog.enabled | true | Specifies whether to enable event logging. After this function is enabled, Spark tasks will generate logs, which will be used as input data for model training. |
| spark.eventLog.dir | hdfs://server1:9000/spark2-history | Directory for storing event logs of Spark tasks. |
| spark.eventLog.compress | true | Specifies whether to enable log compression. |
| spark.network.timeout | 600s | Spark network timeout period. During view creation, shuffle read may retry due to timeout, causing data inconsistency. You can set this parameter to a relatively larger value to reduce retries. |
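
The recommended settings can be checked with a short script before Spark is started. The following is a minimal sketch, not part of the deployment guide: the helper names `parse_spark_defaults` and `find_mismatches` are hypothetical, the sample file content simply restates Table 1, and the parser assumes each setting is on one line with the key and value separated by whitespace or `=`.

```python
import re

# Recommended values, copied from Table 1.
RECOMMENDED = {
    "spark.eventLog.enabled": "true",
    "spark.eventLog.dir": "hdfs://server1:9000/spark2-history",
    "spark.eventLog.compress": "true",
    "spark.network.timeout": "600s",
}

# Example spark-defaults.conf content that matches Table 1.
SAMPLE_CONF = """\
# Event logging (input data for model training)
spark.eventLog.enabled     true
spark.eventLog.dir         hdfs://server1:9000/spark2-history
spark.eventLog.compress    true
# Larger timeout to reduce shuffle read retries
spark.network.timeout      600s
"""


def parse_spark_defaults(text):
    """Parse spark-defaults.conf content into a dict of key -> value.

    Assumption: one setting per line, key and value separated by
    whitespace or '='. Blank lines and '#' comments are skipped.
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        parts = re.split(r"\s*=\s*|\s+", line, maxsplit=1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


def find_mismatches(settings, recommended=RECOMMENDED):
    """Return {key: (actual, expected)} for missing or differing values."""
    return {
        key: (settings.get(key), expected)
        for key, expected in recommended.items()
        if settings.get(key) != expected
    }


if __name__ == "__main__":
    # An empty dict means every Table 1 parameter has its recommended value.
    print(find_mismatches(parse_spark_defaults(SAMPLE_CONF)))
```

To check a real deployment, read the contents of the spark-defaults.conf file from the Spark configuration directory and pass them to `parse_spark_defaults` in place of `SAMPLE_CONF`. Replace `server1:9000` with the NameNode address of your own cluster.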