Compiling and Configuring Spark
This section describes how to compile and deploy Spark and configure environment variables to establish an environment for subsequent distributed task processing.
The following uses Spark 3.3.1 as an example to describe the steps for compiling and configuring Spark; the steps also apply to other Spark versions. In the following steps, spark-3.3.1-bin-hadoop3.2 is the name of the Spark installation package. Replace it with the actual package name.
- Compile the Spark installation package. For details, see Spark Porting Guide (CentOS & openEuler).
- Upload the Spark installation package to the server1 node, move it to the /usr/local directory, and decompress it.
```
cd /usr/local/
mv spark-3.3.1-bin-hadoop3.2.tgz /usr/local
tar -zxvf spark-3.3.1-bin-hadoop3.2.tgz
```
- Create a soft link for subsequent version updates.
```
ln -s spark-3.3.1-bin-hadoop3.2 spark
```
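The point of the soft link is that a later upgrade only retargets the link, so paths such as /usr/local/spark in configuration files keep working. A minimal sketch of such an upgrade, using a hypothetical 3.4.0 release name and a scratch directory so it can be tried without touching /usr/local:

```shell
# Simulate the install layout in a scratch directory (hypothetical paths).
workdir=$(mktemp -d)
cd "$workdir"
mkdir spark-3.3.1-bin-hadoop3.2 spark-3.4.0-bin-hadoop3.2

ln -s spark-3.3.1-bin-hadoop3.2 spark     # initial link, as in the step above
ln -sfn spark-3.4.0-bin-hadoop3.2 spark   # -n replaces the link itself, not a path inside it
readlink spark                            # now reports the new release directory
```

Without -n, ln would follow the existing link and create the new link inside the old release directory instead of replacing it.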
- Set Spark environment variables.
- Open the /etc/profile file.
```
vi /etc/profile
```
- Press i to enter the insert mode and add the following lines to the end of the file:
```
export SPARK_HOME=/usr/local/spark
export PATH=$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH
```
- Press Esc, type :wq!, and press Enter to save the file and exit.
- Make the environment variables take effect.
```
source /etc/profile
```
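You can confirm the variables are in effect before moving on. A minimal self-contained sketch that repeats the two export lines so it can be run in any shell:

```shell
# Same two lines that were appended to /etc/profile.
export SPARK_HOME=/usr/local/spark
export PATH=$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH

echo "$SPARK_HOME"            # prints /usr/local/spark
case ":$PATH:" in             # check that the Spark bin directory is on PATH
  *:"$SPARK_HOME/bin":*) echo "spark bin dir is on PATH" ;;
esac
```

Once the package is actually in place under /usr/local/spark, running spark-submit --version is a further end-to-end check that PATH resolves the Spark binaries.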
Parent topic: Deploying Spark