Installing Spark
The following operations must be performed on the management node and all compute nodes.
- Install Spark. For details, see OS and Software Requirements.
- Install the SparkExtension dependencies on the openEuler OS.
Configure the local yum source for each OS image and run the following commands to install the dependencies:
yum install lz4-devel.aarch64 -y
yum install zstd-devel.aarch64 -y
yum install snappy-devel.aarch64 -y
yum install protobuf-c-devel.aarch64 protobuf-lite-devel.aarch64 -y
yum install boost-devel.aarch64 -y
yum install cyrus-sasl-devel.aarch64 -y
yum install jsoncpp-devel.aarch64 -y
yum install openssl-devel.aarch64 -y
yum install libatomic.aarch64 -y
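The installs above can also be run as a single loop, which makes it easier to keep the package list in one place. This is only a sketch: the `yum` function below is a stub so the loop can be shown without a real package manager; on an actual node, delete the stub line so the genuine `yum` runs.

```shell
# Stub for illustration only -- remove this line on a real node.
yum() { echo "installing: $3"; }

for pkg in lz4-devel zstd-devel snappy-devel protobuf-c-devel \
           protobuf-lite-devel boost-devel cyrus-sasl-devel \
           jsoncpp-devel openssl-devel libatomic; do
  yum install -y "${pkg}.aarch64"
done
```

Note that the loop installs protobuf-c-devel and protobuf-lite-devel in separate transactions; combining them in one `yum install` call, as above, is equivalent.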
- Configure SparkExtension.
- Obtain the ORC, Protobuf, Arrow, HDFS, and Parquet software installation packages from Obtaining Software, decompress them to obtain the liborc.so, libprotobuf.so.24, libarrow.so.1100, libarrow_dataset.so.1100, libarrow_substrait.so.1100, libhdfs.so, and libparquet.so.1100 files, and upload these files to the /opt/omni-operator/lib directory. Change the file permission to 550.
chmod 550 /opt/omni-operator/lib/lib*
- Decompress boostkit-omniop-spark-3.1.1-1.3.0-aarch64.zip to obtain boostkit-omniop-spark-3.1.1-1.3.0-aarch64.jar, copy the JAR file to the /opt/omni-operator/lib directory, and set its permission to 550.
chmod 550 /opt/omni-operator/lib/boostkit-omniop-spark-3.1.1-1.3.0-aarch64.jar
- Decompress the dependencies.tar.gz package extracted from boostkit-omniop-spark-3.1.1-1.3.0-aarch64.zip and copy the dependencies folder to the /opt/omni-operator/lib directory. Set the permission on the folder and its contents to 550.
chmod -R 550 /opt/omni-operator/lib/dependencies/
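After the copies above, it can be worth confirming that mode 550 (read and execute for owner and group, no access for others) actually took effect. A minimal sketch, using a scratch directory and a dummy file name in place of /opt/omni-operator/lib (illustration only):

```shell
# Scratch directory stands in for /opt/omni-operator/lib (illustration only).
libdir=$(mktemp -d)
touch "$libdir/liborc.so"

# Apply the same mode used in the steps above.
chmod 550 "$libdir/liborc.so"

# %a prints the octal mode; expect 550 (r-xr-x---).
stat -c '%a' "$libdir/liborc.so"

rm -rf "$libdir"
```

On the real nodes, run the same `stat -c '%a'` check against the files under /opt/omni-operator/lib.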
- Add the following environment variable to the ~/.bashrc file on all nodes:
export OMNI_HOME=/opt/omni-operator
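The export takes effect only in new login shells; to apply it to the current shell, source the file after editing it. A minimal sketch using a temporary rc file in place of ~/.bashrc (illustration only):

```shell
# Temporary file stands in for ~/.bashrc (illustration only).
rcfile=$(mktemp)
echo 'export OMNI_HOME=/opt/omni-operator' >> "$rcfile"

# Apply the variable to the current shell and confirm it.
. "$rcfile"
echo "$OMNI_HOME"   # prints /opt/omni-operator

rm -f "$rcfile"
```

On the real nodes, the equivalent is `source ~/.bashrc` followed by `echo "$OMNI_HOME"`.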
Parent topic: Using OmniOperator on Spark