Environment Requirements

This document provides guidance based on CentOS or openEuler. Verify hardware and software compatibility before you start.

Hardware

Minimum configuration: any CPU, one DIMM of any capacity, and one drive of any capacity.

Size the actual configuration according to your application scenario.

OS and Software Requirements

This document applies to CentOS 7.4 to 7.6 and openEuler 20.03 to 22.03.

This document uses CentOS 7.6 as an example to describe how to deploy a Spark cluster.
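The supported OS ranges above can be checked with a short script before deployment. The following is a minimal sketch, not part of the original procedure: the names `SUPPORTED`, `parse_version`, and `is_supported` are hypothetical, and how you obtain the distribution name and version string on a given host is left to you.

```python
# Supported ranges copied from the text above; the data layout is illustrative.
SUPPORTED = {
    "centos": ((7, 4), (7, 6)),
    "openeuler": ((20, 3), (22, 3)),
}


def parse_version(text):
    """Turn '7.6', '7.6.1810', or '20.03' into a comparable (major, minor) tuple."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def is_supported(distro, version):
    """Return True if the distro/version pair falls inside a listed range."""
    bounds = SUPPORTED.get(distro.lower())
    if bounds is None:
        return False
    low, high = bounds
    return low <= parse_version(version) <= high


if __name__ == "__main__":
    for distro, version in [("CentOS", "7.6"), ("CentOS", "7.3"), ("openEuler", "22.03")]:
        print(distro, version, is_supported(distro, version))
```

The check treats each range as inclusive at both ends and compares only the major and minor version components.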

Cluster Environment Planning

This section uses four hosts in one cluster: one management node and three compute/storage nodes (nodes 1 to 3). Table 2 lists the plan for each node.

Table 2 Cluster environment planning

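| Node Name | IP Address | Number of Drives | JDK |
|---|---|---|---|
| Management node | IPaddress1 | System drive: 1 × 4 TB HDD; Data drive: 12 × 4 TB HDD | OpenJDK jdk8u252-b09 |
| Compute/Storage node 1 | IPaddress2 | System drive: 1 × 4 TB HDD; Data drive: 12 × 4 TB HDD | OpenJDK jdk8u252-b09 |
| Compute/Storage node 2 | IPaddress3 | System drive: 1 × 4 TB HDD; Data drive: 12 × 4 TB HDD | OpenJDK jdk8u252-b09 |
| Compute/Storage node 3 | IPaddress4 | System drive: 1 × 4 TB HDD; Data drive: 12 × 4 TB HDD | OpenJDK jdk8u252-b09 |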
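To confirm that a node runs the planned JDK build, one option is to inspect the text printed by `java -version`. The sketch below is only an illustration under assumptions: the function name `parse_jdk8_update` is hypothetical, and the sample line is an assumed example of what an OpenJDK 8 build prints (jdk8u252 reports itself as version 1.8.0_252).

```python
import re


def parse_jdk8_update(version_output):
    """Return the JDK 8 update number found in `java -version` text, or None."""
    match = re.search(r'version "1\.8\.0_(\d+)', version_output)
    return int(match.group(1)) if match else None


# Assumed sample of the first line printed by an OpenJDK 8u252 build.
sample = 'openjdk version "1.8.0_252"'

if __name__ == "__main__":
    update = parse_jdk8_update(sample)
    print("JDK 8 update:", update, "| matches plan:", update == 252)
```

Note that `java -version` writes to standard error, so capture that stream if you feed real output into the function.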

Software Planning

Table 3 lists the software plan of each node in the cluster.

Table 3 Software planning

| Node Name | Host Name | Service |
|---|---|---|
| Management node | server1 | NameNode, ResourceManager, and Master |
| Compute/Storage node 1 | agent1 | QuorumPeerMain, DataNode, NodeManager, JournalNode, and Worker |
| Compute/Storage node 2 | agent2 | QuorumPeerMain, DataNode, NodeManager, JournalNode, and Worker |
| Compute/Storage node 3 | agent3 | QuorumPeerMain, DataNode, NodeManager, JournalNode, and Worker |
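The plans in Table 2 and Table 3 can be kept together as data, which makes it easy to derive host-name mappings and to look up where each service runs. This is a minimal sketch under assumptions: `PLAN`, `hosts_entries`, and `hosts_running` are hypothetical names, and `IPaddress1` to `IPaddress4` are the placeholders from Table 2 that you must replace with real addresses.

```python
WORKER_SERVICES = ["QuorumPeerMain", "DataNode", "NodeManager", "JournalNode", "Worker"]

# (host name, IP placeholder from Table 2, services from Table 3)
PLAN = [
    ("server1", "IPaddress1", ["NameNode", "ResourceManager", "Master"]),
    ("agent1", "IPaddress2", WORKER_SERVICES),
    ("agent2", "IPaddress3", WORKER_SERVICES),
    ("agent3", "IPaddress4", WORKER_SERVICES),
]


def hosts_entries(plan):
    """Build 'IP hostname' lines in the format used by /etc/hosts."""
    return [f"{ip} {host}" for host, ip, _ in plan]


def hosts_running(plan, service):
    """List the host names planned to run the given service."""
    return [host for host, _, services in plan if service in services]


if __name__ == "__main__":
    print("\n".join(hosts_entries(PLAN)))
    print("QuorumPeerMain on:", hosts_running(PLAN, "QuorumPeerMain"))
```

Keeping the plan in one structure means the host mapping and the service layout cannot drift apart when a node is renamed or re-addressed.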